Measuring the volume of information that users leave online, deliberately or not, is practically impossible. Most user actions carry more information than the users themselves realize they are producing.
To illustrate this, the talk starts with examples of implicit information: traces that are often hidden inside explicit feedback, or that can be detected directly from the users' actions. The focus then moves to browsing behavior analysis, including approaches for gaining a deeper understanding of the users, in particular in cold-start situations. The talk concludes by showing how to follow these user traces to obtain reliable knowledge about the content consumed by end users.
10. Understanding the Users
explicit feedback (structured): optimal data, but very rare (limited in applications/items/attributes)
implicit data (un-structured): tough data, but very common (it is often hard to understand)
user generated content (semi-structured): good data, common and with a lot of knowledge, but often difficult to use
11. explicit, clear and structured
understanding the users from explicit feedback
19. implicit, noisy and un-structured
understanding the users from implicit feedback
navigational patterns / user behavior
item importance / content recommendation
browsing graph / referrer graph
22. user generated content (semi-structured)
understanding the users from their content
“the users are always leaving information behind them”
reviews / opinions
comments
media (images / visual content)
meta-data (GPS, tags, ..)
interests + social
tweets / Vine videos
25. “the users are always leaving information behind them”
semi-structured
Understanding the users from their content, beyond the scope of their action
“Loud and Trendy: Crowdsourcing Impressions of Social Ambiance in Popular Indoor Urban Places”, CH’15
28. “connecting people with great local businesses”
5 stars rating: explicit information, clear and easy to understand
review text: unstructured and noisy, but contains extremely meaningful information
32. Understand User’s Taste
identify the “food words” inside the review
understand the user’s opinion
build a user taste profile
build a restaurant “kitchen quality” profile
38. user taste profile: what the user likes
restaurant kitchen quality profile: the “specialities” of the restaurant (serendipity?)
user visits a new place
food or menu recommendation
“Buon Appetito - Recommending Personalized Menus”, HT’14
44. Recommendation Experiment: Food/Menu Recommender
[avg-sent] most frequent positive food items among the profiles (> threshold)
[user-words] user-based CF with weighted items by positive sentiments
[menu-words] frequent and good menu/item sets (Fuzzy Apriori)
[zero-sent] most frequent food items among the profiles (no sentiments)
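The two frequency-based baselines, [zero-sent] and [avg-sent], could be sketched roughly as follows. This is only an illustration, not the implementation used in the experiment: the profile layout (food item mapped to a list of sentiment scores), the function names, the threshold value, and the sample data are all assumptions.

```python
from collections import Counter

# Hypothetical user taste profiles: food item -> sentiment scores of its mentions.
profiles = {
    "u1": {"pizza": [0.9, 0.7], "tiramisu": [0.8], "salad": [-0.4]},
    "u2": {"pizza": [0.6], "salad": [-0.2, -0.6], "risotto": [0.5]},
    "u3": {"pizza": [-0.1], "tiramisu": [0.9], "salad": [0.1]},
}

def _top(counts, k):
    # Most frequent first; ties broken alphabetically to keep the output stable.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]

def zero_sent(profiles, k=3):
    """Rank food items by how many profiles mention them, ignoring sentiment."""
    counts = Counter(item for profile in profiles.values() for item in profile)
    return _top(counts, k)

def avg_sent(profiles, threshold=0.3, k=3):
    """Rank items by the number of profiles whose average sentiment exceeds the threshold."""
    counts = Counter()
    for profile in profiles.values():
        for item, scores in profile.items():
            if sum(scores) / len(scores) > threshold:
                counts[item] += 1
    return _top(counts, k)

print(zero_sent(profiles))
print(avg_sent(profiles))
```

On this toy data the sentiment filter removes "salad", which is mentioned often but rarely liked.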
56. User Browsing Graph
collect all browsing sessions
BrowseGraph (weighted graph)
“BrowseRank: letting web users vote for page importance”, SIGIR’08
“Image Ranking Based on Users Browsing Behavior”, SIGIR’12
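A weighted BrowseGraph can be built from sessions along these lines. This is a minimal sketch under stated assumptions: each session is an ordered list of page identifiers, an edge weight is the number of times one page was viewed right after another, and reloads of the same page are skipped. The function name and the sample sessions are hypothetical.

```python
from collections import defaultdict

def build_browse_graph(sessions):
    """Return {source_page: {target_page: weight}} counting consecutive page views."""
    graph = defaultdict(lambda: defaultdict(int))
    for pages in sessions:
        for src, dst in zip(pages, pages[1:]):
            if src != dst:  # assumption: a reload of the same page is not a transition
                graph[src][dst] += 1
    return {src: dict(targets) for src, targets in graph.items()}

# Hypothetical sessions, each an ordered list of visited pages.
sessions = [
    ["photo/1", "photo/2", "photo/3"],
    ["photo/1", "photo/2"],
    ["photo/2", "photo/2", "photo/1"],
]
browse_graph = build_browse_graph(sessions)
print(browse_graph)
```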
59. User Browsing Graph
identifying from where users are entering the website
capture users’ interest (collecting users’ browsing patterns)
“Discovering Social Photo Navigation Patterns”, ICME’12
61. User Browsing Graph
identifying from where users are entering the website: (external) referrer URL
Does the referrer URL tell us something about the user’s session?
“Discovering Social Photo Navigation Patterns”, ICME’12
65. User Browsing Graph
Does the referrer URL tell us something about the user’s session?
labeling referrer URLs (top domains): mail, search engine, blogs, social network
classify Flickr web pages (photos, groups, profile, …)
“Discovering Social Photo Navigation Patterns”, ICME’12
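Labeling a referrer URL by its domain might look like the sketch below. The domain-to-class table is a hypothetical stand-in for a curated list, and the extra labels "direct", "internal", and "other" are assumptions added so that every input gets a label.

```python
from urllib.parse import urlparse

# Hypothetical mapping from domain to referrer class; a real one would be curated.
REFERRER_CLASSES = {
    "mail.yahoo.com": "mail",
    "google.com": "search engine",
    "bing.com": "search engine",
    "facebook.com": "social network",
    "twitter.com": "social network",
    "blogspot.com": "blogs",
}

def classify_referrer(referrer_url, own_domain="flickr.com"):
    """Map a referrer URL to a coarse class based on its host name."""
    host = (urlparse(referrer_url).hostname or "").lower()
    if not host:
        return "direct"  # empty referrer, or "-" as written in Apache logs
    if host == own_domain or host.endswith("." + own_domain):
        return "internal"
    # Try the full host first, then progressively shorter suffixes.
    labels = host.split(".")
    for i in range(len(labels) - 1):
        candidate = ".".join(labels[i:])
        if candidate in REFERRER_CLASSES:
            return REFERRER_CLASSES[candidate]
    return "other"

print(classify_referrer("https://www.google.com/search?q=sunset"))
```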
66. The Predictive Power of the Referrer Domain
Flickr Data: sample of 2 months of Flickr logs (Apache Web Logs)
<user_id, timestamp, referrer_url, current_url, user_agent>
~300M page views
~40M user sessions
~10M unique users
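Records with the fields listed above can be grouped into sessions roughly as follows. The slides do not say how sessions were delimited, so the 30-minute inactivity timeout is an assumption, as are the `LogRecord` name, the timestamps in seconds, and the sample records.

```python
from collections import namedtuple
from itertools import groupby

LogRecord = namedtuple(
    "LogRecord", "user_id timestamp referrer_url current_url user_agent"
)

SESSION_TIMEOUT = 30 * 60  # seconds; an assumed inactivity threshold

def sessionize(records, timeout=SESSION_TIMEOUT):
    """Split each user's page views into sessions on gaps longer than the timeout."""
    sessions = []
    ordered = sorted(records, key=lambda r: (r.user_id, r.timestamp))
    for _, user_records in groupby(ordered, key=lambda r: r.user_id):
        current = []
        for rec in user_records:
            if current and rec.timestamp - current[-1].timestamp > timeout:
                sessions.append(current)
                current = []
            current.append(rec)
        sessions.append(current)  # every user has at least one record
    return sessions

# Hypothetical records; timestamps are seconds since an arbitrary origin.
records = [
    LogRecord("u1", 0, "http://google.com/", "/photos/1", "ua"),
    LogRecord("u1", 60, "/photos/1", "/photos/2", "ua"),
    LogRecord("u1", 4000, "http://facebook.com/", "/photos/9", "ua"),
    LogRecord("u2", 10, "", "/photos/1", "ua"),
]
sessions = sessionize(records)
print(len(sessions))
```

The referrer of the first record in each session is the external entry point that the following slides analyze.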
73. The Predictive Power of the Referrer Domain
Visitors behave differently depending on where they come from
Users tend to perform similar sessions when coming from the same referrer class (domain)
Note: referrer URL comes for free!
What kind of knowledge does the referrer URL add within the BrowseGraph?
“Discovering Social Photo Navigation Patterns”, ICME’12
77. User Browsing Graph
Flickr Data: 2 months of logs
~300M page views
~40M user sessions
~10M unique users
considering only photo web pages
BrowseGraph (weighted graph)
ranking of photos based on browsing behavior
“Image Ranking Based on Users Browsing Behavior”, SIGIR’12
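One simple way to rank photos on a weighted BrowseGraph is PageRank with transition probabilities proportional to the edge weights. The sketch below is a plain power-iteration PageRank, not BrowseRank (which also models the time spent on each page) and not necessarily the ranking used in the cited work. The damping factor, iteration count, and toy graph are assumptions.

```python
def weighted_pagerank(graph, damping=0.85, iterations=100):
    """Power-iteration PageRank on {src: {dst: weight}}; returns {node: score}."""
    nodes = set(graph) | {dst for targets in graph.values() for dst in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    out_weight = {src: sum(targets.values()) for src, targets in graph.items()}
    for _ in range(iterations):
        # Mass sitting on nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[node] for node in nodes if out_weight.get(node, 0) == 0)
        new_rank = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for src, targets in graph.items():
            for dst, weight in targets.items():
                new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        rank = new_rank
    return rank

# Hypothetical BrowseGraph: both p1 and p2 lead to hub, hub leads back to p1.
toy_graph = {"p1": {"hub": 1}, "p2": {"hub": 2}, "hub": {"p1": 1}}
scores = weighted_pagerank(toy_graph)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```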
89. Photo Ranking Through the Browse Graph: Evaluation
internal popularity: how popular is the photo within Flickr?
collective attention: implicit visibility of the photo
external popularity: how popular is the photo outside Flickr?
diversity: how diverse is the ranking?
92. Photo Ranking Through the Browse Graph: Internal Popularity
[plot: x-axis: top N results ([1,1000] images); y-axis: cumulative value of the features]
Favorites: ranks images with highest internal engagement
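The curves in these plots accumulate a feature value over the top N positions of a ranking. A minimal sketch of that computation, assuming a ranking is an ordered list of photo identifiers and a feature is a dict of per-photo values (the function name and all numbers are made up for illustration):

```python
from itertools import accumulate

def cumulative_feature_curve(ranking, feature, top_n=1000):
    """Cumulative sum of a feature's values over the top-N positions of a ranking."""
    values = (feature.get(photo, 0) for photo in ranking[:top_n])
    return list(accumulate(values))

# Hypothetical favorite counts and two rankings of the same four photos.
favorites = {"p1": 12, "p2": 0, "p3": 40, "p4": 3}
ranking_a = ["p3", "p1", "p4", "p2"]
ranking_b = ["p2", "p4", "p1", "p3"]

curve_a = cumulative_feature_curve(ranking_a, favorites)
curve_b = cumulative_feature_curve(ranking_b, favorites)
print(curve_a, curve_b)
```

A ranking that places high-valued photos early produces a curve that rises faster, which is how the rankings are compared on each feature.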
95. Photo Ranking Through the Browse Graph: Collective Attention
[plot: x-axis: top N results ([1,1000] images); y-axis: cumulative value of the features]
Favorites and Views are not very correlated
100. Photo Ranking Through the Browse Graph: External Popularity
[plot: x-axis: top N results ([1,1000] images); y-axis: cumulative value of the features]
Favorites: low correlation with external visibility
Page/BrowseRank: very high correlation, thanks to the referrer?
111. Recap: Analysis of the Browsing Logs
About the Referrer URL:
information about the session the user is about to perform
understanding how the webpages are linked from the external world
About the BrowseGraph:
discovering content “voted” by the users
extending the informativeness with the Referrer URL
Can we predict the content the user is going to consume?
116. Can we predict the content the user is going to consume?
un-structured
implicit information (navigational patterns)
browsing graph (referrer graph)
prediction / recommendation
128. Browse Graph on News: Predicting News Articles Consumption
Yahoo News BrowseGraph
~500M pageviews
Social Network, Search Engine
Domain-Dependent BrowseGraph ..or just referrerGraph.
“Cold-start News Recommendation with Domain-dependent Browse Graph”, RecSys’14
130. Browse Graph on News: Predicting News Articles Consumption
Yahoo News BrowseGraph
hypothesis: news articles consumed are differentiable by the referrer domains
implement and evaluate a recommender system based on the referrerGraphs
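A recommender built on referrerGraphs could, in its simplest form, keep one BrowseGraph per referrer domain and suggest the most frequent next articles for a visitor known only by referrer and current page. The sketch below illustrates that idea under stated assumptions; it is not the system evaluated in the cited work, and the function names, session layout, and sample data are hypothetical.

```python
from collections import Counter, defaultdict

def build_referrer_graphs(sessions):
    """sessions: iterable of (referrer_domain, [article ids]) pairs.

    Returns {referrer_domain: {src_article: Counter({dst_article: weight})}}.
    """
    graphs = defaultdict(lambda: defaultdict(Counter))
    for referrer, articles in sessions:
        for src, dst in zip(articles, articles[1:]):
            graphs[referrer][src][dst] += 1
    return graphs

def recommend(graphs, referrer, current_article, k=2):
    """Top-k next articles for a visitor known only by referrer and current page."""
    transitions = graphs.get(referrer, {}).get(current_article)
    if not transitions:
        return []  # unseen referrer or article: nothing to suggest in this sketch
    ranked = sorted(transitions.items(), key=lambda kv: (-kv[1], kv[0]))
    return [article for article, _ in ranked[:k]]

# Hypothetical sessions labeled with the referrer domain of their first page view.
news_sessions = [
    ("twitter", ["a1", "a2"]),
    ("twitter", ["a1", "a2", "a3"]),
    ("twitter", ["a1", "a4"]),
    ("google", ["a1", "a5"]),
]
referrer_graphs = build_referrer_graphs(news_sessions)
print(recommend(referrer_graphs, "twitter", "a1"))
print(recommend(referrer_graphs, "google", "a1"))
```

No user history is needed, only the referrer and the current page, which is why the approach fits cold-start visitors.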
135. Browse Graph on News: Predicting News Articles Consumption
Yahoo News BrowseGraph
sessions are very short (average number of hops during browsing sessions)
very different size
well connected
139. Browse Graph on News: Predicting News Articles Consumption
Yahoo News BrowseGraph
Nodes Overlap and Importance
[heatmap: Jaccard Similarity of Node Sets; rows and columns: homepage, google, yahoo, bing, facebook, twitter, reddit; scale 0 to 1]
[heatmap: Kendall τ Between News PageRanks; rows and columns: homepage, google, yahoo, bing, facebook, twitter, reddit; scale 0 to 1]
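The two comparisons can be sketched as below: Jaccard similarity between the node sets of two referrerGraphs, and a Kendall rank correlation between the PageRank scores of the articles they share. The slides do not state which Kendall variant was used, so this sketch uses tau-a, where tied pairs contribute zero. The function names and sample values are assumptions.

```python
from itertools import combinations

def jaccard(nodes_a, nodes_b):
    """Size of the intersection divided by size of the union of two node sets."""
    a, b = set(nodes_a), set(nodes_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def kendall_tau(scores_a, scores_b):
    """Kendall tau-a over the items present in both score dicts."""
    common = sorted(set(scores_a) & set(scores_b))
    pairs = list(combinations(common, 2))
    if not pairs:
        return 0.0
    total = 0
    for x, y in pairs:
        product = (scores_a[x] - scores_a[y]) * (scores_b[x] - scores_b[y])
        total += (product > 0) - (product < 0)  # +1 concordant, -1 discordant, 0 tie
    return total / len(pairs)

# Hypothetical PageRank scores of the same articles in two referrerGraphs.
pagerank_twitter = {"a1": 0.5, "a2": 0.3, "a3": 0.2}
pagerank_google = {"a1": 0.4, "a2": 0.1, "a3": 0.5}

print(jaccard({"a1", "a2", "a3"}, {"a2", "a3", "a4"}))
print(kendall_tau(pagerank_twitter, pagerank_google))
```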
140. Browse Graph on News: Predicting News Articles Consumption
Yahoo News BrowseGraph
Most Common Categories
146. Browse Graph on News: Predicting News Articles Consumption
Yahoo News BrowseGraph
hypothesis: news articles consumed are differentiable by the referrer domains
different graph structure
different interest of the users:
individual articles (node)
news articles topics
importance (PageRank ranking)
168. Browse Graph on News: Predicting News Articles Consumption
Yahoo News BrowseGraph
About the ReferrerGraph:
predictive information of the referrer URL + collective behaviors of the users
able to capture the interest of users, even in the cold-start case
extremely powerful in the news context
177. User interactions
Future Work
Extending Implicit Signals:
location data (IP address, mobile GPS)
device type (tablet vs. mobile vs. desktop)
custom webpage data (Social Media, …)
Integrating User Profile:
long term user information
user’s profile changes over time (with respect to the referrer?)
Experimenting with Different Graphs:
graph of actions instead of pageviews? (share actions, explicit activity, ads, …)