A Journey into Evaluation:
from Retrieval Effectiveness to
User Engagement
Mounia Lalmas
Yahoo Labs London
mounia@acm.org
SPIRE 2015 – King’s College London
This talk
§ Introduction to user engagement
§ Evaluation in information retrieval
(retrieval effectiveness)
§ From retrieval effectiveness to user engagement
(from intra-session to inter-session evaluation)
(from small- to large-scale evaluation)
This talk
beyond the click
beyond relevance
towards user engagement
User engagement
What is user engagement?
“User engagement is a quality of the user
experience that emphasizes the phenomena
associated with wanting to use a technological
resource longer and frequently” (Attfield et al, 2011)
emotional, cognitive and behavioural connection
that exists, at any point in time and over time, between
a user and a technological resource
self-report: happy, sad, enjoyment, …
analytics: click, upload, read, comment, share, …
physiology: gaze, body heat, mouse movement, …
Why is it important to engage users?
§  In today’s wired world, users have enhanced expectations
about their interactions with technology
… resulting in increased competition amongst the
purveyors and designers of interactive systems.
§  In addition to utilitarian factors, such as usability, we must
consider the hedonic and experiential factors of interacting
with technology, such as fun, fulfillment, play, and user
engagement.
(O’Brien, Lalmas & Yom-Tov, 2014)
Online sites differ with respect to
their engagement pattern
Games: Users spend much time per visit
Search: Users come frequently and do not stay long
Social media: Users come frequently and stay long
Niche: Users come on average once a week, e.g. weekly post
News: Users come periodically, e.g. morning and evening
Service: Users visit site when needed, e.g. to renew subscription
(Lehmann et al, 2012)
Characteristics of user engagement
Novelty
(Webster & Ho, 1997; O’Brien,
2008)
Richness and control
(Jacques et al, 1995; Webster &
Ho, 1997)
Aesthetics
(Jacques et al, 1995; O’Brien,
2008)
Endurability
(Read, MacFarlane, & Casey,
2002; O’Brien, 2008)
Focused attention
(Webster & Ho, 1997; O’Brien,
2008)
Reputation, trust and
expectation (Attfield et al,
2011)
Positive Affect
(O’Brien & Toms, 2008)
Motivation, interests,
incentives, and benefits
(Jacques et al., 1995; O’Brien & Toms,
2008)
(O’Brien, Lalmas & Yom-Tov, 2014)
Measuring user engagement
Measures | Attributes
Self-report (questionnaire, interview, think-aloud and think-after protocols) | Subjective; short- and long-term; lab and field; small scale
Physiology (EEG, SCL, fMRI, eye tracking, mouse tracking) | Objective; short-term; lab and field; small and large scale
Analytics (within- and across-session metrics, data science) | Objective; short- and long-term; field; large scale
Attributes of user engagement
§ Scale (small versus large)
§ Setting (laboratory versus field)
§ Objective versus subjective
§ Temporality (short- versus long-term)
We focus on
1.  Temporality: from intra- to inter-session
2.  Scalability: from small- to large-scale
Evaluation in
information
retrieval
How to evaluate a search engine
§ Coverage
§ Speed
§ Query language
§ User interface
§ User happiness
›  Users find what they want and return to the search engine
›  Users complete the search task, where search is a means, not an end
Sec. 8.6
(Manning, Raghavan & Schütze, 2008; Baeza-Yates & Ribeiro-Neto, 2011)
Within an online
session
›  July 2012
›  2.5M users
›  785M page views
›  Categorization of the most
frequently accessed sites
•  11 categories (e.g. news), 33
subcategories (e.g. news finance,
news society)
•  760 sites from 70 countries/regions
short sessions: average 3.01 distinct sites visited with revisitation rate 10%
long sessions: average 9.62 distinct sites visited with revisitation rate 22%
(Lehmann et al, 2013)
Measuring user happiness
Most common proxy: relevance of search results
Sec. 8.1
Relevant
Retrieved
all items
§  User information need translated into a query
§  Relevance assessed relative to the information need, not the query
§  Example:
›  Information need: I am looking for a tennis holiday in a country with no rain
›  Query: tennis academy good weather
Evaluation measures:
•  precision, recall, R-precision; precision@n;
mean average precision; F-measure; …
•  bpref; cumulative gains, …
precision
recall
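A minimal sketch of some of the set-based and rank-based measures listed above, assuming binary relevance judgments. The function names and the toy ranking are illustrative only and do not come from the talk.

```python
from typing import Sequence, Set


def precision(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """Fraction of retrieved items that are relevant."""
    if not retrieved:
        return 0.0
    return sum(1 for d in retrieved if d in relevant) / len(retrieved)


def recall(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """Fraction of relevant items that were retrieved."""
    if not relevant:
        return 0.0
    return len(set(retrieved) & relevant) / len(relevant)


def precision_at_n(ranking: Sequence[str], relevant: Set[str], n: int) -> float:
    """Precision over the top n ranks (divides by n, a common convention)."""
    return sum(1 for d in ranking[:n] if d in relevant) / n


def f_measure(p: float, r: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (beta=1 gives F1)."""
    if p == 0.0 and r == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * p * r / (b2 * p + r)


def average_precision(ranking: Sequence[str], relevant: Set[str]) -> float:
    """Mean of precision values taken at the rank of each relevant item."""
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, d in enumerate(ranking, start=1):
        if d in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


# Toy example: five retrieved documents, three relevant ones (d6 is never retrieved).
ranking = ["d1", "d2", "d3", "d4", "d5"]
relevant = {"d1", "d3", "d6"}
p, r = precision(ranking, relevant), recall(ranking, relevant)
print(p, r, f_measure(p, r), average_precision(ranking, relevant))
```

Mean average precision is then the mean of `average_precision` over a set of queries.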
Measuring user happiness
Most common proxy: relevance of search results
Sec. 8.1
Explicit signals
Test collection methodology (TREC, CLEF, …)
Human labeled corpora
Implicit signals
User behavior in online settings (clicks, skips, …)
Examples of implicit signals in web
search
§  Number of clicks
§  Click at given position
§  Time to first click
§  Skipping
§  Abandonment rate
§  Number of query reformulations
§  Dwell time
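Several of the signals above can be derived from a simple click log. The sketch below assumes a hypothetical log format (one `Impression` per result page shown, with the rank and time offset of each click); the record layout and the exact metric definitions are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Impression:
    """One result page shown to a user (hypothetical log record)."""
    query: str
    # One (rank position, seconds since the page was shown) pair per click.
    clicks: List[Tuple[int, float]] = field(default_factory=list)


def abandonment_rate(imps: List[Impression]) -> float:
    """Share of result pages that received no click at all."""
    return sum(1 for i in imps if not i.clicks) / len(imps)


def click_through_rate(imps: List[Impression]) -> float:
    """Share of result pages with at least one click (one common definition)."""
    return sum(1 for i in imps if i.clicks) / len(imps)


def mean_clicks(imps: List[Impression]) -> float:
    """Average number of clicks per result page."""
    return sum(len(i.clicks) for i in imps) / len(imps)


def mean_time_to_first_click(imps: List[Impression]) -> float:
    """Average delay before the first click, over pages that were clicked."""
    firsts = [min(t for _, t in i.clicks) for i in imps if i.clicks]
    return sum(firsts) / len(firsts) if firsts else float("nan")


log = [
    Impression("a", [(1, 3.0)]),
    Impression("b"),
    Impression("c", [(2, 5.0), (1, 9.0)]),
    Impression("d"),
]
print(abandonment_rate(log), click_through_rate(log),
      mean_clicks(log), mean_time_to_first_click(log))
```

None of these numbers says on its own whether the user was happy; an abandoned query may still have been answered on the result page.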
What is a happy user in web search
1.  The user information need is satisfied
2.  The user has learned about a topic and even
about other topics
3.  The system was inviting and even fun to use
In-the-moment engagement
Users active on a site or stayed long
Long-term engagement
Users come back frequently and
over a long-term period
USER ENGAGEMENT
Interpreting the
signals
Click-through rates
CTR
new ranking algorithm
new design of search result page
…
I just wanted the phone number … I am totally happy :)
No clicks
Dwell time
DWELL TIME
used as a proxy of
user experience
Publisher
click on
an ad on
mobile
device
Dwell time on non-optimized landing pages
comparable and even higher than on mobile-
optimized ones
… when mobile optimized, users realize quickly
whether they “like” the ad or not?
(Lalmas et al, 2015)
non-mobile optimized mobile optimized
Multimedia search
activities often
driven by
entertainment
needs, not by
information needs
Relevance in multimedia search
(Slaney, 2011)
Explorative or serendipitous search
(Miliaraki, Blanco & Lalmas, 2015)
top most popular tweets | top most popular tweets + geographically diverse
Being from a central or peripheral location makes a difference.
Peripheral users did not perceive the timeline as being diverse
Objectivity versus subjectivity
It should never be just about the algorithm, but also how users respond to what the
algorithm returns to them → USER ENGAGEMENT
(Eduardo Graells, 2015)
Let us revisit
Interactive Information Retrieval
(Ingwersen, Human Aspects in IR, ESSIR 2011)
USER ENGAGEMENT
Beyond clicks and
relevance towards
user engagement
§ From intra- to inter-session evaluation
›  Dwell time and absence time
›  Linking strategy
›  Mobile advertising
§ From small- to large-scale evaluation
›  Eye-tracking and user engagement questionnaire
›  Mouse tracking and user engagement questionnaire
happy users
come back
we need to
properly
identify the
happy users
From intra- to
inter-session
evaluation
From short- to long-term engagement:
From intra- to inter-session engagement
intra-session
metric(s)
inter-session
metric(s)
how do users engage within
a session?
how do users engage across
sessions?
We monitor
We know what it will mean
future engagement
proxy
User engagement metrics
intra-session metrics
•  Dwell time
•  Session duration
•  Bounce rate
•  Play time (video)
•  Mouse movement
•  Click through rate (CTR)
•  Number of pages
viewed (click depth)
•  Conversion rate
•  Number of UGC items
(comments)
•  …
Dwell time as a proxy of user interest
Dwell time as a proxy of relevance
Dwell time as a proxy of conversion
Dwell time as a proxy of post-click ad
quality
…
User engagement metrics
intra-session
inter-session
Dwell time
§ Definition
The contiguous time spent on
a site or web page
§ Similar measures
Play time (for video sites)
§ Cons
Not clear that the user was
actually looking at the site
while there → blur/focus
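The blur/focus remedy can be sketched as follows: count only the time during which the page had focus. The event format (a time-ordered list of `(seconds, "focus" | "blur")` pairs plus an unload time) is an assumption made for this example.

```python
from typing import List, Tuple


def raw_dwell_time(load_time: float, unload_time: float) -> float:
    """Contiguous time between arriving on the page and leaving it."""
    return unload_time - load_time


def focused_dwell_time(events: List[Tuple[float, str]], unload_time: float) -> float:
    """Dwell time restricted to intervals where the page had focus.

    events must be sorted by time; each kind is "focus" or "blur".
    """
    total = 0.0
    focus_start = None
    for t, kind in events:
        if kind == "focus" and focus_start is None:
            focus_start = t
        elif kind == "blur" and focus_start is not None:
            total += t - focus_start
            focus_start = None
    if focus_start is not None:
        # The page still had focus when the user left it.
        total += unload_time - focus_start
    return total


# The user reads for 20 s, switches tab for 30 s, returns for 10 s, then leaves.
events = [(0.0, "focus"), (20.0, "blur"), (50.0, "focus")]
print(raw_dwell_time(0.0, 60.0))         # 60.0
print(focused_dwell_time(events, 60.0))  # 30.0
```

Even focused time only shows that the tab was in front, not that the user was reading.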
Distribution of dwell times on 50
websites
(O’Brien, Lalmas & Yom-Tov, 2014)
Dwell time
Dwell time varies by
site type:
•  leisure sites tend to have
longer dwell times than
news, e-commerce, etc.
Dwell time has a
relatively large variance
even for the same site
Dwell time on 50 websites
(tourists, VIP, active …
users)
(O’Brien, Lalmas & Yom-Tov, 2014)
Dwell time across sessions
or absence time
The context – search experience
Absence time and survival analysis
[Figure: survival curves for story 1 to story 9; x-axis: hours (0 to 20); y-axis: 0.0 to 1.0]
Users (%) who did come back
Users (%) who read story 2 but did not come back after 10 hours
SURVIVE
DIE = RETURN TO SITE → SHORT ABSENCE TIME
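A survival curve of this kind can be estimated with the Kaplan-Meier product-limit estimator, where "dying" means returning to the site. The sketch below is a plain-Python illustration on made-up absence times; it is not the analysis code behind the figure, and users who have not returned by the end of the observation window are treated as censored.

```python
from typing import List, Tuple


def kaplan_meier(durations: List[float], returned: List[bool]) -> List[Tuple[float, float]]:
    """Kaplan-Meier estimate of S(t), the share of users still absent at time t.

    durations: absence time in hours (or observation length if censored).
    returned:  True if the user came back (the "death" event), False if censored.
    Returns one (t, S(t)) point per distinct time with at least one return.
    """
    data = sorted(zip(durations, returned))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Six users: four return after 2, 5, 5 and 8 hours; two are still away at 20 hours.
hours = [2, 5, 5, 8, 20, 20]
came_back = [True, True, True, True, False, False]
for t, s in kaplan_meier(hours, came_back):
    print(f"after {t} h: {s:.2f} of users still absent")
```

A curve that drops quickly means users come back soon, i.e. short absence times.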
Absence time applied to search
Ranking function on Yahoo Answer Japan
Two weeks of click data on Yahoo Answer Japan: search
One million users
Six ranking functions
30-minute session boundary
survival analysis: high hazard rate (die quickly) = short absence
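A minimal sketch of how absence time could be derived from a user's activity timestamps with a 30-minute session boundary. Treating a gap of strictly more than 30 minutes as the boundary, and the data layout, are assumptions made for this example.

```python
from typing import List

SESSION_GAP = 30 * 60  # 30-minute session boundary, in seconds


def split_sessions(timestamps: List[float], gap: float = SESSION_GAP) -> List[List[float]]:
    """Group one user's event timestamps (seconds) into sessions.

    A new session starts whenever the gap to the previous event exceeds `gap`.
    """
    sessions: List[List[float]] = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= gap:
            sessions[-1].append(t)
        else:
            sessions.append([t])
    return sessions


def absence_times(sessions: List[List[float]]) -> List[float]:
    """Time between the end of each session and the start of the next one."""
    return [nxt[0] - cur[-1] for cur, nxt in zip(sessions, sessions[1:])]


events = [0, 600, 1200, 10000, 10300, 90000]
sessions = split_sessions(events)
print(sessions)                 # [[0, 600, 1200], [10000, 10300], [90000]]
print(absence_times(sessions))  # [8800, 79700]
```

The resulting absence times, together with a flag for users who never returned, are the input a survival analysis needs.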
Absence time and number of clicks on search result page
[Figure: survival curves for 3 clicks and 5 clicks against control = no click]
Absence time – search experience
1.  No click means a bad user experience
2.  Clicking on 3 to 5 results leads to the same user experience
3.  Clicking on more than 5 results reflects poorer user experience;
users cannot find what they are looking for
4.  Clicking lower in the ranking (2nd, 3rd) suggests more careful choice
from the user (compared to 1st)
5.  Clicking at bottom is a sign of low quality overall ranking
6.  Users finding their answers quickly (time to 1st click) return sooner to
the search application
7.  Returning to the same search result page is a worse user experience
than reformulating the query
search session metrics → absence time
(Dupret & Lalmas, 2013)
Others
Related off-site content
The context – Linking strategy in online
news
News provider
[Figure: p(absence 12h) for no click versus off-site click]
Off-site link → absence time
Providing links to related off-site
content has a positive long-term
effect
(Lehmann et al, In Progress)
The Context –
Mobile advertising
[Figure: ad click difference (0% to 600%) for short ad clicks versus long ad clicks]
Dwell time → ad click
Positive post-click
experience (“long” clicks)
has an effect on users
clicking on ads again
(Lalmas et al, 2015)
Beyond clicks and
relevance towards
user engagement
§ From intra- to inter-session evaluation
›  Dwell time and absence time
›  Linking strategy
›  Mobile advertising
happy users
come back
From small- to
large-scale
evaluation
Small scale measurement – focused
attention questionnaire
5-point scale (strongly disagree to strongly agree)
1.  I lost myself in this news tasks experience
2.  I was so involved in my news tasks that I lost track of time
3.  I blocked things out around me when I was completing the news tasks
4.  When I was performing these news tasks, I lost track of the world
around me
5.  The time I spent performing these news tasks just slipped away
6.  I was absorbed in my news tasks
7.  During the news tasks experience I let myself go
(O'Brien & Toms, 2010)
Small scale measurement – PANAS
questionnaire
(10 positive items and 10 negative items)
§  You feel this way right now, that is, at the present moment
[1 = very slightly or not at all; 2 = a little; 3 = moderately;
4 = quite a bit; 5 = extremely]
[randomize items]
distressed, upset, guilty, scared, hostile,
irritable, ashamed, nervous, jittery, afraid
interested, excited, strong, enthusiastic, proud,
alert, inspired, determined, attentive, active
(Watson, Clark & Tellegen, 1988)
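PANAS responses are conventionally scored by summing the ten positive items and the ten negative items separately, giving two scores between 10 and 50. A minimal sketch, using the item lists above; the function name and the dictionary input format are assumptions.

```python
from typing import Dict, Tuple

POSITIVE = ("interested", "excited", "strong", "enthusiastic", "proud",
            "alert", "inspired", "determined", "attentive", "active")
NEGATIVE = ("distressed", "upset", "guilty", "scared", "hostile",
            "irritable", "ashamed", "nervous", "jittery", "afraid")


def panas_scores(responses: Dict[str, int]) -> Tuple[int, int]:
    """Return (positive affect, negative affect), each in the range 10 to 50.

    responses maps every PANAS item to a rating from 1 (very slightly or
    not at all) to 5 (extremely).
    """
    for item in POSITIVE + NEGATIVE:
        rating = responses.get(item)
        if rating not in (1, 2, 3, 4, 5):
            raise ValueError(f"missing or invalid rating for {item!r}: {rating!r}")
    pa = sum(responses[item] for item in POSITIVE)
    na = sum(responses[item] for item in NEGATIVE)
    return pa, na


# A participant who feels "quite a bit" positive and "a little" negative.
answers = {item: 4 for item in POSITIVE}
answers.update({item: 2 for item in NEGATIVE})
print(panas_scores(answers))  # (40, 20)
```

Presentation order is randomized per participant, but scoring depends only on the item names.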
Small scale measurement – gaze and
self-reporting
News
interest
57 users
reading task (114)
•  questionnaire (qualitative data)
•  record eye tracking (quantitative data)
Three metrics: gaze,
focused attention and
positive affect
All three metrics align:
interesting content promotes
all engagement metrics
(Arapakis et al, 2014)
From small- to large-scale
measurement – mouse tracking
§  Navigation & interaction with digital
environment usually involves the use
of a mouse (selecting, positioning, clicking)
§  Several works show the mouse cursor to be a
weak proxy of gaze (attention)
§  Low-cost, scalable alternative
§  Can be performed in a non-invasive
manner, without removing users from
their natural setting
Relevance, dwell time & cursor
“reading” a relevant long document vs “scanning” a long non-relevant
document
(Guo & Agichtein, 2012)
“Ugly” vs “Normal” Interface
BBC News
Wikipedia
Mouse tracking and self-reporting
§  324 users from Amazon Mechanical Turk (between-subject design)
§  Two tasks (reading and search)
§  “Normal vs Ugly” interface
§  Questionnaires (qualitative data)
›  focused attention, positive affect
›  interest, aesthetics
§  Mouse tracking (quantitative data)
›  movement speed, movement rate, click rate, pause length, percentage of time
still
(Warnock & Lalmas, 2015)
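A rough sketch of how features of this kind might be computed from a cursor trace. The exact definitions used in the study are not given on the slide, so the ones below (speed over moving time only, a pause as a run of samples with no displacement) are assumptions, as is the `(t, x, y)` sample format.

```python
import math
from typing import Dict, List, Tuple


def cursor_features(samples: List[Tuple[float, float, float]],
                    click_times: List[float]) -> Dict[str, float]:
    """Simple cursor features from time-ordered (t_seconds, x, y) samples."""
    if len(samples) < 2:
        raise ValueError("need at least two cursor samples")
    duration = samples[-1][0] - samples[0][0]
    distance = 0.0
    still_time = 0.0
    pauses: List[float] = []
    current_pause = 0.0
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        step = math.hypot(x1 - x0, y1 - y0)
        if step == 0.0:
            # Cursor did not move during this interval.
            still_time += t1 - t0
            current_pause += t1 - t0
        else:
            distance += step
            if current_pause > 0:
                pauses.append(current_pause)
                current_pause = 0.0
    if current_pause > 0:
        pauses.append(current_pause)
    moving_time = duration - still_time
    return {
        "movement_speed": distance / moving_time if moving_time > 0 else 0.0,
        "click_rate": len(click_times) / duration if duration > 0 else 0.0,
        "mean_pause_length": sum(pauses) / len(pauses) if pauses else 0.0,
        "pct_time_still": 100 * still_time / duration if duration > 0 else 0.0,
    }


trace = [(0, 0, 0), (1, 3, 4), (2, 3, 4), (3, 3, 4), (4, 6, 8)]
print(cursor_features(trace, click_times=[3.5]))
```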
Mouse tracking could not tell much
about
•  focused attention and positive affect
•  user interests in the task/topic
•  aesthetics
BUT BUT BUT BUT
›  “ugly” variant did not result in lower USER aesthetics scores
›  although BBC > Wikipedia
BUT – the comments left …
›  Wikipedia: “The website was simply awful. Ads flashing everywhere, poor
text colors on a dark blue background.”; “The webpage was entirely blue. I don't
know if it was supposed to be like that, but it definitely detracted from the
browsing experience.”
›  BBC News: “The website's layout and color scheme were a bitch to
navigate and read.”; “Comic sans is a horrible font.”
Flawed methodology? Non-existing
signal? Wrong metric? Wrong measure?
§ Hawthorne Effect
§ Design
›  Usability versus engagement
›  Within- versus between-subject
§ Mouse movement was not sophisticated enough
Mouse Gestures → Features
[Figure: cursor trace through points (x0, y0) to (x8, y8) over time t, with resting-cursor intervals Δt rest (500 ms, 1000 ms, 1500 ms) and a click marked]
(Arapakis, Lalmas & Valkanas, 2014)
22 users reading two articles
176,550 cursor positions
2,913 mouse gestures
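One plausible way to cut a cursor trace into gestures is to split it wherever the cursor rests for at least a threshold duration. This is an illustrative assumption, not the segmentation procedure of Arapakis, Lalmas & Valkanas; it also assumes samples are only logged while the cursor moves, so a long gap between samples means a rest.

```python
from typing import List, Tuple

Sample = Tuple[int, int, int]  # (t_ms, x, y)


def split_gestures(samples: List[Sample], rest_ms: int = 500) -> List[List[Sample]]:
    """Split a time-ordered cursor trace at rests of at least rest_ms."""
    gestures: List[List[Sample]] = []
    current: List[Sample] = []
    for s in samples:
        if current and s[0] - current[-1][0] >= rest_ms:
            gestures.append(current)
            current = []
        current.append(s)
    if current:
        gestures.append(current)
    return gestures


trace = [(0, 10, 10), (50, 12, 11), (100, 15, 13),
         (700, 15, 40), (750, 18, 44), (2000, 90, 90)]
print(len(split_gestures(trace, rest_ms=500)))   # 3
print(len(split_gestures(trace, rest_ms=1000)))  # 2
```

Each resulting gesture can then be described by features (length, duration, direction changes, and so on) before clustering.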
Towards a taxonomy of mouse gestures
for user engagement measurement
§  The top-ranked clustering configuration is the Spectral Clustering
for the original dataset, with hyperbolic tangent kernel, for k = 38
•  certain types of mouse gestures occur more or less often, depending on user
interest in article
•  significant correlations between certain types of mouse gestures and self-
report measures
•  cursor behaviour goes beyond measuring frustration
•  inform about the positive and negative interaction
Beyond clicks and
relevance towards
user engagement
§ From small- to large-scale evaluation
›  Eye-tracking and user engagement questionnaire
›  Mouse tracking and user engagement questionnaire
we need to
properly identify
the happy users
Towards user
engagement
Towards User Engagement
happy users
come back
we need to
properly identify
the happy users
§  “If you cannot measure it,
you cannot improve it” William
Thomson (Lord Kelvin)
§  “You cannot control what you
cannot measure” DeMarco
§  “The way you measure is
more important than what
you measure” Art Gust
Thank you

 
Describing Patterns and Disruptions in Large Scale Mobile App Usage Data
Describing Patterns and Disruptions in Large Scale Mobile App Usage DataDescribing Patterns and Disruptions in Large Scale Mobile App Usage Data
Describing Patterns and Disruptions in Large Scale Mobile App Usage DataMounia Lalmas-Roelleke
 
Predicting Pre-click Quality for Native Advertisements
Predicting Pre-click Quality for Native AdvertisementsPredicting Pre-click Quality for Native Advertisements
Predicting Pre-click Quality for Native AdvertisementsMounia Lalmas-Roelleke
 
Social Media News Communities: Gatekeeping, Coverage, and Statement Bias
 Social Media News Communities: Gatekeeping, Coverage, and Statement Bias Social Media News Communities: Gatekeeping, Coverage, and Statement Bias
Social Media News Communities: Gatekeeping, Coverage, and Statement BiasMounia Lalmas-Roelleke
 
On the Reliability and Intuitiveness of Aggregated Search Metrics
On the Reliability and Intuitiveness of Aggregated Search MetricsOn the Reliability and Intuitiveness of Aggregated Search Metrics
On the Reliability and Intuitiveness of Aggregated Search MetricsMounia Lalmas-Roelleke
 
Penguins in Sweaters, or Serendipitous Entity Search on User-generated Content
 Penguins in Sweaters, or Serendipitous Entity Search on User-generated Content Penguins in Sweaters, or Serendipitous Entity Search on User-generated Content
Penguins in Sweaters, or Serendipitous Entity Search on User-generated ContentMounia Lalmas-Roelleke
 
Evaluating Heterogeneous Information Access (Position Paper)
Evaluating Heterogeneous Information Access (Position Paper)Evaluating Heterogeneous Information Access (Position Paper)
Evaluating Heterogeneous Information Access (Position Paper)Mounia Lalmas-Roelleke
 

Más de Mounia Lalmas-Roelleke (12)

Engagement, Metrics & Personalisation at Scale
Engagement, Metrics &  Personalisation at ScaleEngagement, Metrics &  Personalisation at Scale
Engagement, Metrics & Personalisation at Scale
 
Recommending and searching @ Spotify
Recommending and searching @ SpotifyRecommending and searching @ Spotify
Recommending and searching @ Spotify
 
Personalizing the listening experience
Personalizing the listening experiencePersonalizing the listening experience
Personalizing the listening experience
 
Recommending and Searching (Research @ Spotify)
Recommending and Searching (Research @ Spotify)Recommending and Searching (Research @ Spotify)
Recommending and Searching (Research @ Spotify)
 
An introduction to system-oriented evaluation in Information Retrieval
An introduction to system-oriented evaluation in Information RetrievalAn introduction to system-oriented evaluation in Information Retrieval
An introduction to system-oriented evaluation in Information Retrieval
 
Friendly, Appealing or Both? Characterising User Experience in Sponsored Sear...
Friendly, Appealing or Both? Characterising User Experience in Sponsored Sear...Friendly, Appealing or Both? Characterising User Experience in Sponsored Sear...
Friendly, Appealing or Both? Characterising User Experience in Sponsored Sear...
 
Describing Patterns and Disruptions in Large Scale Mobile App Usage Data
Describing Patterns and Disruptions in Large Scale Mobile App Usage DataDescribing Patterns and Disruptions in Large Scale Mobile App Usage Data
Describing Patterns and Disruptions in Large Scale Mobile App Usage Data
 
Predicting Pre-click Quality for Native Advertisements
Predicting Pre-click Quality for Native AdvertisementsPredicting Pre-click Quality for Native Advertisements
Predicting Pre-click Quality for Native Advertisements
 
Social Media News Communities: Gatekeeping, Coverage, and Statement Bias
 Social Media News Communities: Gatekeeping, Coverage, and Statement Bias Social Media News Communities: Gatekeeping, Coverage, and Statement Bias
Social Media News Communities: Gatekeeping, Coverage, and Statement Bias
 
On the Reliability and Intuitiveness of Aggregated Search Metrics
On the Reliability and Intuitiveness of Aggregated Search MetricsOn the Reliability and Intuitiveness of Aggregated Search Metrics
On the Reliability and Intuitiveness of Aggregated Search Metrics
 
Penguins in Sweaters, or Serendipitous Entity Search on User-generated Content
 Penguins in Sweaters, or Serendipitous Entity Search on User-generated Content Penguins in Sweaters, or Serendipitous Entity Search on User-generated Content
Penguins in Sweaters, or Serendipitous Entity Search on User-generated Content
 
Evaluating Heterogeneous Information Access (Position Paper)
Evaluating Heterogeneous Information Access (Position Paper)Evaluating Heterogeneous Information Access (Position Paper)
Evaluating Heterogeneous Information Access (Position Paper)
 

Último

一比一原版(Offer)康考迪亚大学毕业证学位证靠谱定制
一比一原版(Offer)康考迪亚大学毕业证学位证靠谱定制一比一原版(Offer)康考迪亚大学毕业证学位证靠谱定制
一比一原版(Offer)康考迪亚大学毕业证学位证靠谱定制pxcywzqs
 
Indian Escort in Abu DHabi 0508644382 Abu Dhabi Escorts
Indian Escort in Abu DHabi 0508644382 Abu Dhabi EscortsIndian Escort in Abu DHabi 0508644382 Abu Dhabi Escorts
Indian Escort in Abu DHabi 0508644382 Abu Dhabi EscortsMonica Sydney
 
Real Men Wear Diapers T Shirts sweatshirt
Real Men Wear Diapers T Shirts sweatshirtReal Men Wear Diapers T Shirts sweatshirt
Real Men Wear Diapers T Shirts sweatshirtrahman018755
 
在线制作约克大学毕业证(yu毕业证)在读证明认证可查
在线制作约克大学毕业证(yu毕业证)在读证明认证可查在线制作约克大学毕业证(yu毕业证)在读证明认证可查
在线制作约克大学毕业证(yu毕业证)在读证明认证可查ydyuyu
 
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdfMatthew Sinclair
 
Microsoft Azure Arc Customer Deck Microsoft
Microsoft Azure Arc Customer Deck MicrosoftMicrosoft Azure Arc Customer Deck Microsoft
Microsoft Azure Arc Customer Deck MicrosoftAanSulistiyo
 
"Boost Your Digital Presence: Partner with a Leading SEO Agency"
"Boost Your Digital Presence: Partner with a Leading SEO Agency""Boost Your Digital Presence: Partner with a Leading SEO Agency"
"Boost Your Digital Presence: Partner with a Leading SEO Agency"growthgrids
 
Nagercoil Escorts Service Girl ^ 9332606886, WhatsApp Anytime Nagercoil
Nagercoil Escorts Service Girl ^ 9332606886, WhatsApp Anytime NagercoilNagercoil Escorts Service Girl ^ 9332606886, WhatsApp Anytime Nagercoil
Nagercoil Escorts Service Girl ^ 9332606886, WhatsApp Anytime Nagercoilmeghakumariji156
 
Russian Escort Abu Dhabi 0503464457 Abu DHabi Escorts
Russian Escort Abu Dhabi 0503464457 Abu DHabi EscortsRussian Escort Abu Dhabi 0503464457 Abu DHabi Escorts
Russian Escort Abu Dhabi 0503464457 Abu DHabi EscortsMonica Sydney
 
Russian Call girls in Abu Dhabi 0508644382 Abu Dhabi Call girls
Russian Call girls in Abu Dhabi 0508644382 Abu Dhabi Call girlsRussian Call girls in Abu Dhabi 0508644382 Abu Dhabi Call girls
Russian Call girls in Abu Dhabi 0508644382 Abu Dhabi Call girlsMonica Sydney
 
Power point inglese - educazione civica di Nuria Iuzzolino
Power point inglese - educazione civica di Nuria IuzzolinoPower point inglese - educazione civica di Nuria Iuzzolino
Power point inglese - educazione civica di Nuria Iuzzolinonuriaiuzzolino1
 
Vip Firozabad Phone 8250092165 Escorts Service At 6k To 30k Along With Ac Room
Vip Firozabad Phone 8250092165 Escorts Service At 6k To 30k Along With Ac RoomVip Firozabad Phone 8250092165 Escorts Service At 6k To 30k Along With Ac Room
Vip Firozabad Phone 8250092165 Escorts Service At 6k To 30k Along With Ac Roommeghakumariji156
 
PowerDirector Explination Process...pptx
PowerDirector Explination Process...pptxPowerDirector Explination Process...pptx
PowerDirector Explination Process...pptxgalaxypingy
 
75539-Cyber Security Challenges PPT.pptx
75539-Cyber Security Challenges PPT.pptx75539-Cyber Security Challenges PPT.pptx
75539-Cyber Security Challenges PPT.pptxAsmae Rabhi
 
原版制作美国爱荷华大学毕业证(iowa毕业证书)学位证网上存档可查
原版制作美国爱荷华大学毕业证(iowa毕业证书)学位证网上存档可查原版制作美国爱荷华大学毕业证(iowa毕业证书)学位证网上存档可查
原版制作美国爱荷华大学毕业证(iowa毕业证书)学位证网上存档可查ydyuyu
 
20240508 QFM014 Elixir Reading List April 2024.pdf
20240508 QFM014 Elixir Reading List April 2024.pdf20240508 QFM014 Elixir Reading List April 2024.pdf
20240508 QFM014 Elixir Reading List April 2024.pdfMatthew Sinclair
 
20240507 QFM013 Machine Intelligence Reading List April 2024.pdf
20240507 QFM013 Machine Intelligence Reading List April 2024.pdf20240507 QFM013 Machine Intelligence Reading List April 2024.pdf
20240507 QFM013 Machine Intelligence Reading List April 2024.pdfMatthew Sinclair
 
Trump Diapers Over Dems t shirts Sweatshirt
Trump Diapers Over Dems t shirts SweatshirtTrump Diapers Over Dems t shirts Sweatshirt
Trump Diapers Over Dems t shirts Sweatshirtrahman018755
 
一比一原版(Curtin毕业证书)科廷大学毕业证原件一模一样
一比一原版(Curtin毕业证书)科廷大学毕业证原件一模一样一比一原版(Curtin毕业证书)科廷大学毕业证原件一模一样
一比一原版(Curtin毕业证书)科廷大学毕业证原件一模一样ayvbos
 
APNIC Updates presented by Paul Wilson at ARIN 53
APNIC Updates presented by Paul Wilson at ARIN 53APNIC Updates presented by Paul Wilson at ARIN 53
APNIC Updates presented by Paul Wilson at ARIN 53APNIC
 

Último (20)

一比一原版(Offer)康考迪亚大学毕业证学位证靠谱定制
一比一原版(Offer)康考迪亚大学毕业证学位证靠谱定制一比一原版(Offer)康考迪亚大学毕业证学位证靠谱定制
一比一原版(Offer)康考迪亚大学毕业证学位证靠谱定制
 
Indian Escort in Abu DHabi 0508644382 Abu Dhabi Escorts
Indian Escort in Abu DHabi 0508644382 Abu Dhabi EscortsIndian Escort in Abu DHabi 0508644382 Abu Dhabi Escorts
Indian Escort in Abu DHabi 0508644382 Abu Dhabi Escorts
 
Real Men Wear Diapers T Shirts sweatshirt
Real Men Wear Diapers T Shirts sweatshirtReal Men Wear Diapers T Shirts sweatshirt
Real Men Wear Diapers T Shirts sweatshirt
 
在线制作约克大学毕业证(yu毕业证)在读证明认证可查
在线制作约克大学毕业证(yu毕业证)在读证明认证可查在线制作约克大学毕业证(yu毕业证)在读证明认证可查
在线制作约克大学毕业证(yu毕业证)在读证明认证可查
 
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
 
Microsoft Azure Arc Customer Deck Microsoft
Microsoft Azure Arc Customer Deck MicrosoftMicrosoft Azure Arc Customer Deck Microsoft
Microsoft Azure Arc Customer Deck Microsoft
 
"Boost Your Digital Presence: Partner with a Leading SEO Agency"
"Boost Your Digital Presence: Partner with a Leading SEO Agency""Boost Your Digital Presence: Partner with a Leading SEO Agency"
"Boost Your Digital Presence: Partner with a Leading SEO Agency"
 
Nagercoil Escorts Service Girl ^ 9332606886, WhatsApp Anytime Nagercoil
Nagercoil Escorts Service Girl ^ 9332606886, WhatsApp Anytime NagercoilNagercoil Escorts Service Girl ^ 9332606886, WhatsApp Anytime Nagercoil
Nagercoil Escorts Service Girl ^ 9332606886, WhatsApp Anytime Nagercoil
 
Russian Escort Abu Dhabi 0503464457 Abu DHabi Escorts
Russian Escort Abu Dhabi 0503464457 Abu DHabi EscortsRussian Escort Abu Dhabi 0503464457 Abu DHabi Escorts
Russian Escort Abu Dhabi 0503464457 Abu DHabi Escorts
 
Russian Call girls in Abu Dhabi 0508644382 Abu Dhabi Call girls
Russian Call girls in Abu Dhabi 0508644382 Abu Dhabi Call girlsRussian Call girls in Abu Dhabi 0508644382 Abu Dhabi Call girls
Russian Call girls in Abu Dhabi 0508644382 Abu Dhabi Call girls
 
Power point inglese - educazione civica di Nuria Iuzzolino
Power point inglese - educazione civica di Nuria IuzzolinoPower point inglese - educazione civica di Nuria Iuzzolino
Power point inglese - educazione civica di Nuria Iuzzolino
 
Vip Firozabad Phone 8250092165 Escorts Service At 6k To 30k Along With Ac Room
Vip Firozabad Phone 8250092165 Escorts Service At 6k To 30k Along With Ac RoomVip Firozabad Phone 8250092165 Escorts Service At 6k To 30k Along With Ac Room
Vip Firozabad Phone 8250092165 Escorts Service At 6k To 30k Along With Ac Room
 
PowerDirector Explination Process...pptx
PowerDirector Explination Process...pptxPowerDirector Explination Process...pptx
PowerDirector Explination Process...pptx
 
75539-Cyber Security Challenges PPT.pptx
75539-Cyber Security Challenges PPT.pptx75539-Cyber Security Challenges PPT.pptx
75539-Cyber Security Challenges PPT.pptx
 
原版制作美国爱荷华大学毕业证(iowa毕业证书)学位证网上存档可查
原版制作美国爱荷华大学毕业证(iowa毕业证书)学位证网上存档可查原版制作美国爱荷华大学毕业证(iowa毕业证书)学位证网上存档可查
原版制作美国爱荷华大学毕业证(iowa毕业证书)学位证网上存档可查
 
20240508 QFM014 Elixir Reading List April 2024.pdf
20240508 QFM014 Elixir Reading List April 2024.pdf20240508 QFM014 Elixir Reading List April 2024.pdf
20240508 QFM014 Elixir Reading List April 2024.pdf
 
20240507 QFM013 Machine Intelligence Reading List April 2024.pdf
20240507 QFM013 Machine Intelligence Reading List April 2024.pdf20240507 QFM013 Machine Intelligence Reading List April 2024.pdf
20240507 QFM013 Machine Intelligence Reading List April 2024.pdf
 
Trump Diapers Over Dems t shirts Sweatshirt
Trump Diapers Over Dems t shirts SweatshirtTrump Diapers Over Dems t shirts Sweatshirt
Trump Diapers Over Dems t shirts Sweatshirt
 
一比一原版(Curtin毕业证书)科廷大学毕业证原件一模一样
一比一原版(Curtin毕业证书)科廷大学毕业证原件一模一样一比一原版(Curtin毕业证书)科廷大学毕业证原件一模一样
一比一原版(Curtin毕业证书)科廷大学毕业证原件一模一样
 
APNIC Updates presented by Paul Wilson at ARIN 53
APNIC Updates presented by Paul Wilson at ARIN 53APNIC Updates presented by Paul Wilson at ARIN 53
APNIC Updates presented by Paul Wilson at ARIN 53
 

A Journey into Evaluation: from Retrieval Effectiveness to User Engagement

  • 1. A Journey into Evaluation: from Retrieval Effectiveness to User Engagement Mounia Lalmas Yahoo Labs London mounia@acm.org SPIRE 2015 – King’s College London
  • 2. This talk § Introduction to user engagement § Evaluation in information retrieval (retrieval effectiveness) § From retrieval effectiveness to user engagement (from intra-session to inter-session evaluation) (from small- to large-scale evaluation)
  • 3. This talk beyond the click beyond relevance towards user engagement
  • 5. What is user engagement? “User engagement is a quality of the user experience that emphasizes the phenomena associated with wanting to use a technological resource longer and frequently” (Attfield et al, 2011) self-report: happy, sad, enjoyment, … emotional, cognitive and behavioural connection that exists, at any point in time and over time, between a user and a technological resource analytics: click, upload, read, comment, share … physiology: gaze, body heat, mouse movement, …
  • 6. Why is it important to engage users? §  In today’s wired world, users have enhanced expectations about their interactions with technology … resulting in increased competition amongst the purveyors and designers of interactive systems. §  In addition to utilitarian factors, such as usability, we must consider the hedonic and experiential factors of interacting with technology, such as fun, fulfillment, play, and user engagement. (O’Brien, Lalmas & Yom-Tov, 2014)
  • 7. Online sites differ with respect to their engagement pattern Games Users spend much time per visit Search Users come frequently and do not stay long Social media Users come frequently and stay long Niche Users come on average once a week e.g. weekly post News Users come periodically, e.g. morning and evening Service Users visit site, when needed, e.g. to renew subscription (Lehmann et al., 2012)
  • 8. Characteristics of user engagement Novelty (Webster & Ho, 1997; O’Brien, 2008) Richness and control (Jacques et al, 1995; Webster & Ho, 1997) Aesthetics (Jacques et al, 1995; O’Brien, 2008) Endurability (Read, MacFarlane, & Casey, 2002; O’Brien, 2008) Focused attention (Webster & Ho, 1997; O’Brien, 2008) Reputation, trust and expectation (Attfield et al, 2011) Positive Affect (O’Brien & Toms, 2008) Motivation, interests, incentives, and benefits (Jacques et al., 1995; O’Brien & Toms, 2008) (O’Brien, Lalmas & Yom-Tov, 2014)
  • 9. Measuring user engagement Measures and their attributes: Self-report (questionnaire, interview, think-aloud and think-after protocols): subjective; short- and long-term; lab and field; small scale. Physiology (EEG, SCL, fMRI, eye tracking, mouse tracking): objective; short-term; lab and field; small and large scale. Analytics (within- and across-session metrics, data science): objective; short- and long-term; field; large scale.
  • 10. Attributes of user engagement § Scale (small versus large) § Setting (laboratory versus field) § Objective versus subjective § Temporality (short- versus long-term) We focus on 1.  Temporality: from intra- to inter-session 2.  Scalability: from small- to large-scale
  • 12. How to evaluate a search engine § Coverage § Speed § Query language § User interface § User happiness › Users find what they want and return to the search engine › Users complete the search task, where search is a means, not an end Sec. 8.6 (Manning, Raghavan & Schütze, 2008; Baeza-Yates & Ribeiro-Neto, 2011)
  • 13. Within an online session › July 2012 › 2.5M users › 785M page views › Categorization of the most frequently accessed sites • 11 categories (e.g. news), 33 subcategories (e.g. news finance, news society) • 760 sites from 70 countries/regions short sessions: average 3.01 distinct sites visited with revisitation rate 10% long sessions: average 9.62 distinct sites visited with revisitation rate 22% (Lehmann et al., 2013)
  • 14. Measuring user happiness Most common proxy: relevance of search results Sec. 8.1 (diagram: relevant and retrieved sets within all items, illustrating precision and recall) § User information need translated into a query § Relevance assessed relative to information need not the query § Example: › Information need: I am looking for tennis holiday in a country with no rain › Query: tennis academy good weather Evaluation measures: • precision, recall, R-precision; precision@n; mean average precision; F-measure; … • bpref; cumulative gains, …
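
The measures listed on this slide can be illustrated with a minimal sketch. It assumes binary relevance judgments and a ranking without duplicates; the function names and the document ids are made up for illustration and do not come from the talk.

```python
from typing import Sequence, Set


def precision(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """Fraction of retrieved documents that are relevant."""
    if not retrieved:
        return 0.0
    return sum(1 for d in retrieved if d in relevant) / len(retrieved)


def recall(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """Fraction of relevant documents that were retrieved."""
    if not relevant:
        return 0.0
    return sum(1 for d in set(retrieved) if d in relevant) / len(relevant)


def f_measure(p: float, r: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)


def precision_at(retrieved: Sequence[str], relevant: Set[str], n: int) -> float:
    """Precision over the top n ranks (precision@n)."""
    return sum(1 for d in retrieved[:n] if d in relevant) / n


def average_precision(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """Mean of precision values at each rank holding a relevant document.

    Relevant documents that were never retrieved contribute zero.
    """
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, doc in enumerate(retrieved, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


# Hypothetical ranking and judgments for one query.
ranking = ["d1", "d2", "d3", "d4", "d5"]
judged_relevant = {"d1", "d3", "d6"}
p = precision(ranking, judged_relevant)
r = recall(ranking, judged_relevant)
```

Mean average precision is then the mean of `average_precision` over a set of queries.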
  • 15. Measuring user happiness Most common proxy: relevance of search results Sec. 8.1 Explicit signals Test collection methodology (TREC, CLEF, …) Human labeled corpora Implicit signals User behavior in online settings (clicks, skips, …)
  • 16. Examples of implicit signals in web search §  Number of clicks §  Click at given position §  Time to first click §  Skipping §  Abandonment rate §  Number of query reformulations §  Dwell time
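
A few of these signals can be computed from a click log along the following lines. This is a sketch under stated assumptions: the `Impression` record and its fields are hypothetical, click-through rate is taken as the share of result pages with at least one click, and abandonment as the share with none.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Impression:
    """One result-page view (hypothetical log record).

    clicks holds (rank position, seconds since the page was shown).
    """
    query: str
    clicks: List[Tuple[int, float]] = field(default_factory=list)


def abandonment_rate(impressions: List[Impression]) -> float:
    """Share of result pages that received no click at all."""
    if not impressions:
        return 0.0
    return sum(1 for i in impressions if not i.clicks) / len(impressions)


def click_through_rate(impressions: List[Impression]) -> float:
    """Share of result pages that received at least one click."""
    if not impressions:
        return 0.0
    return sum(1 for i in impressions if i.clicks) / len(impressions)


def mean_clicks(impressions: List[Impression]) -> float:
    """Average number of clicks per result page."""
    if not impressions:
        return 0.0
    return sum(len(i.clicks) for i in impressions) / len(impressions)


def mean_time_to_first_click(impressions: List[Impression]) -> Optional[float]:
    """Average delay before the first click, over clicked pages only."""
    firsts = [min(t for _, t in i.clicks) for i in impressions if i.clicks]
    return sum(firsts) / len(firsts) if firsts else None


# Made-up log of four result pages.
log = [
    Impression("a", [(1, 4.0), (3, 20.0)]),
    Impression("b"),
    Impression("c", [(2, 6.0)]),
    Impression("d"),
]
```

None of these numbers says by itself whether the user was satisfied: a page with no click may still have answered the need directly.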
  • 17. What is a happy user in web search 1.  The user information need is satisfied 2.  The user has learned about a topic and even about other topics 3.  The system was inviting and even fun to use In-the-moment engagement Users active on a site or stayed long Long-term engagement Users come back frequently and over a long-term period USER ENGAGEMENT
  • 19. Click-through rates CTR new ranking algorithm new design of search result page …
  • 20. I just wanted the phone number … I am totally happy :-) No clicks
  • 21. Dwell time DWELL TIME used as a proxy of user experience Publisher click on an ad on mobile device Dwell time on non-optimized landing pages comparable and even higher than on mobile-optimized ones … when mobile optimized, users realize quickly whether they “like” the ad or not? (Lalmas et al., 2015) non-mobile optimized mobile optimized
  • 22. Multimedia search activities often driven by entertainment needs, not by information needs Relevance in multimedia search (Slaney, 2011)
  • 23. Explorative or serendipitous search (Miliaraki, Blanco & Lalmas, 2015)
  • 24. top most popular tweets top most popular tweets + geographically diverse Being from a central or peripheral location makes a difference. Peripheral users did not perceive the timeline as being diverse Objectivity versus subjectivity It should never be just about the algorithm, but also how users respond to what the algorithm returns to them → USER ENGAGEMENT (Eduardo Graells, 2015)
  • 26. Interactive Information Retrieval (Ingwersen, Human Aspects in IR, ESSIR 2011) USER ENGAGEMENT
  • 27. Beyond clicks and relevance towards user engagement § From intra- to inter-session evaluation ›  Dwell time and absence time ›  Linking strategy ›  Mobile advertising § From small- to large-scale evaluation ›  Eye-tracking and user engagement questionnaire ›  Mouse tracking and user engagement questionnaire happy users come back we need to properly identify the happy users
  • 29. From short- to long-term engagement: From intra- to inter-session engagement intra-session metric(s) inter-session metric(s) how users engage within a session? how users engage across sessions? We monitor We know what it will mean future engagement proxy
  • 31. intra-session metrics • Dwell time • Session duration • Bounce rate • Play time (video) • Mouse movement • Click through rate (CTR) • Number of pages viewed (click depth) • Conversion rate • Number of UGC (comments) • … Dwell time as a proxy of user interest Dwell time as a proxy of relevance Dwell time as a proxy of conversion Dwell time as a proxy of post-click ad quality … User engagement metrics intra-session inter-session
  • 32. Dwell time § Definition The contiguous time spent on a site or web page § Similar measures Play time (for video sites) § Cons Not clear that the user was actually looking at the site while there → blur/focus Distribution of dwell times on 50 websites (O’Brien, Lalmas & Yom-Tov, 2014)
  • 33. Dwell time Dwell time varies by site type: •  leisure sites tend to have longer dwell times than news, e-commerce, etc. Dwell time has a relatively large variance even for the same site Dwell time on 50 websites (tourists, VIP, active … users) (O’Brien, Lalmas & Yom-Tov, 2014)
  • 34. Dwell time across sessions or absence time
  • 35. The context – search experience
  • 36. The context – search experience
  • 37. Absence time and survival analysis (figure: survival curves for story 1 to story 9; x-axis: hours, 0 to 20; y-axis: 0.0 to 1.0) Users (%) who did come back Users (%) who read story 2 but did not come back after 10 hours SURVIVE DIE DIE = RETURN TO SITE ⇒ SHORT ABSENCE TIME
  • 38. Absence time applied to search Ranking function on Yahoo Answer Japan Two weeks of click data on Yahoo Answer Japan: search One million users Six ranking functions 30-minute session boundary
  • 39. survival analysis: high hazard rate (die quickly) = short absence Absence time and number of clicks on search result page (figure legend: 3 clicks, 5 clicks, control = no click)
  • 40. Absence time – search experience 1. No click means a bad user experience 2. Clicking between 3-5 results leads to same user experience 3. Clicking on more than 5 results reflects poorer user experience; users cannot find what they are looking for 4. Clicking lower in the ranking (2nd, 3rd) suggests more careful choice from the user (compared to 1st) 5. Clicking at bottom is a sign of low quality overall ranking 6. Users finding their answers quickly (time to 1st click) return sooner to the search application 7. Returning to the same search result page is a worse user experience than reformulating the query search session metrics → absence time (Dupret & Lalmas, 2013)
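
The survival view of absence time used on the preceding slides can be sketched with a Kaplan-Meier estimate, where the "death" event is a user returning to the site. This is an illustrative reconstruction, not the analysis code behind the slides; the function name and the sample data are assumptions.

```python
from typing import List, Tuple


def kaplan_meier(hours_absent: List[float],
                 returned: List[bool]) -> List[Tuple[float, float]]:
    """Kaplan-Meier estimate of the probability of still being absent.

    hours_absent[i] is the time until user i returned, or the time the
    user was last observed if they had not returned yet (censored).
    returned[i] is True when the return was actually observed.
    Returns (t, S(t)) at each time where at least one return occurred.
    """
    pairs = sorted(zip(hours_absent, returned))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        events = 0
        leaving = 0
        # Group all users whose recorded time equals t.
        while i < len(pairs) and pairs[i][0] == t:
            events += int(pairs[i][1])
            leaving += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Made-up data: six users, three of them not seen again by the end
# of their observation window (censored).
curve = kaplan_meier(
    [2, 5, 5, 8, 10, 10],
    [True, True, False, True, False, False],
)
```

A curve that drops faster means users "die" (return) sooner, that is, a shorter absence time; comparing curves across ranking functions or click patterns is the kind of comparison the slides describe.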
  • 42. The context – Linking strategy in online news News provider Related off-site content p(absence12h) No Click Off-site click Off-site link → absence time Providing links to related off-site content has a positive long-term effect (Lehmann et al., In Progress)
  • 43. The Context – Mobile advertising (figure: ad click difference, 0% to 600%, for short ad clicks versus long ad clicks) Dwell time → ad click Positive post-click experience (“long” clicks) has an effect on users clicking on ads again (Lalmas et al., 2015)
  • 44. Beyond clicks and relevance towards user engagement § From intra- to inter-session evaluation ›  Dwell time and absence time ›  Linking strategy ›  Mobile advertising happy users come back
  • 46. Small scale measurement – focused attention questionnaire 5-point scale (strongly disagree to strongly agree) 1. I lost myself in this news tasks experience 2. I was so involved in my news tasks that I lost track of time 3. I blocked things out around me when I was completing the news tasks 4. When I was performing these news tasks, I lost track of the world around me 5. The time I spent performing these news tasks just slipped away 6. I was absorbed in my news tasks 7. During the news tasks experience I let myself go (O'Brien & Toms, 2010)
  • 47. Small scale measurement – PANAS questionnaire (10 positive items and 10 negative items) §  You feel this way right now, that is, at the present moment [1 = very slightly or not at all; 2 = a little; 3 = moderately; 4 = quite a bit; 5 = extremely] [randomize items] distressed, upset, guilty, scared, hostile, irritable, ashamed, nervous, jittery, afraid interested, excited, strong, enthusiastic, proud, alert, inspired, determined, attentive, active (Watson, Clark & Tellegen, 1988)
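
Scoring this questionnaire can be sketched as follows, assuming the usual PANAS convention of summing the ten ratings on each scale (so each score ranges from 10 to 50). The item lists are taken from the slide; the function name and the validation rules are illustrative.

```python
from typing import Dict, Tuple

POSITIVE_ITEMS = [
    "interested", "excited", "strong", "enthusiastic", "proud",
    "alert", "inspired", "determined", "attentive", "active",
]
NEGATIVE_ITEMS = [
    "distressed", "upset", "guilty", "scared", "hostile",
    "irritable", "ashamed", "nervous", "jittery", "afraid",
]


def panas_scores(ratings: Dict[str, int]) -> Tuple[int, int]:
    """Return (positive affect, negative affect) for one respondent.

    ratings maps each of the 20 items to a value from 1
    (very slightly or not at all) to 5 (extremely).
    """
    for item in POSITIVE_ITEMS + NEGATIVE_ITEMS:
        if item not in ratings:
            raise ValueError(f"missing rating for item: {item}")
        if not 1 <= ratings[item] <= 5:
            raise ValueError(f"rating out of range for item: {item}")
    positive = sum(ratings[item] for item in POSITIVE_ITEMS)
    negative = sum(ratings[item] for item in NEGATIVE_ITEMS)
    return positive, negative


# Made-up respondent: moderately positive, barely negative.
example = {item: 4 for item in POSITIVE_ITEMS}
example.update({item: 1 for item in NEGATIVE_ITEMS})
scores = panas_scores(example)
```

Item order is randomized when the questionnaire is presented, so scoring by item name rather than by position avoids mix-ups.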
  • 48. Small scale measurement – gaze and self-reporting News interest 57 users reading task (114) • questionnaire (qualitative data) • record eye tracking (quantitative data) Three metrics: gaze, focused attention and positive affect All three metrics align: interesting content promotes all engagement metrics (Arapakis et al., 2014)
  • 49. From small- to large-scale measurement – mouse tracking §  Navigation & interaction with digital environment usually involves the use of a mouse (selecting, positioning, clicking) §  Several works show mouse cursor as weak proxy of gaze (attention) §  Low-cost, scalable alternative §  Can be performed in a non-invasive manner, without removing users from their natural setting
  • 50. Relevance, dwell time & cursor “reading” a relevant long document vs “scanning” a long non-relevant document (Guo & Agichtein, 2012)
  • 52. Mouse tracking and self-reporting § 324 users from Amazon Mechanical Turk (between-subject design) § Two tasks (reading and search) § “Normal vs Ugly” interface § Questionnaires (qualitative data) › focused attention, positive affect › interest, aesthetics § Mouse tracking (quantitative data) › movement speed, movement rate, click rate, pause length, percentage of time still (Warnock & Lalmas, 2015)
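
Some of the quantitative mouse features named on this slide could be derived from raw cursor samples roughly as below. The slide does not define the features, so every definition here is an assumption: the cursor counts as still when two consecutive samples share the same position, speed is distance divided by moving time, and a pause is a maximal run of still intervals.

```python
import math
from typing import Dict, List, Tuple

# One cursor sample: (time in seconds, x in pixels, y in pixels).
Sample = Tuple[float, float, float]


def cursor_features(samples: List[Sample], clicks: int = 0) -> Dict[str, float]:
    """Compute simple cursor features from time-ordered samples."""
    if len(samples) < 2:
        raise ValueError("need at least two cursor samples")
    total_time = samples[-1][0] - samples[0][0]
    distance = 0.0
    still_time = 0.0
    pauses: List[float] = []
    current_pause = 0.0
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dt = t1 - t0
        step = math.hypot(x1 - x0, y1 - y0)
        if step == 0:
            still_time += dt
            current_pause += dt
        else:
            distance += step
            if current_pause > 0:
                pauses.append(current_pause)
                current_pause = 0.0
    if current_pause > 0:
        pauses.append(current_pause)
    moving_time = total_time - still_time
    return {
        "movement_speed": distance / moving_time if moving_time > 0 else 0.0,
        "click_rate": clicks / total_time if total_time > 0 else 0.0,
        "mean_pause_length": sum(pauses) / len(pauses) if pauses else 0.0,
        "fraction_time_still": still_time / total_time if total_time > 0 else 0.0,
    }


# Made-up trace: move, rest for two seconds, move again; two clicks.
trace = [(0.0, 0, 0), (1.0, 3, 4), (2.0, 3, 4), (3.0, 3, 4), (4.0, 6, 8)]
features = cursor_features(trace, clicks=2)
```

Coarse aggregates of this kind are what the later slides suggest may not be sophisticated enough on their own.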
  • 53. Mouse tracking could not tell much about • focused attention and positive affect • user interests in the task/topic • aesthetics BUT › “ugly” variant did not result in lower USER aesthetics scores › although BBC > Wikipedia BUT – the comments left … › Wikipedia: “The website was simply awful. Ads flashing everywhere, poor text colors on a dark blue background.”; “The webpage was entirely blue. I don't know if it was supposed to be like that, but it definitely detracted from the browsing experience.” › BBC News: “The website's layout and color scheme were a bitch to navigate and read.”; “Comic sans is a horrible font.”
  • 54. Flawed methodology? Non-existing signal? Wrong metric? Wrong measure? § Hawthorne Effect § Design ›  Usability versus engagement ›  Within- versus between-subject § Mouse movement was not sophisticated enough
  • 55. Mouse Gestures → Features (figure: cursor trajectory through points x0y0 to x8y8 over time t, with Δt rest intervals; resting cursor at 500 ms, 1000 ms and 1500 ms; click) (Arapakis, Lalmas & Valkanas, 2014) 22 users reading two articles 176,550 cursor positions 2,913 mouse gestures
  • 56. Towards a taxonomy of mouse gestures for user engagement measurement §  The top-ranked clustering configuration is the Spectral Clustering for the original dataset, with hyperbolic tangent kernel, for k = 38 •  certain types of mouse gestures occur more or less often, depending on user interest in article •  significant correlations between certain types of mouse gestures and self- report measures •  cursor behaviour goes beyond measuring frustration •  inform about the positive and negative interaction
  • 57. Beyond clicks and relevance towards user engagement § From small- to large-scale evaluation ›  Eye-tracking and user engagement questionnaire ›  Mouse tracking and user engagement questionnaire we need to properly identify the happy users
  • 59. Towards User Engagement happy users come back we need to properly identify the happy users
  • 60. §  “If you cannot measure it, you cannot improve it” William Thomson (Lord Kelvin) §  “You cannot control what you cannot measure” DeMarco §  “The way you measure is more important than what you measure” Art Gust Thank you