Search Engine Switching
Ryen White, Susan Dumais
Microsoft Research

Presented by: Hemanth Kumar Mantri
What is Search Engine Switching?

  ● Voluntary transition from one Web search engine to
    another
     ○ E.g., query Google, then query Yahoo or Bing


  ● Focus on within-session switching

  ● Variants:
     ○ between-session switching: switch for different tasks
     ○ long-term switching: suddenly or gradually over time
Outline

● Introduction and Motivation
● Methodology
   ○ Log based analysis
   ○ User survey
● Characterizing search engine switching
   ○ Overview of log and survey data
   ○ Pre-/post-switch behaviors
● Predicting switching
Motivation
● Common among users
   ○ We switch within and between sessions
● Engine switching is important to search providers
   ○ Customers lost => revenue lost
● Little is known about:
   ○ Rationale behind switching
   ○ Switching behavior
   ○ Features most useful in predicting switching events
● Prediction helps
   ○ Origin engine could improve itself
   ○ Destination engine could get ready
Methods

● Log-based Analysis
   ○ From consenting browser toolbar users (6 months of logs)
   ○ English-speaking US locale only
   ○ Search sessions extracted from logs
      ■ Start with query and end with 30-minute inactivity timeout
      ■ Can have queries to multiple engines
● User Survey/Questionnaire
   ○ 488 randomly chosen Microsoft employees
   ○ Switching rationale (logs don't have this)
   ○ Recent switching episodes, frequency of switching,
     patterns of behavior prior to switching
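
The session definition above (start with a query, end after a 30-minute inactivity timeout) can be sketched as follows. This is only an illustration: the `(timestamp, action, engine)` event layout and the function name `split_sessions` are assumptions, not the authors' log format.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # inactivity timeout from the slide


def split_sessions(events):
    """Group one user's events into search sessions.

    `events` is a list of (timestamp, action, engine) tuples; this layout
    is hypothetical. A session starts with a query and ends once more than
    30 minutes pass without activity.
    """
    sessions, current, last_time = [], [], None
    for ts, action, engine in sorted(events, key=lambda e: e[0]):
        if current and ts - last_time > SESSION_TIMEOUT:
            sessions.append(current)  # timeout reached: close the session
            current = []
        if not current and action != "query":
            continue  # a session must begin with a query
        current.append((ts, action, engine))
        last_time = ts
    if current:
        sessions.append(current)
    return sessions
```

A session built this way may contain queries to more than one engine, which is what makes within-session switching observable.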
Switching Characteristics - Logs

  ● 4% of all search sessions have >= 1 switching
    event
  ● Switching events:
     ○ 58.6 million in 6-month period
     ○ 12.6% involved same query
     ○ Two-thirds from browser search box
  ● Users (14.2 million):
     ○ 72.6% used multiple engines
     ○ 50% switched search engine within a session
     ○ 67.6% used different engines for different sessions
     ○ 4.4% defected from engine A to engine B and never
       came back
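
A minimal sketch of how within-session switching events might be detected from a session's queries. The `(engine, query_text)` pair layout and the case/whitespace normalization used to decide "same query" are assumptions made for illustration.

```python
def find_switches(session_queries):
    """Return one tuple per within-session engine switch.

    `session_queries` is a time-ordered list of (engine, query_text) pairs
    (a hypothetical layout). Each result is
    (index, origin_engine, destination_engine, same_query).
    """
    switches = []
    for i in range(1, len(session_queries)):
        prev_engine, prev_query = session_queries[i - 1]
        engine, query = session_queries[i]
        if engine != prev_engine:
            # Assumption: queries match if equal after trimming and lowercasing.
            same = query.strip().lower() == prev_query.strip().lower()
            switches.append((i, prev_engine, engine, same))
    return switches
```

A session counts toward the "sessions with >= 1 switching event" figure when this list is non-empty.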
Switching Characteristics - Logs
  ● Types of switching:
     ○ Browser
     ○ Navigate
     ○ QueryToNavigate
Switching Characteristics - Logs
● Longer sessions => more frequent switching
Switching Characteristics - Survey

 ● 70.5% of survey respondents reported having
   switched
    ○ Consistent with the Logs (72.6%)
 ● Users who did not switch:
    ○ Were satisfied with current engine (57.8%)
    ○ Believed no other engine would perform better (24.0%)
    ○ Felt that it was too much effort to switch (6.8%)
    ○ Other reasons included brand loyalty, trust, privacy
 ● Within-session switching:
    ○ 24.4% of switching users did so “Often” or “Always”
    ○ 66.8% of switching users did so “Sometimes”
Why?

http://www.searchenginepeople.com/

● Three types of reasons:
   ○ Dissatisfaction with original engine
   ○ Desire to verify or find additional information
   ○ User preference
How do search engine users behave before and after switching?
Pre-switch Behavior



 ● Mine logs to analyze pre-switch interactions
 ● Consider five actions:
    ○ Query
    ○ Pagination (request the next page of results)
    ○ Click on a result (SERP)
    ○ Click other page (non-SERP)
    ○ Navigate to page without click (e.g., address bar)
Pre-switch Behavior




● The most common actions immediately before a switch are
  queries and non-SERP clicks
Pre-switch Behavior


[Figure annotation: oscillations due to bucketing noise]

● P(Re-visitation) also increases rapidly just before a
  switch
Pre-switch Behavior (Survey)
Question: “Is there anything about your search behavior
 immediately preceding a switch that may indicate to an
    observer that you are about to switch engines?”

● Popular Answers:
   ○ Try several small query changes in pretty quick
     succession
   ○ Go to more than the first page of results, again often in
     quick succession and often without clicks
   ○ Go back and forth from SERP to individual results,
     without spending much time on any
   ○ Click on lots of links, then switch engine for additional info
   ○ Do not immediately click on something
Pre-switch Behavior Encoding

● Encode the actions into a sequence of characters
   ○ <Action, PageType>
   ○ Action:
      ■ (q)uery, (p)agination, (b)ack one page, etc.
   ○ PageType:
      ■ Basic: SE(R)P, non-SER(P)
      ■ Advanced: Include dwell times (short, medium,
        long)
   ○ E.g., qRqRqRqRnP (abbreviated qR*nP)
      ■ A series of queries, each time viewing the resulting
        SERP without clicking any search result, then
        (n)avigating to a non-SER(P) page via the
        browser address bar
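
The basic encoding can be sketched as a lookup from `<Action, PageType>` pairs to two-character codes. Only the codes named on the slide are included; the dictionary keys (`"query"`, `"serp"`, and so on) are labels chosen for this illustration.

```python
# Codes taken from the slide: (q)uery, (p)agination, (b)ack, (n)avigate.
ACTION_CODES = {"query": "q", "pagination": "p", "back": "b", "navigate": "n"}
# Basic page types: SE(R)P and non-SER(P).
PAGE_CODES = {"serp": "R", "non_serp": "P"}


def encode_session(actions):
    """Encode a list of (action, page_type) pairs as a character sequence."""
    return "".join(ACTION_CODES[a] + PAGE_CODES[p] for a, p in actions)


# Four queries with no result click, then an address-bar navigation.
example = [("query", "serp")] * 4 + [("navigate", "non_serp")]
print(encode_session(example))  # qRqRqRqRnP
```

The advanced encoding would additionally need a dwell-time bucket (short, medium, long) per page; the slide does not give the bucket boundaries, so they are left out here.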
Sequence Motifs
● Patterns that occur frequently in the sequences
● Test whether these patterns have a genuine
  association with switching behavior
● E.g., qR*: repeated query submissions with no SERP
  clicks, the most common pre-switch sequence
● Analyses using these sequence patterns were consistent
  with the first three popular answers from
  the user survey.
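
One way to check an encoded sequence for the repeated-query motif is a regular expression. This is a sketch under assumptions: the slide's `qR*` is read as "`qR` repeated", and the minimum of two repetitions is a threshold chosen here, not one stated in the talk.

```python
import re

# Assumption: "qR*" means the pair "qR" repeated; require at least two in a row.
REPEATED_QUERIES_NO_CLICK = re.compile(r"(?:qR){2,}")


def has_motif(sequence, motif=REPEATED_QUERIES_NO_CLICK):
    """True if the encoded sequence contains the motif anywhere."""
    return motif.search(sequence) is not None


def motif_rate(sequences, motif=REPEATED_QUERIES_NO_CLICK):
    """Fraction of sequences that contain the motif."""
    if not sequences:
        return 0.0
    return sum(has_motif(s, motif) for s in sequences) / len(sequences)
```

Comparing `motif_rate` on pre-switch sequences against sequences that do not end in a switch is one simple way to see whether a motif is associated with switching.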
Post-switch Behavior

 ● Analyzed switching events in the logs
   to determine the frequency of post-switch
   actions
 ● Consider six actions:
    ○ Click result (SERP)
    ○ Navigate to page without click (e.g., address bar)
    ○ Re-query destination engine
    ○ Re-query origin engine (switch back)
    ○ Query on other engine (switch to a third engine)
    ○ End session
Post-switch Behavior




● Extending the analysis beyond the next action:
   ○ 20% of switches eventually lead to a return to the origin engine
   ○ 6% of switches eventually lead to use of a third engine
● More than 50% of switches led to a result click. Are users satisfied?
Post-Switch Satisfaction
● Measures of user effort / activity (# Queries, #
  Actions)
● Measure of the quality of the interaction
   ○ % queries with No Clicks, # Actions to SAT (>30sec dwell)

                       # Queries              # Actions
   Activity        Origin  Destination    Origin  Destination
   All Queries      3.14      3.70         9.85     11.62
   Same Queries     3.08      3.73         9.03     10.25

                       % NoClicks         # Actions to SatAction
   Success         Origin  Destination    Origin  Destination
   All Queries      49.7      52.7         3.81      4.71
   Same Queries     54.5      59.7         3.67      4.61

● On the destination engine, users issue more queries and actions
  and seem less satisfied (higher % NoClicks and more actions to SAT)
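
The two quality measures can be sketched as below. The inputs (a list of click counts per query, and a list of dwell times per action) are hypothetical representations, and treating any action with a dwell above 30 seconds as the SAT action is an assumption based on the slide's "(>30sec dwell)" note.

```python
SAT_DWELL_SECONDS = 30  # dwell threshold from the slide


def pct_no_click_queries(clicks_per_query):
    """Percentage of queries that received no clicks."""
    if not clicks_per_query:
        return 0.0
    no_click = sum(1 for c in clicks_per_query if c == 0)
    return 100.0 * no_click / len(clicks_per_query)


def actions_to_sat(dwell_times):
    """Number of actions up to and including the first with dwell > 30 s.

    Returns None when no action in the list reaches the threshold.
    """
    for count, dwell in enumerate(dwell_times, start=1):
        if dwell > SAT_DWELL_SECONDS:
            return count
    return None
```

Computing both measures separately on the origin-engine and destination-engine portions of a session gives the kind of side-by-side comparison shown in the table.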
How can we predict switching?
Predicting Switching
● Goal: predict whether the next action in a session is a switch
● Model: logistic regression
● Feature classes:
   ○ Query – the last query issued in the current session
   ○ Session – the current session
   ○ User – the current user
● The aim of the experiment is not to optimize the model
   ○ Determine the predictive value of query/session/user features
   ○ Model held constant, feature combinations varied
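
A minimal sketch of the experimental setup: a fixed logistic regression learner applied to different feature-class subsets. The plain stochastic-gradient trainer, its learning rate and epoch count, and the enumeration of all seven non-empty subsets are assumptions for illustration; the slides do not say how the model was fit or which combinations were tried.

```python
import math
from itertools import combinations

FEATURE_CLASSES = ("query", "session", "user")


def feature_class_subsets():
    """All non-empty combinations of the three feature classes."""
    return [c for r in range(1, len(FEATURE_CLASSES) + 1)
            for c in combinations(FEATURE_CLASSES, r)]


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by stochastic gradient ascent on the log-likelihood."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))
            err = yi - p
            b += lr * err
            w = [wj + lr * err * xj for wj, xj in zip(w, xi)]
    return w, b


def predict_proba(model, x):
    """Estimated probability that the next action is a switch."""
    w, b = model
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))
```

Holding `train_logistic` fixed and changing only which feature columns go into `X` mirrors the "model held constant, feature combinations varied" design.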
Query features
abandonmentRate: Fraction of times query has no SERP click
avgClickPos: Average SERP click position (starts at zero)
avgNumClicks: Average number of SERP clicks
avgNumAds: Average number of advertisements shown
avgNumQuerySuggestions: Average number of query suggestions
avgNumResults: Average total number of search results
avgTokenLength: Average length of query tokens
followOnRatio: Fraction of times query leads to another query
frequencyCount: Total query frequency
hasAlteration: True if alteration applied (e.g., remove plurals)
hasOperators: True if query has operators (e.g., site:)
hasQuotes: True if query contains quotation marks
hasSpellCorrection: True if spell correction fires
paginationRate: Fraction of times the next page of results is requested
queryLength: Query length in characters
queryTokens: Query length in tokens
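
The features that depend only on the query text can be computed directly, as in this sketch. Whitespace tokenization and the operator list (only `site:`, the example given above) are assumptions; the log-derived features such as abandonmentRate need historical data and are not covered here.

```python
OPERATORS = ("site:",)  # only the operator named in the feature list


def query_text_features(query):
    """Compute the text-only query features for one query string."""
    tokens = query.split()  # assumption: tokens are whitespace-separated
    avg_len = sum(len(t) for t in tokens) / len(tokens) if tokens else 0.0
    return {
        "queryLength": len(query),
        "queryTokens": len(tokens),
        "avgTokenLength": avg_len,
        "hasQuotes": '"' in query,
        "hasOperators": any(t.lower().startswith(op)
                            for t in tokens for op in OPERATORS),
    }
```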
Session features

avgTimeBetweenQueries: Average time between queries
currentEngine: Current search engine name
currentSequenceAdvanced: Advanced string representation of session so far
currentSequenceBasic: Basic string representation of session so far
hasMotifAdvanced: True if currentSequenceAdvanced has seq. motif
hasMotifBasic: True if currentSequenceBasic has sequence motif
numBacks: Number of revisits in the session so far
numPaginations: Number of paginations in session so far
queriesInSession: Number of queries in the session so far
ratioQueriesWithNoClicks: Fraction of queries with no clicks
ratioQueriesWithOneClick: Fraction of queries with one click
ratioQueriesWithMultipleClicks: Fraction of queries with many clicks
timeInSession: Time in the session so far (in seconds)
URLsInSession: Number of URLs in session so far
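
Several of the session features can be derived from the basic sequence string plus per-query click counts. The function below is an illustrative sketch; its inputs and name are hypothetical, and it relies on lowercase letters denoting actions and uppercase letters denoting page types in the encoding.

```python
def session_features(basic_sequence, clicks_per_query, seconds_elapsed):
    """Compute a subset of the session features for the session so far.

    `basic_sequence` is the basic encoded string (e.g. "qRpRqR"),
    `clicks_per_query` lists the click count for each query issued.
    """
    n = len(clicks_per_query)

    def ratio(predicate):
        return sum(1 for c in clicks_per_query if predicate(c)) / n if n else 0.0

    return {
        "queriesInSession": basic_sequence.count("q"),
        "numPaginations": basic_sequence.count("p"),  # lowercase p = pagination
        "numBacks": basic_sequence.count("b"),
        "ratioQueriesWithNoClicks": ratio(lambda c: c == 0),
        "ratioQueriesWithOneClick": ratio(lambda c: c == 1),
        "ratioQueriesWithMultipleClicks": ratio(lambda c: c > 1),
        "timeInSession": seconds_elapsed,
    }
```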
User features

avgSessionLengthQueries: Average session length in queries
avgSessionLengthTime: Average session length in time
avgSessionLengthURLs: Average session length in URLs
avgQueryLength: Average query length in characters
avgQueryTokens: Average query length in tokens
propPreferredEngine: Fraction queries issued to preferred engine
sessionCount: Total number of sessions
Method

● Task: predict whether the next session action is an engine switch
● Used session states, where a state =
   ○ The interaction observed in a session up to a given point
   ○ Plus the most recent query and user ID (to get history)
● Trained on 100K states randomly sampled from logs
   ○ Ratio during sampling 1 : 99 (switch : no-switch)
   ○ Artificially re-balanced the training data and used bagging
● Tested on 100 x 10K random samples from unseen
  logs
● Precision and recall computed over 100 samples
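
The sampling and evaluation steps might look like the sketch below. Several choices are assumptions: the slides say the data were "re-balanced" without saying how (downsampling the no-switch class is used here), a bagging replicate is taken to be a bootstrap sample, and precision and recall over the test samples are combined by simple averaging.

```python
import random


def rebalance(states, rng):
    """Downsample no-switch states so both classes are equally represented.

    `states` is a list of (features, label) pairs with label 1 = switch.
    """
    switch = [s for s in states if s[1] == 1]
    no_switch = [s for s in states if s[1] == 0]
    kept = rng.sample(no_switch, min(len(switch), len(no_switch)))
    return switch + kept


def bootstrap(states, rng):
    """One bagging replicate: sample with replacement, same size as input."""
    return [rng.choice(states) for _ in states]


def precision_recall(labels, scores, threshold):
    """Precision and recall when predicting a switch for score >= threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for y, p in zip(labels, predicted) if p and y == 1)
    fp = sum(1 for y, p in zip(labels, predicted) if p and y == 0)
    fn = sum(1 for y, p in zip(labels, predicted) if not p and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def mean_precision_recall(samples, threshold):
    """Average precision and recall over a list of (labels, scores) samples."""
    pairs = [precision_recall(y, s, threshold) for y, s in samples]
    n = len(pairs)
    return sum(p for p, _ in pairs) / n, sum(r for _, r in pairs) / n
```

Sweeping `threshold` from high to low traces out the trade-off between precision and recall.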
Results
[Figure panels: "All sessions" and "All sessions with 3 or more queries so far"]

● Models trained on all features perform best; Session is the
  best single feature class
● Performance improves for longer sessions
   ○ More session information available
Predicting Switching - Usage
● Switch predictions seem usable, especially at low
  recall
● What can we do with switch predictions?
● Origin engine – predict a switch away from it
   ○ Offer additional query suggestions, reduce the number of ads
   ○ Enhance the UI with richer support for sorting or filtering
   ○ Devote more computational resources to ranking
● Destination engine – predict a switch to it (via
  toolbar)
   ○ Pre-fetch search results in anticipation of the incoming user
Thank You!
References:
 ● http://research.microsoft.com/en-us/um/people/ryenw/talks/pdf/whitedumaiscikm2009.pdf
 ● http://dl.acm.org/citation.cfm?id=1645967
Discussion!

Search Engine Switching

  • 1. Ryen White, Susan Dumais Microsoft Research Presented By: Hemanth Kumar Mantri
  • 2. What is Search Engine Switching? ● Voluntary transition from one Web search engine to another ○ Query Google then query Yahoo/Bing ● Focus on within-session switching ● Variants: ○ between-session switching: switch for different tasks ○ long-term switching: suddenly or gradually over time
  • 3. Outline ● Introduction and Motivation ● Methodology ○ Log based analysis ○ User survey ● Characterizing search engine switching ○ Overview of log and survey data ○ Pre-/post-switch behaviors ● Predicting switching
  • 4. Motivation ● Common to users ○ We switch with-in and between sessions ● Engine switching is important to search providers ○ customers lost => $$ Lost ● Little known things: ○ Rationale behind switching ○ Switching behavior ○ Features most useful in predicting switching events ● Prediction helps ○ origin engine could improve itself ○ destination engine could get ready
  • 5. Methods ● Log-based Analysis ○ From several consenting browser toolbar users (6 months) ○ Only English Speaking US locale ○ Search sessions extracted from logs ■ Start with query and end with 30-minute inactivity timeout ■ Can have queries to multiple engines ● User Survey/Questionnaire ○ 488 randomly chosen Microsoft employees ○ Switching rationale (logs don't have this) ○ Recent switching episodes, frequency of switching, patterns of behavior prior to switching
  • 6. Switching Characteristics - Logs ● 4% of all search sessions have >= 1 switching event ● Switching events: ○ 58.6 million in 6-month period ○ 12.6% involved same query ○ Two-thirds from browser search box ● Users (14.2 million): ○ 72.6% used multiple engines ○ 50% switched search engine within a session ○ 67.6% used different engines for different sessions ○ 4.4% defected from engine A to engine B and never came back
  • 7. Switching Characteristics - Logs ● Types of switching: ○ Browser ○ Navigate ○ QueryToNavigate
  • 8. Switching Characteristics - Logs ● Longer Sessions => Frequent Switching
  • 9. Switching Characteristics - Survey ● 70.5% of survey respondents reported having switched ○ Consistent with the Logs (72.6%) ● Users who did not switch: ○ Were satisfied with current engine (57.8%) ○ Believed no other engine would perform better (24.0%) ○ Felt that it was too much effort to switch (6.8%) ○ Other reasons included brand loyalty, trust, privacy ● Within-session switching: ○ 24.4% of switching users did so “Often” or “Always” ○ 66.8% of switching users did so “Sometimes”
  • 10. Why ? http://www.searchenginepeople.com/ ● Three types of reasons: ○ Dissatisfaction with original engine ○ Desire to verify or find additional information ○ User preference
  • 11. How do search engine users behave before and after switching?
  • 12. Pre-switch Behavior ● Mine logs to to analyze pre-switch interactions ● Consider five actions: ○ Query ○ Pagination (ask next page in the results) ○ Click on a result (SERP) ○ Click other page (non-SERP) ○ Navigate to page without click (e.g., address bar)
  • 13. Pre-switch Behavior ● Most common are queries and non-SERP clicks ● This is the action immediately before the switch
  • 14. Pre-switch Behavior Oscillations due to bucketing noise ● P(Re-visitation) also increases rapidly just before a switch
  • 15. Pre-switch Behavior (Survey) Question: “Is there anything about your search behavior immediately preceding a switch that may indicate to an observer that you are about to switch engines?” ● Popular Answers: ○ Try several small query changes in pretty quick succession ○ Go to more than the first page of results, again often in quick succession and often without clicks ○ Go back and forth from SERP to individual results, without spending much time on any ○ Click on lots of links, then switch engine for additional info ○ Do not immediately click on something
  • 16. Pre-switch behavior Encoding ● Encode the actions into a sequence of characters ○ <Action, PageType> ○ Action: ■ (q)uery, (p)agination, (b)ack one page, etc. ○ PageType: ■ Basic: SE(R)P, non-SER(P) ■ Advanced: Include dwell times (short, medium, long) ○ Eg: qRqRqRqRnP (= qR*nP) ■ series of queries, each time viewing the resultant SERP but not clicking on any search results and then (n)avigating to non-SER(P) page using browser address bar
  • 17. Sequence Motifs ● Patterns in the sequences that occur frequently ● Use these patterns to see if they have genuine association with switching behavior ● E.g., qR*: repeated submission of queries followed by no SERP clicks is the most common pre-switch sequence ● The analyses using these sequence patterns were consistent with the first three popular answers from the user survey.
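One way to check an encoded session for the qR* motif is a regular expression. This is a sketch under an assumption: "qR*" is read here as two or more consecutive qR pairs, which the slide does not define precisely. The names `QR_MOTIF` and `has_motif` are hypothetical.

```python
import re

# Assumption: the motif "qR*" means at least two consecutive <query, SERP> pairs.
QR_MOTIF = re.compile(r"(?:qR){2,}")


def has_motif(sequence):
    """Return True if the encoded session contains the repeated-query motif."""
    return QR_MOTIF.search(sequence) is not None


print(has_motif("qRqRqRqRnP"))  # True
print(has_motif("qRnP"))        # False
```

Because actions are lowercase and page types are uppercase in this encoding, a match for `qR` can only start at an action position, so no extra alignment check is needed.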
  • 18. Post-switch Behavior ● Analyzed switching events in the logs to determine the frequency of post-switch actions ● Consider six actions: ○ Click result (SERP) ○ Navigate to page without click (e.g., address bar) ○ Re-query destination engine ○ Re-query origin engine (switch back) ○ Query on other engine (switch to a third engine) ○ End session
  • 19. Post-switch Behavior ● Extending the analysis beyond next action: ○ 20% of switches eventually lead to return to origin engine ○ 6% of switches eventually lead to use of third engine ● > 50% led to a result click. Are users satisfied?
  • 20. Post-Switch Satisfaction ● Measures of user effort / activity (# Queries, # Actions) ● Measure of the quality of the interaction ○ % queries with No Clicks, # Actions to SAT (>30sec dwell) ● Activity (Origin vs. Destination): ○ All Queries: # Queries 3.14 vs. 3.70; # Actions 9.85 vs. 11.62 ○ Same Queries: # Queries 3.08 vs. 3.73; # Actions 9.03 vs. 10.25 ● Success (Origin vs. Destination): ○ All Queries: % NoClicks 49.7 vs. 52.7; # Actions to SatAction 3.81 vs. 4.71 ○ Same Queries: % NoClicks 54.5 vs. 59.7; # Actions to SatAction 3.67 vs. 4.61 ● Users issue more queries/actions; seem less satisfied (higher %NoClicks and more actions to SAT)
  • 21. How can we predict switching?
  • 22. Predicting Switching ● Goal: Predict whether next action in a session is a switch ● Learning model using logistic regression ● Feature classes: ○ Query – the last query issued in current session ○ Session – the current session ○ User – the current user ● Aim of experiment is not to optimize the model ○ Determine predictive value of query/session/user features ○ Model held constant, feature combinations varied
  • 23. Query features ○ abandonmentRate: Fraction of times query has no SERP click ○ avgClickPos: Average SERP click position (starts at zero) ○ avgNumClicks: Average number of SERP clicks ○ avgNumAds: Average number of advertisements shown ○ avgNumQuerySuggestions: Average number of query suggestions ○ avgNumResults: Average number of total search results ○ avgTokenLength: Average length of query tokens ○ followOnRatio: Fraction of times query leads to another query ○ frequencyCount: Total query frequency ○ hasAlteration: True if alteration applied (e.g., remove plurals) ○ hasOperators: True if query has operators (e.g., site:) ○ hasQuotes: True if query contains quotation marks ○ hasSpellCorrection: True if spell correction fires ○ paginationRate: Fraction of times the next page of results is requested ○ queryLength: Query length in characters ○ queryTokens: Query length in tokens
  • 24. Session features ○ avgTimeBetweenQueries: Average time between queries ○ currentEngine: Current search engine name ○ currentSequenceAdvanced: Advanced string representation of session so far ○ currentSequenceBasic: Basic string representation of session so far ○ hasMotifAdvanced: True if currentSequenceAdvanced has sequence motif ○ hasMotifBasic: True if currentSequenceBasic has sequence motif ○ numBacks: Number of revisits in the session so far ○ numPaginations: Number of paginations in session so far ○ queriesInSession: Number of queries in the session so far ○ ratioQueriesWithNoClicks: Fraction of queries with no clicks ○ ratioQueriesWithOneClick: Fraction of queries with one click ○ ratioQueriesWithMultipleClicks: Fraction of queries with many clicks ○ timeInSession: Time in the session so far (in seconds) ○ URLsInSession: Number of URLs in session so far
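A few of the count-based session features could plausibly be read straight off the basic sequence string. The sketch below assumes the two-character <Action, PageType> encoding described earlier in the deck, with q, p and b as the action codes; the function `session_counts` is hypothetical and covers only three of the listed features.

```python
def session_counts(sequence):
    """Derive simple count features from a basic <action, page type> sequence."""
    # Every even index holds an action code, every odd index a page-type code.
    actions = sequence[0::2]
    return {
        "queriesInSession": actions.count("q"),
        "numPaginations": actions.count("p"),
        "numBacks": actions.count("b"),
    }


counts = session_counts("qRpRqRbRnP")
print(counts)  # {'queriesInSession': 2, 'numPaginations': 1, 'numBacks': 1}
```

Time-based features such as timeInSession need timestamps from the raw log and cannot be recovered from the basic string.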
  • 25. User features ○ avgSessionLengthQueries: Average session length in queries ○ avgSessionLengthTime: Average session length in time ○ avgSessionLengthURLs: Average session length in URLs ○ avgQueryLength: Average query length in characters ○ avgQueryTokens: Average query length in tokens ○ propPreferredEngine: Fraction of queries issued to preferred engine ○ sessionCount: Total number of sessions
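For readers unfamiliar with logistic regression, the scoring step looks roughly like this: a weighted sum of feature values is passed through the logistic function to give a switch probability. The feature names below come from the slides, but the weights, the bias and the function `switch_probability` are made up for illustration and are not values from the study.

```python
import math


def switch_probability(features, weights, bias):
    """Logistic regression score: sigmoid of the weighted feature sum."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical weights; a trained model would learn these from the logs.
weights = {"ratioQueriesWithNoClicks": 2.0, "numPaginations": 0.5}
features = {"ratioQueriesWithNoClicks": 1.0, "numPaginations": 2.0}
probability = switch_probability(features, weights, bias=-4.0)
print(round(probability, 3))
```

Thresholding this probability at different values is what trades precision against recall.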
  • 26. Method ● Task: Predict if next session action is engine switch ● Used session states, where state = ○ Observed interaction in a session to a given point ○ Also includes most recent query and user id (to get history) ● Trained on 100K states randomly sampled from logs ○ Ratio during sampling 1 : 99 (switch : no-switch) ○ Artificially re-balanced the training data and used bagging ● Tested on 100 x 10K random samples from unseen logs ● Precision and recall computed over 100 samples
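The re-balancing and evaluation steps can be sketched as follows. The slide does not say how the training data was re-balanced, so downsampling the no-switch class is an assumption here, and bagging is omitted. The state dictionaries and the functions `rebalance` and `precision_recall` are hypothetical.

```python
import random


def rebalance(states, rng):
    """Downsample no-switch states so both classes have the same size (assumed method)."""
    switches = [s for s in states if s["switch"]]
    others = [s for s in states if not s["switch"]]
    return switches + rng.sample(others, len(switches))


def precision_recall(predicted, actual):
    """Compute precision and recall for boolean switch predictions."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if a and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data with the 1 : 99 switch-to-no-switch ratio mentioned on the slide.
states = [{"id": i, "switch": i % 100 == 0} for i in range(1000)]
balanced = rebalance(states, random.Random(0))
print(len(balanced))  # 20

print(precision_recall([True, True, False, False], [True, False, True, False]))  # (0.5, 0.5)
```

In the setup described on the slide, precision and recall would be computed on each of the 100 test samples and then aggregated.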
  • 27. Results (Figure panels: "All sessions" and "All sessions with 3 or more queries so far") ● Models trained on all features best; Session best class ● Performance improves for longer sessions ○ More session information available
  • 28. Predicting Switching - Usage ● Switch predictions seem useable, especially at low recall ● What can we do with switch predictions? ● Origin engine – predict switch away from them ○ Offer additional query suggestions, reduce number of ads ○ Enhance UI with richer support for sorting or filtering ○ Devote more computational resources to ranking ● Destination engine – predict switch to them (via toolbar) ○ Pre-fetch search results in anticipation of incoming user
  • 29. Thank You! References: ● http://research.microsoft.com/en-us/um/people/ryenw/talks/pdf/whitedumaiscikm2009.pdf ● http://dl.acm.org/citation.cfm?id=1645967