Yahoo! Answers
Siju Varghese
Engineer, Yahoo! Answers
IIT Chennai, Mar’12

Yahoo! Answers - background
    Community Question Answering (CQA)
    “Just because Google exists does not mean you should stop asking things”, A. Tsotsis, TechCrunch, 2010

•    Largest Community Question Answering
     Site with more than 1B answers (
     http://bit.ly/am1LwL )

•    4th most popular property on the Yahoo!
     network

•    3.5 answers per question and 11 minutes to the
     first answer, on average.

•    Community Moderated, no editorial
     control.

•    Supports 12 languages.


•    Model:
       –   Content-driven (different from Quora, which
           is social-driven)
       –   Subjective quality model (different from
           WikiAnswers or StackOverflow)
             •   By default, questions are broadcast to *all*
                 potential users
             •   Asker picks best answer (subjectivity rules):
                 “Quality in the eye of the asker” even if the
                 community disagrees
A veritable gold mine of data…

• User Interest Analysis
  – Seekers
  – Knowledge Experts



• Sentiment Analysis
A FEW RESEARCH PROJECTS
Let's meet some Answerers
Lilly http://answers.yahoo.com/activity?show=OzaaPpm8aa
Jane http://answers.yahoo.com/activity?show=hFEx4VF7aa
Alice http://answers.yahoo.com/activity?show=SZRbSI3Uaa
• Problem: About 15% of incoming questions are
  unanswered

• Key observation: Recurrent questions are
  prevalent in many categories
  –   What are the symptoms of cervical cancer?
  –   How much weight should I lose?
  –   How do you train your cat to use the litterbox?
  –   What is a black hole?


• Possible Solution: use past answered
  questions to automatically answer new
  questions.
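
The idea of reusing past answers can be sketched as a nearest-neighbour lookup over the archive of answered questions. The sketch below is an illustration only, not the system described in these slides: it assumes TF-IDF weighting with cosine similarity, and the function names, the `archive` structure, and the 0.5 threshold are all hypothetical choices.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase the text and split it into alphanumeric word tokens.
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def tfidf_vectors(docs):
    """Return (vectors, idf); each vector maps token -> TF-IDF weight."""
    tokenized = [tokenize(d) for d in docs]
    # Document frequency: in how many documents each token appears.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    n = len(docs)
    # Smoothed IDF (an assumed variant; others would work as well).
    idf = {tok: math.log((1 + n) / (1 + cnt)) + 1.0 for tok, cnt in df.items()}
    vectors = [
        {tok: tf * idf[tok] for tok, tf in Counter(toks).items()}
        for toks in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def answer_from_past(new_question, archive, threshold=0.5):
    """archive: list of (past_question, best_answer) pairs.

    Returns the best answer of the most similar past question, or None
    when nothing in the archive is similar enough (hypothetical threshold).
    """
    past_vectors, idf = tfidf_vectors([q for q, _ in archive])
    # Tokens never seen in the archive carry no weight in the query vector.
    counts = Counter(tok for tok in tokenize(new_question) if tok in idf)
    new_vec = {tok: tf * idf[tok] for tok, tf in counts.items()}
    best_score, best_answer = 0.0, None
    for vec, (_, answer) in zip(past_vectors, archive):
        score = cosine(new_vec, vec)
        if score > best_score:
            best_score, best_answer = score, answer
    return best_answer if best_score >= threshold else None
```

Returning `None` below the threshold matters here: posting a poorly matched answer is likely worse than leaving the question open for human answerers.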
Why is it difficult?
Title:
   How often should i jump?

Body:
   I have a 6yo cob, and we've started
   jumping, he's done it before, we're only
   jumping around 1-1,8 foot. He isn't fat
   and he's quite fit. I jumped him on
   Friday, but did just 3 small jumps, i did
   flatwork on Saturday, and jumped him 3
   times today then hacked out, tomorrow,
   should i just flatwork him or would he be
   ok to jump a few small singles? Like 3? I
   won't ride him again until
   Wednesday/Thursday to give him a
   break, what do you think? He is fine
   doing what he has done, he doesn't get
   sweaty much or tired, i myself am not
   ready to go any higher and would like to
   just do maybe 1 or 2 jumps? Is he ok to
   jump tomorrow? If not when next? Thank
   you for the help! I appreciate it all! :) xx

•   Non-informative, ambiguous title
•   Complex information need
•   Detailed and personal
•   Multiple questions posed in one
•   Grammatical errors, slang
•   Non-factual; opinion and recommendation are expected instead
•   Extensive variability among questions: in language style,
    cultural aspects, degree of detail
Auto Answering System
Findings
  • Robots have been answering for about a week,
    saturating their daily answering quota.
  • A significant fraction of the answers have been
    chosen as Best Answer, a much better rate than that
    of an average user




  • Their responses elicited discussions, and they
    acquired several fans
“Learning from the Past: Answering New Questions with Past Answers”
    - Anna Shtok, Gideon Dror, Idan Szpektor and Yoelle Maarek
http://www2012.wwwconference.org/program/accepted-papers/main-scientific-tracks/
Get me a question that I can answer
• Users want to answer new questions
   – No social information on such questions (item cold-start)
   – This well-known scenario in recommendation systems is our typical case


• Many users are new
   – Hardly any answering history (user cold-start)
   – The majority of registered users


• Current solution: show most recent questions in the
  category

• Goal: a question recommendation model that fits all
  user types
   – Active users
   – New users
   – Surfers
Question Recommender

   • Recommender-system approach (as in movie
     recommendation, but in a much larger and very sparse
     space)




   • Learn from past interactions with users and push
     relevant open questions


“I want to answer, who has a question? Yahoo! Answers Recommender System”
    - G. Dror, Y. Koren, Y. Maarek and I. Szpektor, KDD 2011, San Diego, CA
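
A rough sketch of how both cold-start cases might be handled is given below. It is not the model from the paper cited above: it swaps in a simple category-affinity heuristic, and the function name, the `open_questions` fields (`id`, `category`, `posted_at`), and the ranking rule are all assumptions made for illustration. A brand-new question can be ranked because only its category is used (item cold-start), and a user with no history falls back to the "most recent questions" baseline (user cold-start).

```python
from collections import Counter


def recommend(user_history, open_questions, k=3):
    """Rank open questions for one user.

    user_history: list of category names the user has answered in before
                  (empty for a new user).
    open_questions: list of dicts with hypothetical keys 'id', 'category'
                    and 'posted_at' (larger value means newer).
    """
    newest_first = sorted(
        open_questions, key=lambda q: q["posted_at"], reverse=True
    )
    if not user_history:
        # User cold-start: no history, so fall back to pure recency.
        return [q["id"] for q in newest_first[:k]]

    # Affinity = share of the user's past answers that fall in each category.
    counts = Counter(user_history)
    total = sum(counts.values())
    affinity = {cat: n / total for cat, n in counts.items()}

    # Python's sort is stable, so questions with equal affinity stay
    # in newest-first order.
    ranked = sorted(
        newest_first,
        key=lambda q: affinity.get(q["category"], 0.0),
        reverse=True,
    )
    return [q["id"] for q in ranked[:k]]
```

A learned model would replace the affinity table with signals from past interactions, but the fallback structure (personalised ranking when history exists, recency otherwise) would probably stay similar.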
Sentiment Analysis
• Why do people ask questions on Answers?
  – Users are more likely to register with Y!A to ask things
    they can’t ask elsewhere (“conversational”,
    “personal”)
  – When you can’t find what you are looking for on the web


• Different (demographic) groups use it differently
  – Women ask more conversational questions
  – Older people ask more informational questions
  – Women are more sentimental when answering a
    question than men.
  – In terms of attitude, men are more neutral, whereas
    women have a more positive attitude in their answers
Credits:
Giovanni Gardelli, Ingmar Weber, Antti Ukkonen, B. Barla Cambazoglu – Y! Research Barcelona
Onur Kucuktunc, The Ohio State University, US
Hakan Ferhatosmanoglu, Bilkent University, Turkey
Some more…
• Relating Answers activity to events around
  the world.
  – Huge drop in the science & math category in
    mid-December every year. This category is heavily used
    for homework help -- effect of Christmas break?


• ~35 hacks, small and big, during the 2011 Y!
  Hack Day event in Bangalore.
  – Android/iOS apps.
     • Location awareness.
  – Promote relevant Q&A on content pages.
  – Answers on other channels: Messenger, SMS, etc.
  – Custom UI themes
Enablers
• Answers Data on the Grid

• Answers APIs
  – V1: http://developer.yahoo.com/answers/
     • Provides only read capability.
  – V2 (internal beta)
     • Offers full-fledged capabilities: read and write.
     • Preview available for this HackU event.
Credits
• Yahoo! Research Labs, Haifa, Israel
     Gideon Dror, Yehuda Koren, Yoelle Maarek, Dan Pelleg, Idan Szpektor, Oleg Rokhlenko

• Yahoo! Research Labs, Barcelona
     Giovanni Gardelli, Ingmar Weber, Antti Ukkonen, B. Barla Cambazoglu

• Yahoo! Answers Engineering, Bangalore


Editor's Notes

  1. Y!A is still the largest community question answering site, with 1B answers. It is perhaps less “fashionable” than newcomers like Quora and less specialized than StackOverflow, but it achieves its goal, which is to satisfy askers whatever their intent. A side effect is perceived “poor quality”. Our approach, since quality is in the eye of the asker: simply don’t show potential askers questions that don’t resonate with them. Each set of questions has its own community, and quality is totally subjective. Yahoo! gets ~600 million unique users per month and 117 billion views per month.