Crowdsourcing for Information Retrieval:
Principles, Methods, and Applications

Omar Alonso
Microsoft

Matthew Lease
University of Texas at Austin

July 28, 2011




Tutorial Objectives
• What is crowdsourcing? (!= MTurk)
• How and when to use crowdsourcing?
• How to use Mechanical Turk
• Experimental setup and design guidelines for
  working with the crowd
• Quality control: issues, measuring, and improving
• IR + Crowdsourcing
      – research landscape and open challenges
Tutorial Outline
I. Introduction and Motivating Examples
II. Amazon Mechanical Turk (and CrowdFlower)
III. Relevance Judging and Crowdsourcing



IV. Design of Experiments (the good stuff)
V. From Labels to Human Computation
VI. Worker Incentives (money isn’t everything)
VII. The Road Ahead (+ refs at end)
Terminology We’ll Cover
• Crowdsourcing: more than a buzzword?
   – What is and isn’t crowdsourcing?
   – Subset we discuss: micro-tasks (diagram coming)
• Human Computation = having people do stuff
   – Functional view of human work, both helpful & harmful
• AMT / MTurk
   – HIT, Requester, Assignment, Turking & Turkers
• Quality Control (QC)
   – spam & spammers
   – label aggregation, consensus, plurality, multi-labeling
   – “gold” data, honey pots, verifiable answers, trap questions
I
INTRODUCTION TO CROWDSOURCING

From Outsourcing to Crowdsourcing
• Take a job traditionally
  performed by a known agent
  (often an employee)
• Outsource it to an undefined,
  generally large group of
  people via an open call
• New application of principles
  from open source movement

Community Q&A / Social Search /
                 Public Polling




Mechanical What?




Mechanical Turk (MTurk)




Chess machine constructed and
unveiled in 1770 by Wolfgang
von Kempelen (1734–1804)


        J. Pontin. Artificial Intelligence, With Help From
            the Humans. NY Times (March 25, 2007)
•    “Micro-task” crowdsourcing marketplace
•    On-demand, scalable, real-time workforce
•    Online since 2005 (and still in “beta”)
•    Programmer’s API & “Dashboard” GUI
This isn’t just a lab toy…




    http://www.mturk-tracker.com (P. Ipeirotis’10)

From 1/09 – 4/10, 7M HITs from 10K requesters
worth $500,000 USD (a significant under-estimate)
Why Crowdsourcing for IR?
• Easy, cheap and fast labeling
• Ready-to-use infrastructure
       – MTurk payments, workforce, interface widgets
       – CrowdFlower quality control mechanisms, etc.
• Allows early, iterative, frequent experiments
       – Iteratively prototype and test new ideas
       – Try new tasks, test when you want & as you go
• Proven in major IR shared task evaluations
       – CLEF image, TREC, INEX, WWW/Yahoo SemSearch
Legal Disclaimer:
                 Caution Tape and Silver Bullets




• Often still involves more art than science
• Not a magic panacea, but another alternative
   – one more data point for analysis, complements other methods
• Quality may be sacrificed for time/cost/effort
• Hard work & experimental design still required!
Hello World Demo
• We’ll show a simple, short demo of MTurk
• This is a teaser highlighting things we’ll discuss
       – Don’t worry about details; we’ll revisit them
• Specific task unimportant
• Big idea: easy, fast, cheap to label with MTurk!




Jane saw the man with the binoculars




DEMO


Traditional Annotation / Data Collection
• Setup data collection software / harness
• Recruit volunteers (often undergrads)
• Pay a flat fee for experiment or hourly wage

• Characteristics
       –    Slow
       –    Expensive
       –    Tedious
       –    Sample Bias

How about some real examples?
• Let’s see examples of MTurk’s use in prior
  studies (many areas!)
       – e.g. IR, NLP, computer vision, user studies, usability
         testing, psychological studies, surveys, …
• Check bibliography at end for more references




NLP Example – Dialect Identification




NLP Example – Spelling correction




NLP Example – Machine Translation
• Manual evaluation on translation quality is
  slow and expensive
• High agreement between non-experts and
  experts
• $0.10 to translate a sentence


    C. Callison-Burch. “Fast, Cheap, and Creative: Evaluating Translation Quality Using Amazon’s Mechanical Turk”, EMNLP 2009.

    B. Bederson et al. Translation by Iterative Collaboration between Monolingual Users, GI 2010



Snow et al. (2008). EMNLP

• 5 Tasks
       –    Affect recognition
       –    Word similarity
       –    Recognizing textual entailment
       –    Event temporal ordering
       –    Word sense disambiguation
• high agreement between crowd
  labels and expert “gold” labels
       – assumes training data for worker bias correction
• 22K labels for $26 !
CV Example – Painting Similarity




                                                      Kovashka & Lease, CrowdConf’10

IR Example – Relevance and ads




Okay, okay! I’m a believer!
       How can I get started with MTurk?
• You have an idea (e.g. novel IR technique)
• Hiring editors too difficult / expensive / slow
• You don’t have a large traffic query log

Can you test your idea via crowdsourcing?
• Is my idea crowdsourcable?
• How do I start?
• What do I need?
II
                AMAZON MECHANICAL TURK

The Requester
•    Sign up with your Amazon account
•    Amazon payments
•    Purchase prepaid HITs
•    There is no minimum or up-front fee
•    MTurk collects a 10% commission
•    The minimum commission charge is $0.005 per HIT
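
To get a feel for the fee schedule above, here is a minimal back-of-the-envelope cost sketch in Python. It simply applies the figures quoted on this slide (10% commission, $0.005 minimum) to hypothetical task parameters; check current MTurk pricing before budgeting a real experiment.

# Rough cost estimate for a batch of HITs, using the fee schedule quoted
# above (10% commission, $0.005 minimum). Task parameters are hypothetical.
def batch_cost(num_hits, assignments_per_hit, reward_per_assignment):
    fee = max(0.10 * reward_per_assignment, 0.005)      # commission per assignment
    return num_hits * assignments_per_hit * (reward_per_assignment + fee)

# Example: 1,000 query/document pairs, 5 judgments each, $0.02 per judgment.
print("Estimated cost: $%.2f" % batch_cost(1000, 5, 0.02))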




MTurk Dashboard
• Three tabs
       – Design
       – Publish
       – Manage
• Design
       – HIT Template
• Publish
       – Make work available
• Manage
       – Monitor progress


Dashboard - II




API
•    Amazon Web Services API
•    Rich set of services
•    Command line tools
•    More flexibility than dashboard




Dashboard vs. API
• Dashboard
       – Easy to prototype
       – Setup and launch an experiment in a few minutes
• API
       – Ability to integrate AMT as part of a system
       – Ideal if you want to run experiments regularly
       – Schedule tasks
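
To make the API option concrete, below is a minimal sketch assuming boto 2.x's MTurk bindings (boto is the Python library listed later under tools). It targets the MTurk sandbox, and the access keys, wording, reward, and the 95% approval-rate filter are placeholder choices, not a prescribed setup.

# Posting a simple relevance-judging HIT through the MTurk API with boto 2.x.
# Keys, text, reward, and the qualification threshold below are placeholders.
from boto.mturk.connection import MTurkConnection
from boto.mturk.question import (QuestionContent, Question, QuestionForm,
                                 AnswerSpecification, SelectionAnswer)
from boto.mturk.qualification import (Qualifications,
                                      PercentAssignmentsApprovedRequirement)

conn = MTurkConnection(aws_access_key_id='YOUR_KEY',
                       aws_secret_access_key='YOUR_SECRET',
                       host='mechanicalturk.sandbox.amazonaws.com')  # sandbox: no real payments

content = QuestionContent()
content.append_field('Title', 'Is this document relevant to the query "salad recipes"?')
answer = SelectionAnswer(min=1, max=1, style='radiobutton',
                         selections=[('Relevant', 'R'), ('Not relevant', 'NR')],
                         type='text', other=False)
question = Question(identifier='relevance',
                    content=content,
                    answer_spec=AnswerSpecification(answer),
                    is_required=True)
form = QuestionForm()
form.append(question)

# Built-in filter: only workers with an approval rate of at least 95%.
quals = Qualifications()
quals.add(PercentAssignmentsApprovedRequirement('GreaterThanOrEqualTo', 95))

conn.create_hit(questions=form,
                max_assignments=5,        # 5 redundant judgments per document
                title='Judge document relevance',
                description='Read a short document and judge its relevance.',
                keywords='relevance, judging, search',
                reward=0.02,              # dollars per assignment
                duration=600,             # seconds allowed per assignment
                qualifications=quals)

The same create_hit call can be scheduled from a cron job or wired into a larger evaluation pipeline, which is the main advantage over the Dashboard.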


But where do my labels come from?
• An all powerful black box?




• A magical, faraway land?




Nope, MTurk has actual workers too!
• Sign up with your Amazon account
• Tabs
       – Account: work approved/rejected
       – HIT: browse and search for work
       – Qualifications: browse & search qualifications
• Start turking!



Doing some work
• Strongly recommended
• Do some work before you create work




But who are
my workers?


• A. Baio, November 2008. The Faces of Mechanical Turk.
• P. Ipeirotis. March 2010. The New Demographics of
   Mechanical Turk.
• J. Ross, et al. Who are the Crowdworkers?... CHI 2010.
Worker Demographics
• 2008-2009 studies found
  less global and diverse
  than previously thought
       – US
       – Female
       – Educated
       – Bored
       – Money is secondary

2010 shows increasing diversity
47% US, 34% India, 19% other (P. Ipeirotis, March 2010)




Is MTurk my only choice? No, see below.
•     Crowdflower (since 2007, www.crowdflower.com)
•     CloudCrowd
•     DoMyStuff
•     Livework
•     Clickworker
•     SmartSheet
•     uTest
•     Elance
•     oDesk
•     vWorker (was rent-a-coder)
CrowdFlower (since 2007)
•   Labor on-demand
•   Channels
•   Quality control features
•   Sponsor: CSE’10, CSDM’11, CIR’11, TREC’11 Crowd Track




High-level Issues in Crowdsourcing
• Process
       – Experimental design, annotation guidelines, iteration
• Choose crowdsourcing platform (or roll your own)
• Human factors
       – Payment / incentives, interface and interaction design,
         communication, reputation, recruitment, retention
• Quality Control / Data Quality
       – Trust, reliability, spam detection, consensus labeling


III
                RELEVANCE JUDGING & CROWDSOURCING

Relevance and IR
• What is relevance?
       – Multidimensional
       – Dynamic
       – Complex but systematic and measurable
• Relevance in Information Retrieval
• Frameworks
• Types
       –    System or algorithmic
       –    Topical
       –    Pertinence
       –    Situational
       –    Motivational


Evaluation
• Relevance is hard to evaluate
       – Highly subjective
       – Expensive to measure
• Click data
• Professional editorial work
• Verticals



Crowdsourcing and Relevance Evaluation

• For relevance, it combines two main
  approaches
       – Explicit judgments
       – Automated metrics
• Other features
       – Large scale
       – Inexpensive
       – Diversity

User Studies
• Investigate attitudes about saving, sharing, publishing,
  and removing online photos
• Survey
       – A scenario-based probe of respondent attitudes, designed
         to yield quantitative data
       – A set of questions (closed and open-ended)
       – Importance of recent activity
       – 41 questions
       – 7-point scale
• 250 respondents

   C. Marshall and F. Shipman. “The Ownership and Reuse of Visual Media”, JCDL 2011.


Elicitation Criteria
•     Relevance in a vertical like e-commerce
•     Are the classical criteria right for e-commerce?
•     Classical criteria (Barry and Schamber)
        –    Accuracy & validity, consensus within the field, content
             novelty, depth & scope, presentation, recency, reliability,
             verifiability

•     E-commerce criteria
        –    Brand name, product name, price/value (cheap,
             affordable, expensive, not suspiciously cheap),
             availability, ratings & user reviews, latest model/version,
             personal aspects, perceived value, genre & age

•     Experiment
        –    Select e-C and non e-C queries

        –    Each worker: 1 query/need (e-C or non e-C)

        –    7 workers per HIT

O. Alonso and S. Mizzaro. “Relevance criteria for e-commerce: a crowdsourcing-based experimental analysis”, SIGIR 2009.

IR Example – Product Search




IR Example – Snippet Evaluation
•    Study on summary lengths
•    Determine preferred result length
•    Asked workers to categorize web queries
•    Asked workers to evaluate snippet quality
•    Payment between $0.01 and $0.05 per HIT


    M. Kaisser, M. Hearst, and L. Lowe. “Improving Search Results Quality by Customizing Summary Lengths”, ACL/HLT, 2008.




IR Example – Relevance Assessment
•     Replace TREC-like relevance assessors with MTurk?
•     Selected topic “space program” (011)
•     Modified original 4-page instructions from TREC
•     Workers more accurate than original assessors!
•     40% provided justification for each answer


    O. Alonso and S. Mizzaro. “Can we get rid of TREC assessors? Using Mechanical Turk for relevance assessment”, SIGIR Workshop
    on the Future of IR Evaluation, 2009.



IR Example – Timeline Annotation
• Workers annotate timeline on politics, sports, culture
• Given a timex (1970s, 1982, etc.), suggest an associated event
• Given an event (Vietnam, World Cup, etc.), suggest a timex




 K. Berberich, S. Bedathur, O. Alonso, G. Weikum “A Language Modeling Approach for Temporal Information Needs”. ECIR 2010




IR Example – Is Tweet Interesting?
• Detecting uninteresting content in text streams
       – Alonso et al. SIGIR 2010 CSE Workshop.
• Is this tweet interesting to the author and
  friends only?
• Workers classify tweets
• 5 tweets per HIT, 5 workers, $0.02
• 57% is categorically not interesting


Started with a joke …




Results for {idiot} at WSDM
February 2011: 5/7 (R), 2/7 (NR)
    –    Most of the time those TV reality stars have absolutely no talent. They do whatever
         they can to make a quick dollar. Most of the time the reality tv stars don not have
         a mind of their own.   R
    –    Most are just celebrity wannabees. Many have little or no talent, they just want
         fame. R
    –    I can see this one going both ways. A particular sort of reality star comes to
         mind, though, one who was voted off Survivor because he chose not to use his
         immunity necklace. Sometimes the label fits, but sometimes it might be unfair. R
    –    Just because someone else thinks they are an "idiot", doesn't mean that is what the
         word means. I don't like to think that any one person's photo would be used to
         describe a certain term.   NR
    –    While some reality-television stars are genuinely stupid (or cultivate an image of
         stupidity), that does not mean they can or should be classified as "idiots." Some
         simply act that way to increase their TV exposure and potential earnings. Other
         reality-television stars are really intelligent people, and may be considered as
         idiots by people who don't like them or agree with them. It is too subjective an
         issue to be a good result for a search engine. NR
    –    Have you seen the knuckledraggers on reality television? They should be required to
         change their names to idiot after appearing on the show. You could put numbers
         after the word idiot so we can tell them apart. R
    –    Although I have not followed too many of these shows, those that I have encountered
         have for a great part a very common property. That property is that most of the
         participants involved exhibit a shallow self-serving personality that borders on
         social pathological behavior. To perform or act in such an abysmal way could only
         be an act of an idiot. R
Two Simple Examples of MTurk
1. Ask workers to classify a query
2. Ask workers to judge document relevance

Steps
• Define high-level task
• Design & implement interface & backend
• Launch, monitor progress, and assess work
• Iterate design

Query Classification Task
•    Ask the user to classify a query
•    Show a form that contains a few categories
•    Upload a few queries (~20)
•    Use 3 workers
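
If you build this with the Dashboard, the template expects a CSV input file with one row per HIT and a column name matching the ${...} placeholder in your HIT design. A minimal sketch follows; the file name, column name, and queries are just examples.

# Write the input file for a Dashboard HIT template with a ${query} placeholder.
# File name, column name, and the query list are illustrative.
import csv

queries = ["britney spears", "used car prices", "sigir 2011",
           "how to fix a flat tire", "cheap flights to rome"]   # ~20 in practice

with open("query_classification_input.csv", "w") as f:
    writer = csv.writer(f)
    writer.writerow(["query"])          # one column -> one ${query} per HIT
    for q in queries:
        writer.writerow([q])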




DEMO


Relevance Evaluation Task
•    Relevance assessment task
•    Use a few documents from TREC
•    Ask user to perform binary evaluation
•    Modification: graded evaluation
•    Use 5 workers




DEMO


Typical Workflow
•    Define and design what to test
•    Sample data
•    Design the experiment
•    Run experiment
•    Collect data and analyze results
•    Quality control



Crowdsourcing in Major IR Evaluations
• CLEF image
      • Nowak and Ruger, MIR’10
• TREC blog
      • McCreadie et al., CSE’10, CSDM’11
• INEX book
      • Kazai et al., SIGIR’11
• SemSearch
      • Blanco et al., SIGIR’11



BREAK
IV
                              DESIGN OF EXPERIMENTS

Survey Design
•    One of the most important parts
•    Part art, part science
•    Instructions are key
•    Prepare to iterate




Questionnaire Design
• Ask the right questions
• Workers may not be IR experts, so don’t
  assume they share your terminology
• Show examples
• Hire a technical writer
       – Engineer writes the specification
       – Writer communicates

UX Design
• Time to apply all those usability concepts
• Generic tips
       – Experiment should be self-contained.
       – Keep it short and simple. Brief and concise.
       – Be very clear with the relevance task.
       – Engage with the worker. Avoid boring stuff.
       – Always ask for feedback (open-ended question) in
         an input box.

UX Design - II
•    Presentation
•    Document design
•    Highlight important concepts
•    Colors and fonts
•    Need to grab attention
•    Localization



Examples - I
• Asking too much, task not clear, “do NOT/reject”
• Worker has to do a lot of stuff




Example - II
• Lot of work for a few cents
• Go here, go there, copy, enter, count …




A Better Example
• All information is available
       – What to do
       – Search result
       – Question to answer




Form and Metadata
• Form with a closed question (binary relevance) and
  open-ended question (user feedback)
• Clear title, useful keywords
• Workers need to find your task




TREC Assessment – Example I




TREC Assessment – Example II




How Much to Pay?
• Price commensurate with task effort
      – Ex: $0.02 for yes/no answer + $0.02 bonus for optional feedback
• Ethics & market-factors: W. Mason and S. Suri, 2010.
      – e.g. non-profit SamaSource contracts workers in refugee camps
      – Predict right price given market & task: Wang et al. CSDM’11
• Uptake & time-to-completion vs. Cost & Quality
      – Too little $$, no interest or slow – too much $$, attract spammers
      – Real problem is lack of reliable QA substrate
• Accuracy & quantity
      – More pay = more work, not better (W. Mason and D. Watts, 2009)
• Heuristics: start small, watch uptake and bargaining feedback
• Worker retention (“anchoring”)


See also: L.B. Chilton et al. KDD-HCOMP 2010.
Development Framework
• Incremental approach
• Measure, evaluate, and adjust as you go
• Suitable for repeatable tasks




Implementation
• Similar to designing a UX study
• Build a mock up and test it with your team
       – Yes, you need to judge some tasks
• Incorporate feedback and run a test on MTurk
  with a very small data set
       – Time the experiment
       – Do people understand the task?
• Analyze results
       – Look for spammers
       – Check completion times
• Iterate and modify accordingly
Implementation – II
• Introduce quality control
       – Qualification test
       – Gold answers (honey pots)
•    Adjust passing grade and worker approval rate
•    Run experiment with new settings & same data
•    Scale on data
•    Scale on workers

Experiment in Production
•    Lots of tasks on MTurk at any moment
•    Need to grab attention
•    Importance of experiment metadata
•    When to schedule
       – Split a large task into batches and have 1 single
         batch in the system
       – Always review feedback from batch n before
         uploading n+1

Quality Control
• Extremely important part of the experiment
• Approach as “overall” quality; not just for workers
• Bi-directional channel
   – You may think the worker is doing a bad job.
   – The same worker may think you are a lousy requester.




Quality Control - II
• Approval rate: easy to use, & just as easily defeated
   – P. Ipeirotis. Be a Top Mechanical Turk Worker: You Need
     $5 and 5 Minutes. Oct. 2010
• Mechanical Turk Masters (June 23, 2011)
   – Very recent addition, amount of benefit uncertain
• Qualification test
   – Pre-screen workers’ ability to do the task (accurately)
   – Example and pros/cons in next slides
• Assess worker quality as you go
   – Trap questions with known answers (“honey pots”)
   – Measure inter-annotator agreement between workers
• No guarantees
A qualification test snippet
<Question>
  <QuestionIdentifier>question1</QuestionIdentifier>
  <QuestionContent>
     <Text>Carbon monoxide poisoning is</Text>
  </QuestionContent>
  <AnswerSpecification>
     <SelectionAnswer>
         <StyleSuggestion>radiobutton</StyleSuggestion>
             <Selections>
              <Selection>
                <SelectionIdentifier>1</SelectionIdentifier>
                <Text>A chemical technique</Text>
              </Selection>
              <Selection>
                <SelectionIdentifier>2</SelectionIdentifier>
                <Text>A green energy treatment</Text>
              </Selection>
              <Selection>
                 <SelectionIdentifier>3</SelectionIdentifier>
                 <Text>A phenomena associated with sports</Text>
              </Selection>
              <Selection>
                 <SelectionIdentifier>4</SelectionIdentifier>
                 <Text>None of the above</Text>
              </Selection>
             </Selections>
     </SelectionAnswer>
  </AnswerSpecification>
</Question>
Qualification tests: pros and cons
• Advantages
       – Great tool for controlling quality
       – Adjust passing grade
• Disadvantages
       –    Extra cost to design and implement the test
       –    May turn off workers, hurt completion time
       –    Needs refreshing on a regular basis
       –    Hard to verify subjective tasks like judging relevance
• Try creating task-related questions to get worker
  familiar with task before starting task in earnest
Methods for measuring agreement
• What to look for
       – Agreement, reliability, validity
• Inter-agreement level
       – Agreement between judges
       – Agreement between judges and the gold set
• Some statistics
       –    Percentage agreement
       –    Cohen’s kappa (2 raters)
       –    Fleiss’ kappa (any number of raters)
       –    Krippendorff’s alpha
• With majority vote, what if 2 say relevant, 3 say not?
       – Use expert to break ties (Kochhar et al, HCOMP’10; GQR)
       – Collect more judgments as needed to reduce uncertainty
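
For the first two statistics above, here is a minimal sketch computing percentage agreement and Cohen's kappa for two judges; the binary R/NR labels are invented for illustration.

# Percentage agreement and Cohen's kappa for two judges over the same items.
# The R / NR labels below are invented for illustration.
from collections import Counter

judge_a = ['R', 'R', 'NR', 'R', 'NR', 'R', 'NR', 'NR', 'R', 'R']
judge_b = ['R', 'NR', 'NR', 'R', 'NR', 'R', 'R', 'NR', 'R', 'R']

n = float(len(judge_a))
observed = sum(a == b for a, b in zip(judge_a, judge_b)) / n

# Chance agreement: both judges independently pick the same label.
ca, cb = Counter(judge_a), Counter(judge_b)
expected = sum((ca[l] / n) * (cb[l] / n) for l in set(judge_a) | set(judge_b))

kappa = (observed - expected) / (1 - expected)
print("Percentage agreement: %.2f   Cohen's kappa: %.2f" % (observed, kappa))

Fleiss' kappa and Krippendorff's alpha follow the same observed-vs-expected pattern but generalize to more raters and to missing labels.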
Inter-rater reliability
• Lots of research
• Statistics books cover most of the material
• Three categories based on the goals
       – Consensus estimates
       – Consistency estimates
       – Measurement estimates




Quality control on relevance assessments

•    INEX 2008 Book track
•    Home grown system (no MTurk)
•    Propose a game for collecting assessments
•    CRA Method



    G. Kazai, N. Milic-Frayling, and J. Costello. “Towards Methods for the Collective Gathering and Quality Control of Relevance
    Assessments”, SIGIR 2009.



Quality Control & Assurance
• Filtering
     –   Approval rate (built-in but defeatable)
     –   Geographic restrictions (e.g. US only, built-in)
     –   Worker blocking
     –   Qualification test
            • Con: slows down experiment, difficult to “test” relevance
            • Solution: create questions to let user get familiar before the assessment
     – Does not guarantee success
• Assessing quality
      – Interject verifiable/gold answers (trap questions, honey pots; sketch after this list)
             •   P. Ipeirotis. Worker Evaluation in Crowdsourcing: Gold Data or Multiple Workers? Sept. 2010.

     – 2-tier approach: Group 1 does task, Group 2 verifies
            • Quinn and B. Bederson’09, Bernstein et al.’10
• Identify workers that always disagree with the majority
     – Risk: masking cases of ambiguity or diversity, “tail” behaviors
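
Here is the gold/trap-question idea from above as a minimal sketch: compute each worker's accuracy on the items with known answers and flag anyone below a cutoff. Worker IDs, answers, and the 0.7 threshold are illustrative.

# Flag workers whose accuracy on gold (trap) questions falls below a threshold.
# Worker IDs, answers, and the 0.7 cutoff are illustrative placeholders.
gold = {'q1': 'R', 'q2': 'NR', 'q3': 'R'}            # items with known answers

answers = {                                          # worker -> {item: label}
    'W1': {'q1': 'R', 'q2': 'NR', 'q3': 'R'},
    'W2': {'q1': 'NR', 'q2': 'R', 'q3': 'NR'},       # disagrees with all gold
    'W3': {'q1': 'R', 'q2': 'NR'},                   # has seen only two gold items
}

THRESHOLD = 0.7
for worker, labels in answers.items():
    seen = [q for q in gold if q in labels]
    if not seen:
        continue                                     # no gold seen yet; keep watching
    accuracy = sum(labels[q] == gold[q] for q in seen) / float(len(seen))
    print(worker, round(accuracy, 2), 'OK' if accuracy >= THRESHOLD else 'REVIEW')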
More on quality control & assurance
• HR issues: recruiting, selection, & retention
       – e.g., post/tweet, design a better qualification test,
         bonuses, …
• Collect more redundant judgments…
       – at some point defeats cost savings of crowdsourcing
       – 5 workers is often sufficient
• Use better aggregation method
       – Voting
       – Consensus
       – Averaging
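
For the voting option above, a minimal sketch: aggregate redundant binary labels by majority and, echoing the 2-vs-3 case from the agreement slide, send near-ties back for more judgments. Item IDs, labels, and the margin are made up.

# Majority-vote aggregation over redundant labels; near-ties are sent back
# for more judgments instead of being decided arbitrarily. Data is made up.
from collections import Counter

labels = {
    'doc1': ['R', 'R', 'R', 'NR', 'R'],
    'doc2': ['R', 'NR', 'NR', 'R', 'NR'],   # 3-2 split: too close
    'doc3': ['R', 'NR'],                    # only 2 judgments so far
}

MIN_MARGIN = 2   # winner must lead by at least this many votes

for item, votes in labels.items():
    counts = Counter(votes).most_common()
    if len(counts) == 1 or counts[0][1] - counts[1][1] >= MIN_MARGIN:
        print(item, '->', counts[0][0])
    else:
        print(item, '-> collect more judgments (or route to an expert)')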
Data quality
   • Data quality via repeated labeling
   • Repeated labeling can improve label quality
     and model quality
   • When labels are noisy, repeated labeling can be
     preferable to a single labeling
   • Cost issues with labeling

V. Sheng, F. Provost, P. Ipeirotis. “Get Another Label? Improving Data Quality and Data Mining Using Multiple, Noisy Labelers” KDD 2008.
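
A quick way to see why repeated labeling can help: if each labeler is independently correct with probability p, the chance that the majority of n labels is correct is a binomial tail sum. The sketch below is just that idealized calculation; it ignores correlated errors and per-worker differences.

# Probability that the majority of n independent labelers, each correct with
# probability p, gives the right label. Idealized: assumes independence.
from math import comb   # Python 3.8+

def majority_correct(p, n):
    need = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(need, n + 1))

for n in (1, 3, 5, 11):
    print(n, round(majority_correct(0.7, n), 3))
# With p = 0.7: 1 label -> 0.7, 3 -> 0.784, 5 -> 0.837, 11 -> 0.922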




Scales and labels
• Binary
     – Yes, No
• 5-point Likert
     – Strongly disagree, disagree, neutral, agree, strongly agree
• Graded relevance:
      –   DCG: Irrelevant, marginally, fairly, highly (Jarvelin, 2000) – see sketch after this list
     –   TREC: Highly relevant, relevant, (related), not relevant
     –   Yahoo/MS: Perfect, excellent, good, fair, bad (PEGFB)
     –   The Google Quality Raters Handbook (March 2008)
     –   0 to 10 (0 = totally irrelevant, 10 = most relevant)
• Usability factors
     – Provide clear, concise labels that use plain language
     – Avoid unfamiliar jargon and terminology
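
To connect the graded scales above to evaluation, here is a minimal sketch mapping graded labels to gains and computing DCG for one ranked list; the label-to-gain mapping, ranking, and cutoff are illustrative, and this uses one common DCG formulation (gain divided by log2(rank + 1)).

# Map graded relevance labels to gains and compute DCG for one ranked list.
# The mapping, ranking, and cutoff are illustrative.
from math import log

GAIN = {'irrelevant': 0, 'marginally': 1, 'fairly': 2, 'highly': 3}

def dcg(ranked_labels, k=10):
    return sum(GAIN[label] / log(i + 2, 2)          # rank i is 0-based
               for i, label in enumerate(ranked_labels[:k]))

ranking = ['highly', 'fairly', 'irrelevant', 'marginally', 'highly']
print('DCG@5 = %.3f' % dcg(ranking, k=5))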
Was the task difficult?
• Ask workers to rate the difficulty of a topic
• 50 topics, TREC; 5 workers, $0.01 per task




Other quality heuristics
• Justification/feedback as quasi-captcha
       – Successfully used in TREC and INEX experiments
       – Should be optional
       – Automatically verifying feedback was written by a
         person may be difficult (classic spam detection task)
• Broken URL/incorrect object
       – Leave an outlier in the data set
       – Workers will tell you
       – If somebody answers “excellent” on a graded
         relevance test for a broken URL => probably spammer

MTurk QA: Tools and Packages
• QA infrastructure layers atop MTurk promote
  useful separation-of-concerns from task
       – TurkIt
                • Quik Turkit provides nearly realtime services
       –    Turkit-online (??)
       –    Get Another Label (& qmturk)
       –    Turk Surveyor
       –    cv-web-annotation-toolkit (image labeling)
       –    Soylent
       –    Boto (python library)
                • Turkpipe: submit batches of jobs using the command line.
• More needed…
Dealing with bad workers
• Pay for “bad” work instead of rejecting it?
   – Pro: preserve reputation, admit if poor design at fault
   – Con: promote fraud, undermine approval rating system
• Use bonus as incentive
   – Pay the minimum $0.01 and $0.01 for bonus
   – Better than rejecting a $0.02 task
• If spammer “caught”, block from future tasks
   – May be easier to always pay, then block as needed
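
The "pay rather than reject, bonus, block" policy above maps onto three API calls. A minimal sketch assuming boto 2.x follows; IDs, amounts, and messages are placeholders, and the three calls are independent options rather than one fixed sequence.

# Pay, bonus, and block via boto's MTurk bindings. IDs, amounts, and messages
# are placeholders; treat the three calls as independent options.
from boto.mturk.connection import MTurkConnection
from boto.mturk.price import Price

conn = MTurkConnection(aws_access_key_id='YOUR_KEY',
                       aws_secret_access_key='YOUR_SECRET',
                       host='mechanicalturk.sandbox.amazonaws.com')

assignment_id, worker_id = 'ASSIGNMENT_ID', 'WORKER_ID'

# Approve borderline work to protect the worker's approval rating.
conn.approve_assignment(assignment_id, feedback='Thanks for your work.')

# Reward extra effort (e.g. the optional feedback field) with a small bonus.
conn.grant_bonus(worker_id, assignment_id, Price(0.01),
                 reason='Bonus for the optional written feedback.')

# Keep clear spammers out of future batches instead of rejecting en masse.
conn.block_worker(worker_id, reason='Repeatedly failed gold questions.')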

Worker feedback
• Real feedback received via email after rejection
• Worker XXX
     I did. If you read these articles most of them have
     nothing to do with space programs. I’m not an idiot.

• Worker XXX
     As far as I remember there wasn't an explanation about
     what to do when there is no name in the text. I believe I
     did write a few comments on that, too. So I think you're
     being unfair rejecting my HITs.




Real email exchange with worker after rejection
WORKER: this is not fair , you made me work for 10 cents and i lost my 30 minutes
of time ,power and lot more and gave me 2 rejections atleast you may keep it
pending. please show some respect to turkers

REQUESTER: I'm sorry about the rejection. However, in the directions given in the
hit, we have the following instructions: IN ORDER TO GET PAID, you must judge all 5
webpages below *AND* complete a minimum of three HITs.

Unfortunately, because you only completed two hits, we had to reject those hits.
We do this because we need a certain amount of data on which to make decisions
about judgment quality. I'm sorry if this caused any distress. Feel free to contact me
if you have any additional questions or concerns.

WORKER: I understood the problems. At that time my kid was crying and i went to
look after. that's why i responded like that. I was very much worried about a hit
being rejected. The real fact is that i haven't seen that instructions of 5 web page
and started doing as i do the dolores labs hit, then someone called me and i went
to attend that call. sorry for that and thanks for your kind concern.
Exchange with worker
•    Worker XXX
     Thank you. I will post positive feedback for you at
     Turker Nation.

Me: was this a sarcastic comment?

•    I took a chance by accepting some of your HITs to see if
     you were a trustworthy author. My experience with you
     has been favorable so I will put in a good word for you
     on that website. This will help you get higher quality
     applicants in the future, which will provide higher
     quality work, which might be worth more to you, which
     hopefully means higher HIT amounts in the future.




Build Your Reputation as a Requester
• Word of mouth effect
       – Workers trust the requester (pay on time, clear
         explanation if there is a rejection)
       – Experiments tend to go faster
       – Announce forthcoming tasks (e.g. tweet)
• Disclose your real identity?



Other practical tips
• Sign up as worker and do some HITs
• “Eat your own dog food”
• Monitor discussion forums
• Address feedback (e.g., poor guidelines,
  payments, passing grade, etc.)
• Everything counts!
       – Overall design only as strong as weakest link


Content quality
• People like to work on things that they like
• TREC ad-hoc vs. INEX
       – TREC experiments took twice as long to complete
       – INEX (Wikipedia), TREC (LA Times, FBIS)
• Topics
       – INEX: Olympic games, movies, salad recipes, etc.
       – TREC: cosmic events, Schengen agreement, etc.
• Content and judgments according to modern times
       – Airport security docs are pre 9/11
       – Antarctic exploration (global warming )

Content quality - II
• Document length
• Randomize content
• Avoid worker fatigue
       – Judging 100 documents on the same subject can
         be tiring, leading to decreasing quality




Presentation
• People scan documents for relevance cues
• Document design
• Highlighting no more than 10%




Presentation - II




Relevance justification
• Why settle for a label?
• Let workers justify answers
• INEX
       – 22% of assignments with comments
• Must be optional
• Let’s see how people justify



“Relevant” answers
 [Salad Recipes]
 Doesn't mention the word 'salad', but the recipe is one that could be considered a
    salad, or a salad topping, or a sandwich spread.
 Egg salad recipe
 Egg salad recipe is discussed.
 History of salad cream is discussed.
 Includes salad recipe
 It has information about salad recipes.
 Potato Salad
 Potato salad recipes are listed.
 Recipe for a salad dressing.
 Salad Recipes are discussed.
 Salad cream is discussed.
 Salad info and recipe
 The article contains a salad recipe.
 The article discusses methods of making potato salad.
 The recipe is for a dressing for a salad, so the information is somewhat narrow for
    the topic but is still potentially relevant for a researcher.
 This article describes a specific salad. Although it does not list a specific recipe,
    it does contain information relevant to the search topic.
 gives a recipe for tuna salad
 relevant for tuna salad recipes
 relevant to salad recipes
 this is on-topic for salad recipes




“Not relevant” answers
[Salad Recipes]
About gaming not salad recipes.
Article is about Norway.
Article is about Region Codes.
Article is about forests.
Article is about geography.
Document is about forest and trees.
Has nothing to do with salad or recipes.
Not a salad recipe
Not about recipes
Not about salad recipes
There is no recipe, just a comment on how salads fit into meal formats.
There is nothing mentioned about salads.
While dressings should be mentioned with salads, this is an article on one specific
    type of dressing, no recipe for salads.
article about a swiss tv show
completely off-topic for salad recipes
not a salad recipe
not about salad recipes
totally off base



Feedback length

• Workers will justify answers
• Has to be optional for good
  feedback
• In E51, mandatory comments
  – Length dropped
  – “Relevant” or “Not Relevant”



Other design principles
• Text alignment
• Legibility
• Reading level: complexity of words and sentences
• Attractiveness (worker’s attention & enjoyment)
• Multi-cultural / multi-lingual
• Who is the audience (e.g. target worker community)
       – Special needs communities (e.g. simple color blindness)
• Parsimony
• Cognitive load: mental rigor needed to perform task
• Exposure effect
Platform alternatives
• Why MTurk
       – Amazon brand, lots of research papers
       – Speed, price, diversity, payments
• Why not
       – Crowdsourcing != MTurk
       – Spam, no analytics, must build tools for worker & task quality
• How to build your own crowdsourcing platform
       –    Back-end
       –    Template language for creating experiments
       –    Scheduler
       –    Payments?


The human side
• As a worker
     –   I hate when instructions are not clear
     –   I’m not a spammer – I just don’t get what you want
     –   Boring task
     –   A good pay is ideal but not the only condition for engagement
• As a requester
     – Attrition
     – Balancing act: a task that would produce the right results and
       is appealing to workers
     – I want your honest answer for the task
     – I want qualified workers; system should do some of that for me
• Managing crowds and tasks is a daily activity
     – more difficult than managing computers
Things that work
•    Qualification tests
•    Honey-pots
•    Good content and good presentation
•    Economy of attention
•    Things to improve
       – Manage workers at different levels of expertise,
         including spammers and borderline cases.
       – Mix different pools of workers based on different
         profile and expertise levels.

Things that need work
• UX and guidelines
       – Help the worker
       – Cost of interaction
•    Scheduling and refresh rate
•    Exposure effect
•    Sometimes we just don’t agree
•    How crowdsourcable is your task

V
FROM LABELING TO HUMAN COMPUTATION
The Turing Test (Alan Turing, 1950)




July 24, 2011   Crowdsourcing for Information Retrieval: Principles, Methods, and Applications   121
The Turing Test (Alan Turing, 1950)




What is a Computer?




• What was old becomes new
• “Crowdsourcing: A New
  Branch of Computer Science”
  (March 29, 2011)




                         Princeton University Press, 2005

Davis et al. (2010) The HPU.




                                                              HPU




Human Computation
Rebirth of people as ‘computists’; people do tasks computers cannot (do well)
Stage 1: Detecting robots
       – CAPTCHA: Completely Automated Public Turing test to tell Computers and Humans Apart
       – No useful work produced; people just answer questions with known answers

Stage 2: Labeling data (at scale)
       – E.g. ESP game, typical use of MTurk
       – Game changer for AI: starving for data

Stage 3: General “human computation” (HPU)
       – people do arbitrarily sophisticated tasks (i.e. compute arbitrary functions)
       – HPU as core component in system architecture, many “HPC” invocations
       – blend HPU with automation for a new class of hybrid applications
       – New tradeoffs possible in latency/cost vs. functionality/accuracy
L. von Ahn has pioneered the field. See bibliography for examples of his work.
Mobile Phone App: “Amazon Remembers”




ReCaptcha




L. von Ahn et al. (2008). In Science.
Harnesses human work as an invisible by-product.
CrowdSearch and mCrowd
• T. Yan, MobiSys 2010




Soylent: A Word Processor with a Crowd Inside

 • Bernstein et al., UIST 2010




Translation by monolingual speakers
• C. Hu, CHI 2009




fold.it
• S. Cooper et al. (2010).




VI
                                                        WORKER INCENTIVES
Worker Incentives
•    Pay ($$$)
•    Fun (or avoid boredom)
•    Socialize
•    Earn acclaim/prestige
•    Altruism
•    Learn something new (e.g. English)
•    Unintended by-product (e.g. re-Captcha)
•    Create self-serving resource (e.g. Wikipedia)

Multiple incentives are typically at work in parallel
Pay ($$$)


                                                             P. Ipeirotis March 2010




Pro
• Ready marketplaces (e.g. MTurk, CrowdFlower, …)
• Less need for creativity
• Simple motivation knob

Con: quality concerns; quality control required
• Can diminish intrinsic rewards that promote quality:
    – Fun/altruistic value of task
    – Taking pride in doing quality work
    – Self-assessment
• Can attract workers only interested in the pay, fraud
• $$$ (though other schemes cost indirectly)

How much to pay?
• Mason & Watts 2009: more $ = more work, not better work
• Wang et al. 2011: predict from market?
• More later…

Zittrain 2010: if Encarta had paid for contributions, would we have Wikipedia?
Fun (or avoid boredom)
• Games with a Purpose (von Ahn)
       – Data is by-product
       – IR: Law et al. SearchWar. HCOMP 2009.



• distinct from Serious Gaming / Edutainment
       – Player learning / training / education is by-product



•    Learning to map from web pages to queries
   •    Human computation game to elicit data
   •    Home grown system (no AMT)
   •    Try it!
             pagehunt.msrlivelabs.com


See also:
• H. Ma. et al. “Improving Search Engines Using Human Computation Games”, CIKM 2009.
• Law et al. SearchWar. HCOMP 2009.
• Bennett et al. Picture This. HCOMP 2009.
Fun (or avoid boredom)
• Pro:
   – Enjoyable “work” people want to do (or at least better
     than anything else they have to do)
   – Scalability potential from involving non-workers
• Con:
   – Need for design creativity
           • some would say this is a plus
           • better performance in game should produce better/more work
   – Some tasks more amenable than others
           • Annotating syntactic parse trees for fun?
           • Inferring syntax implicitly from a different activity?
Socialization & Prestige




Socialization & Prestige
• Pro:
       – “free”
       – enjoyable for connecting with one another
       – can share infrastructure across tasks
• Con:
       – need infrastructure beyond simple micro-task
       – need critical mass (for uptake and reward)
       – social engineering knob more complex than $$

Altruism
•     Contribute knowledge
•     Help others (who need knowledge)
•     Help workers (e.g. SamaSource)
•     Charity (e.g. http://www.freerice.com)




Altruism
• Pro
       – “free”
       – can motivate quality work for a cause
• Con
       – Seemingly small workforce for pure altruism

What if Mechanical Turk let you donate $$ per HIT?


Unintended by-product
• Pro
       – effortless (unnoticed) work
       – Scalability from involving non-workers
• Con
       – Design challenge
                • Given existing activity, find useful work to harness from it
                • Given target work, find or create another activity for
                  which target work is by-product?
       – Maybe too invisible (disclosure, manipulation)
Multiple Incentives
• Ideally maximize all
• Wikipedia, cQA, Gwap
       – fun, socialization, prestige, altruism
• Fun vs. Pay
       – gwap gives Amazon certificates
       – Workers maybe paid in game currency
       – Pay tasks can also be fun themselves
• Pay-based
       – Other rewards: e.g. learn something, socialization
       – altruism: worker (e.g. SamaSource) or task itself
       – social network integration could help everyone
         (currently separate and lacking structure)
VII
                                                                       THE ROAD AHEAD
Wisdom of Crowds (WoC)
Requires
• Diversity
• Independence
• Decentralization
• Aggregation

Input: large, diverse sample
     (to increase likelihood of overall pool quality)
Output: consensus or selection (aggregation)
WoC vs. Ensemble Learning
• Combine multiple models to improve performance
  over any constituent model
   – Can use many weak learners to make a strong one
   – Compensate for poor models with extra computation
• Works better with diverse, independent learners
• cf. NIPS’10 Workshop
   – Computational Social Science & the Wisdom of Crowds
• More investigation needed of traditional feature-
  based machine learning & ensemble methods for
  consensus labeling with crowdsourcing
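
One simple step in that ensemble direction is to weight each worker's vote by an estimated accuracy (for example from gold questions) instead of counting votes equally; the sketch below is illustrative, with made-up weights and votes.

# Weighted voting: each worker's vote counts in proportion to an accuracy
# estimate (e.g. from gold questions). Weights and votes are made up.
from collections import defaultdict

worker_accuracy = {'W1': 0.95, 'W2': 0.55, 'W3': 0.80}
votes = {'W1': 'R', 'W2': 'NR', 'W3': 'R'}           # one item, three workers

scores = defaultdict(float)
for worker, label in votes.items():
    scores[label] += worker_accuracy[worker]

label, score = max(scores.items(), key=lambda kv: kv[1])
print('Consensus label:', label, '(weighted score %.2f)' % score)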
Unreasonable Effectiveness of Data
• Massive free Web data
  changed how we train
  learning systems
  – Banko and Brill (2001).
    Human Language Tech.
  – Halevy et al. (2009). IEEE
    Intelligent Systems.

 • How might access to cheap & plentiful labeled
   data change the balance again?
MapReduce with human computation
• Commonalities
       – Large task divided into smaller sub-problems
       – Work distributed among worker nodes (workers)
       – Collect all answers and combine them
       – Varying performance of heterogeneous
         CPUs/HPUs
• Variations
       – Human response latency / size of “cluster”
       – Some tasks are not suitable
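
To make the analogy concrete, here is a toy sketch in which "map" splits the workload into HIT-sized chunks, a stand-in function plays the role of the human worker, and "reduce" merges the answers; everything in it is simulated and illustrative.

# Toy MapReduce-style pipeline with a simulated human worker (HPU).
def map_phase(documents, chunk_size=5):
    """Split the workload into HIT-sized chunks (one chunk = one HIT)."""
    return [documents[i:i + chunk_size]
            for i in range(0, len(documents), chunk_size)]

def hpu(chunk):
    """Stand-in for a crowd worker: label each document in the chunk."""
    return [(doc, 'R' if 'salad' in doc else 'NR') for doc in chunk]

def reduce_phase(partial_results):
    """Combine the per-HIT answers into one result set."""
    return [pair for chunk in partial_results for pair in chunk]

docs = ['salad recipe blog', 'norway travel guide', 'potato salad howto',
        'region codes faq', 'tuna salad recipe', 'swiss tv show']
print(reduce_phase(hpu(chunk) for chunk in map_phase(docs, chunk_size=2)))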

CrowdForge: MapReduce for
      Automation + Human Computation




        Kittur et al., CHI 2011


Research problems – operational
• Methodology
       – Budget, people, document, queries, presentation,
         incentives, etc.
       – Scheduling
       – Quality
• What’s the best “mix” of HC for a task?
• What are the tasks suitable for HC?
• Can I crowdsource my task?
       – Eickhoff and de Vries, WSDM 2011 CSDM Workshop

More problems
• Human factors vs. outcomes
• Editors vs. workers
• Pricing tasks
• Predicting worker quality from observable
  properties (e.g. task completion time)
• HIT / Requester ranking or recommendation
• Expert search : who are the right workers given
  task nature and constraints
• Ensemble methods for Crowd Wisdom consensus
Problems: crowds, clouds and algorithms
• Infrastructure
     – Current platforms are very rudimentary
     – No tools for data analysis
• Dealing with uncertainty (propagate rather than mask)
     –    Temporal and labeling uncertainty
     –    Learning algorithms
     –    Search evaluation
     –    Active learning (which example is likely to be labeled correctly)
• Combining CPU + HPU
     – Human Remote Call?
     – Procedural vs. declarative?
     – Integration points with enterprise systems
 July 24, 2011    Crowdsourcing for Information Retrieval: Principles, Methods, and Applications   155
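One way to read "Combining CPU + HPU" (a hypothetical sketch; `classify` and `ask_crowd` are placeholders, not real APIs) is a confidence-based fallback: the machine answers when it is sure, and a "human remote call" handles the rest:

```python
def classify(item):
    """Placeholder automatic classifier returning (label, confidence)."""
    return ("relevant", 0.62) if "query term" in item else ("not relevant", 0.97)

def ask_crowd(item):
    """Placeholder 'human remote call': post the item as a HIT and return the consensus label."""
    return "relevant"  # stubbed; a real system would block, poll, or use a callback here

def hybrid_label(item, threshold=0.90):
    """Answer with the CPU when it is confident; otherwise pay for an HPU judgment."""
    label, confidence = classify(item)
    return label if confidence >= threshold else ask_crowd(item)

print(hybrid_label("a document mentioning the query term"))  # low confidence -> routed to the crowd
print(hybrid_label("an obviously off-topic document"))       # high confidence -> answered by the CPU
```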
Conclusions
•    Crowdsourcing for relevance evaluation works
•    Fast turnaround, easy to experiment, cheap
•    Still have to design the experiments carefully!
•    Usability considerations matter
•    Worker quality must be monitored
•    User feedback is extremely useful



July 24, 2011   Crowdsourcing for Information Retrieval: Principles, Methods, and Applications   156
Conclusions - II
•    Crowdsourcing is here to stay
•    Lots of opportunities to improve current platforms
•    Integration with current systems
•    MTurk is a popular platform and others are emerging
•    Open research problems




July 24, 2011   Crowdsourcing for Information Retrieval: Principles, Methods, and Applications   157
VIII
                                        RESOURCES AND REFERENCES
July 24, 2011   Crowdsourcing for Information Retrieval: Principles, Methods, and Applications   158
Little written for the general public
• July 2010, kindle-only
• “This book introduces you
  to the top crowdsourcing
  sites and outlines step by
  step with photos the exact
  process to get started as a
  requester on Amazon
  Mechanical Turk.”

July 24, 2011   Crowdsourcing for Information Retrieval: Principles, Methods, and Applications   159
Crowdsourcing @ SIGIR’11
   Workshop on Crowdsourcing for Information Retrieval


   Roi Blanco, Harry Halpin, Daniel Herzig, Peter Mika, Jeffrey Pound, Henry Thompson, Thanh D. Tran. “Repeatable and
    Reliable Search System Evaluation using Crowd-Sourcing”.
   Yen-Ta Huang, An-Jung Cheng, Liang-Chi Hsieh, Winston H. Hsu, Kuo-Wei Chang. “Region-Based Landmark Discovery by
    Crowdsourcing Geo-Referenced Photos.” Poster.
   Gabriella Kazai, Jaap Kamps, Marijn Koolen, Natasa Milic-Frayling. “Crowdsourcing for Book Search Evaluation: Impact
    of Quality on Comparative System Ranking.”
   Abhimanu Kumar, Matthew Lease. “Learning to Rank From a Noisy Crowd”. Poster.
   Edith Law, Paul N. Bennett, and Eric Horvitz. “The Effects of Choice in Routing Relevance Judgments”. Poster.




     July 24, 2011        Crowdsourcing for Information Retrieval: Principles, Methods, and Applications            160
2011 Workshops & Conferences
• SIGIR-CIR: Workshop on Crowdsourcing for Information Retrieval (July 28)

• WSDM-CSDM: Crowdsourcing for Search and Data Mining (Feb. 9)
• CHI-CHC: Crowdsourcing and Human Computation (May 8)
•   Crowdsourcing: Improving … Scientific Data Through Social Networking (June 13)
•   Crowdsourcing Technologies for Language and Cognition Studies (July 27)
•   2011 AAAI-HCOMP: 3rd Human Computation Workshop (Aug. 8)
•   UbiComp: 2nd Workshop on Ubiquitous Crowdsourcing (Sep. 18)
•   CIKM: BooksOnline (Oct. 24, “crowdsourcing … online books”)
•   CrowdConf 2011 -- 2nd Conf. on the Future of Distributed Work (Nov. 1-2)
•   TREC-Crowd: Crowdsourcing Track at TREC (Nov. 16-18)
•   ACIS: Crowdsourcing, Value Co-Creation, & Digital Economy Innovation (Nov. 30 – Dec. 2)

    July 24, 2011   Crowdsourcing for Information Retrieval: Principles, Methods, and Applications   161
2011 Tutorials
• SIGIR (yep, this is it!)
• WSDM: Crowdsourcing 101: Putting the WSDM of Crowds to Work for You
    – Omar Alonso and Matthew Lease (Feb. 9)
• WWW: Managing Crowdsourced Human Computation
    – Panos Ipeirotis and Praveen Paritosh (March 29)
• HCIC: Quality Crowdsourcing for Human Computer Interaction Research
    – Ed Chi (June 14-18)
    – Also see Chi’s Crowdsourcing for HCI Research with Amazon Mechanical Turk
• AAAI: Human Computation: Core Research Questions and State of the Art
    – Edith Law and Luis von Ahn (Aug. 7)
• VLDB: Crowdsourcing Applications and Platforms
    – AnHai Doan, Michael Franklin, Donald Kossmann, and Tim Kraska (Aug. 29)
• CrowdConf: Crowdsourcing for Fun and Profit
    – Omar Alonso and Matthew Lease (Nov. 1)
   July 24, 2011   Crowdsourcing for Information Retrieval: Principles, Methods, and Applications   162
Other Events & Resources
                  ir.ischool.utexas.edu/crowd

2011 book: Omar Alonso, Gabriella Kazai, and
Stefano Mizzaro. Crowdsourcing for Search Engine
Evaluation: Why and How.

Forthcoming special issue of Springer’s Information
Retrieval journal on Crowdsourcing


  July 24, 2011   Crowdsourcing for Information Retrieval: Principles, Methods, and Applications   163
Thank You!

For questions about tutorial or crowdsourcing, email:
  omar.alonso@microsoft.com
  ml@ischool.utexas.edu



Cartoons by Mateo Burtch (buta@sonic.net)




July 24, 2011   Crowdsourcing for Information Retrieval: Principles, Methods, and Applications   164
Crowdsourcing in IR: 2008-2010
   2008
          O. Alonso, D. Rose, and B. Stewart. “Crowdsourcing for relevance evaluation”, SIGIR Forum, Vol. 42, No. 2.

   2009
          O. Alonso and S. Mizzaro. “Can we get rid of TREC Assessors? Using Mechanical Turk for … Assessment”. SIGIR Workshop on the Future of IR Evaluation.
          P.N. Bennett, D.M. Chickering, A. Mityagin. Learning Consensus Opinion: Mining Data from a Labeling Game. WWW.
          G. Kazai, N. Milic-Frayling, and J. Costello. “Towards Methods for the Collective Gathering and Quality Control of Relevance Assessments”, SIGIR.
          G. Kazai and N. Milic-Frayling. “… Quality of Relevance Assessments Collected through Crowdsourcing”. SIGIR Workshop on the Future of IR Evaluation.
          Law et al. “SearchWar”. HCOMP.
          H. Ma, R. Chandrasekar, C. Quirk, and A. Gupta. “Improving Search Engines Using Human Computation Games”, CIKM 2009.

   2010
          SIGIR Workshop on Crowdsourcing for Search Evaluation.
          O. Alonso, R. Schenkel, and M. Theobald. “Crowdsourcing Assessments for XML Ranked Retrieval”, ECIR.
          K. Berberich, S. Bedathur, O. Alonso, G. Weikum. “A Language Modeling Approach for Temporal Information Needs”, ECIR.
          C. Grady and M. Lease. “Crowdsourcing Document Relevance Assessment with Mechanical Turk”. NAACL HLT Workshop on … Amazon's Mechanical Turk.
          Grace Hui Yang, Anton Mityagin, Krysta M. Svore, and Sergey Markov. “Collecting High Quality Overlapping Labels at Low Cost”. SIGIR.
          G. Kazai. “An Exploration of the Influence that Task Parameters Have on the Performance of Crowds”. CrowdConf.
          G. Kazai. “… Crowdsourcing in Building an Evaluation Platform for Searching Collections of Digitized Books”. Workshop on Very Large Digital Libraries (VLDL).
          Stephanie Nowak and Stefan Ruger. How Reliable are Annotations via Crowdsourcing? MIR.
          Jean-François Paiement, James G. Shanahan, and Remi Zajac. “Crowdsourcing Local Search Relevance”. CrowdConf.
          Maria Stone and Omar Alonso. “A Comparison of On-Demand Workforce with Trained Judges for Web Search Relevance Evaluation”. CrowdConf.
          T. Yan, V. Kumar, and D. Ganesan. CrowdSearch: exploiting crowds for accurate real-time image search on mobile phones. MobiSys pp. 77--90, 2010.




     July 24, 2011                Crowdsourcing for Information Retrieval: Principles, Methods, and Applications                                               165
Crowdsourcing in IR: 2011
   WSDM Workshop on Crowdsourcing for Search and Data Mining.
   SIGIR Workshop on Crowdsourcing for Information Retrieval.
   SIGIR papers/posters mentioned earlier


    O. Alonso and R. Baeza-Yates. “Design and Implementation of Relevance Assessments using Crowdsourcing”, ECIR.
   G. Kasneci, J. Van Gael, D. Stern, and T. Graepel, CoBayes: Bayesian Knowledge Corroboration with Assessors of
    Unknown Areas of Expertise, WSDM.
    Hyun Joon Jung, Matthew Lease. “Improving Consensus Accuracy via Z-score and Weighted Voting”. HCOMP. Poster.

    Gabriella Kazai. “In Search of Quality in Crowdsourcing for Search Engine Evaluation”, ECIR.




     July 24, 2011         Crowdsourcing for Information Retrieval: Principles, Methods, and Applications       166
Bibliography: General IR
   M. Hearst. “Search User Interfaces”, Cambridge University Press, 2009
   K. Jarvelin, and J. Kekalainen. IR evaluation methods for retrieving highly relevant documents. Proceedings of the 23rd annual
    international ACM SIGIR conference . pp.41—48, 2000.
   M. Kaisser, M. Hearst, and L. Lowe. “Improving Search Results Quality by Customizing Summary Lengths”, ACL/HLT, 2008.
   D. Kelly. “Methods for evaluating interactive information retrieval systems with users”. Foundations and Trends in Information
    Retrieval, 3(1-2), 1-224, 2009.
   S. Mizzaro. Measuring the agreement among relevance judges, MIRA 1999
   J. Tang and M. Sanderson. “Evaluation and User Preference Study on Spatial Diversity”, ECIR 2010




     July 24, 2011          Crowdsourcing for Information Retrieval: Principles, Methods, and Applications               167
Bibliography: Other
   J. Barr and L. Cabrera. “AI gets a Brain”, ACM Queue, May 2006.
   Bernstein, M. et al. Soylent: A Word Processor with a Crowd Inside. UIST 2010. Best Student Paper award.
    Bederson, B.B., Hu, C., & Resnik, P. Translation by Iterative Collaboration between Monolingual Users, Proceedings of Graphics
    Interface (GI 2010), 39-46.
   N. Bradburn, S. Sudman, and B. Wansink. Asking Questions: The Definitive Guide to Questionnaire Design, Jossey-Bass, 2004.
   C. Callison-Burch. “Fast, Cheap, and Creative: Evaluating Translation Quality Using Amazon’s Mechanical Turk”, EMNLP 2009.
    P. Dai, Mausam, and D. Weld. “Decision-Theoretic Control of Crowd-Sourced Workflows”, AAAI, 2010.
   J. Davis et al. “The HPU”, IEEE Computer Vision and Pattern Recognition Workshop on Advancing Computer Vision with Human
    in the Loop (ACVHL), June 2010.
   M. Gashler, C. Giraud-Carrier, T. Martinez. Decision Tree Ensemble: Small Heterogeneous Is Better Than Large Homogeneous, ICMLA 2008.
    D. A. Grier. When Computers Were Human. Princeton University Press, 2005. ISBN 0691091579
    S. Hacker and L. von Ahn. “Matchin: Eliciting User Preferences with an Online Game”, CHI 2009.
    J. Heer, M. Bostock. “Crowdsourcing Graphical Perception: Using Mechanical Turk to Assess Visualization Design”, CHI 2010.
   P. Heymann and H. Garcia-Molina. “Human Processing”, Technical Report, Stanford Info Lab, 2010.
   J. Howe. “Crowdsourcing: Why the Power of the Crowd Is Driving the Future of Business”. Crown Business, New York, 2008.
    P. Hsueh, P. Melville, V. Sindhwani. “Data Quality from Crowdsourcing: A Study of Annotation Selection Criteria”. NAACL HLT
     Workshop on Active Learning and NLP, 2009.
    B. Huberman, D. Romero, and F. Wu. “Crowdsourcing, attention and productivity”. Journal of Information Science, 2009.
   P.G. Ipeirotis. The New Demographics of Mechanical Turk. March 9, 2010. PDF and Spreadsheet.
   P.G. Ipeirotis, R. Chandrasekar and P. Bennett. Report on the human computation workshop. SIGKDD Explorations v11 no 2 pp. 80-83, 2010.
   P.G. Ipeirotis. Analyzing the Amazon Mechanical Turk Marketplace. CeDER-10-04 (Sept. 11, 2010)

     July 24, 2011           Crowdsourcing for Information Retrieval: Principles, Methods, and Applications                    168
Bibliography: Other (2)
    A. Kittur, E. Chi, and B. Suh. “Crowdsourcing user studies with Mechanical Turk”, SIGCHI 2008.
    Aniket Kittur, Boris Smus, Robert E. Kraut. CrowdForge: Crowdsourcing Complex Work. CHI 2011
    Adriana Kovashka and Matthew Lease. “Human and Machine Detection of … Similarity in Art”. CrowdConf 2010.
    K. Krippendorff. "Content Analysis", Sage Publications, 2003
    G. Little, L. Chilton, M. Goldman, and R. Miller. “TurKit: Tools for Iterative Tasks on Mechanical Turk”, HCOMP 2009.
    T. Malone, R. Laubacher, and C. Dellarocas. Harnessing Crowds: Mapping the Genome of Collective Intelligence.
     2009.
    W. Mason and D. Watts. “Financial Incentives and the ’Performance of Crowds’”, HCOMP Workshop at KDD 2009.
    J. Nielsen. “Usability Engineering”, Morgan-Kaufman, 1994.
    A. Quinn and B. Bederson. “A Taxonomy of Distributed Human Computation”, Technical Report HCIL-2009-23, 2009
    J. Ross, L. Irani, M. Six Silberman, A. Zaldivar, and B. Tomlinson. “Who are the Crowdworkers?: Shifting
     Demographics in Amazon Mechanical Turk”. CHI 2010.
    F. Scheuren. “What is a Survey” (http://www.whatisasurvey.info) 2004.
    R. Snow, B. O’Connor, D. Jurafsky, and A. Y. Ng. “Cheap and Fast But is it Good? Evaluating Non-Expert Annotations
     for Natural Language Tasks”. EMNLP-2008.
    V. Sheng, F. Provost, P. Ipeirotis. “Get Another Label? Improving Data Quality … Using Multiple, Noisy Labelers”
     KDD 2008.
    S. Weber. “The Success of Open Source”, Harvard University Press, 2004.
    L. von Ahn. Games with a purpose. Computer, 39 (6), 92–94, 2006.
    L. von Ahn and L. Dabbish. “Designing Games with a purpose”. CACM, Vol. 51, No. 8, 2008.

July 24, 2011         Crowdsourcing for Information Retrieval: Principles, Methods, and Applications                 169
Bibliography: Other (3)
     C. Marshall and F. Shipman. “The Ownership and Reuse of Visual Media”, JCDL, 2011.
    AnHai Doan, Raghu Ramakrishnan, Alon Y. Halevy: Crowdsourcing systems on the World-Wide Web. CACM, 2011
    Paul Heymann, Hector Garcia-Molina: Turkalytics: analytics for human computation. WWW 2011.




July 24, 2011       Crowdsourcing for Information Retrieval: Principles, Methods, and Applications       170
Other Resources
Blogs
 Behind Enemy Lines (P.G. Ipeirotis, NYU)
 Deneme: a Mechanical Turk experiments blog (Greg Little, MIT)
 CrowdFlower Blog
 http://experimentalturk.wordpress.com
 Jeff Howe

Sites
 The Crowdsortium
 Crowdsourcing.org
 CrowdsourceBase (for workers)
 Daily Crowdsource

MTurk Forums and Resources
 Turker Nation: http://turkers.proboards.com
 http://www.turkalert.com (and its blog)
 Turkopticon: report/avoid shady requestors
 Amazon Forum for MTurk




July 24, 2011       Crowdsourcing for Information Retrieval: Principles, Methods, and Applications   171

Más contenido relacionado

La actualidad más candente

Dark web by Claudine Impas
Dark web by Claudine ImpasDark web by Claudine Impas
Dark web by Claudine ImpasClaudine Impas
 
The Digital Sociology of Generative AI (1).pptx
The Digital Sociology of Generative AI (1).pptxThe Digital Sociology of Generative AI (1).pptx
The Digital Sociology of Generative AI (1).pptxMark Carrigan
 
Online dating presentation
Online dating presentationOnline dating presentation
Online dating presentationVijay Thapa
 
ppt about chatgpt.pptx
ppt about chatgpt.pptxppt about chatgpt.pptx
ppt about chatgpt.pptxSrinivas237938
 
Deep dive into ChatGPT
Deep dive into ChatGPTDeep dive into ChatGPT
Deep dive into ChatGPTvaluebound
 
THE IMPACT OF DIGITAL COMMUNICATION ON SOCIAL NETWORK
THE IMPACT OF DIGITAL COMMUNICATION ON SOCIAL NETWORKTHE IMPACT OF DIGITAL COMMUNICATION ON SOCIAL NETWORK
THE IMPACT OF DIGITAL COMMUNICATION ON SOCIAL NETWORKAbdul Razaq
 
Web Scraping With Python
Web Scraping With PythonWeb Scraping With Python
Web Scraping With PythonRobert Dempsey
 
AI, Machine Learning, and Data Science Concepts
AI, Machine Learning, and Data Science ConceptsAI, Machine Learning, and Data Science Concepts
AI, Machine Learning, and Data Science ConceptsDan O'Leary
 
Chatbots - The Business Opportunity
Chatbots - The Business OpportunityChatbots - The Business Opportunity
Chatbots - The Business OpportunityAlexandros Ivos
 
Chat GPT - A Game Changer in Education
Chat GPT - A Game Changer in EducationChat GPT - A Game Changer in Education
Chat GPT - A Game Changer in EducationThiyagu K
 
Blueprint ChatGPT Lunch & Learn
Blueprint ChatGPT Lunch & LearnBlueprint ChatGPT Lunch & Learn
Blueprint ChatGPT Lunch & Learngnakan
 
Artificial Intelligence and Machine Learning
Artificial Intelligence and Machine LearningArtificial Intelligence and Machine Learning
Artificial Intelligence and Machine LearningMykola Dobrochynskyy
 
Lecture1 AI1 Introduction to artificial intelligence
Lecture1 AI1 Introduction to artificial intelligenceLecture1 AI1 Introduction to artificial intelligence
Lecture1 AI1 Introduction to artificial intelligenceAlbert Orriols-Puig
 
Tutorial of Sentiment Analysis
Tutorial of Sentiment AnalysisTutorial of Sentiment Analysis
Tutorial of Sentiment AnalysisFabio Benedetti
 
advantages and disadvantages of internet
advantages and disadvantages of internetadvantages and disadvantages of internet
advantages and disadvantages of internetAli Şahin
 

La actualidad más candente (20)

Children and future internet
Children and future internetChildren and future internet
Children and future internet
 
Dark web by Claudine Impas
Dark web by Claudine ImpasDark web by Claudine Impas
Dark web by Claudine Impas
 
Introduction to AI Ethics
Introduction to AI EthicsIntroduction to AI Ethics
Introduction to AI Ethics
 
Model bias in AI
Model bias in AIModel bias in AI
Model bias in AI
 
The Digital Sociology of Generative AI (1).pptx
The Digital Sociology of Generative AI (1).pptxThe Digital Sociology of Generative AI (1).pptx
The Digital Sociology of Generative AI (1).pptx
 
Online dating presentation
Online dating presentationOnline dating presentation
Online dating presentation
 
ppt about chatgpt.pptx
ppt about chatgpt.pptxppt about chatgpt.pptx
ppt about chatgpt.pptx
 
Deep dive into ChatGPT
Deep dive into ChatGPTDeep dive into ChatGPT
Deep dive into ChatGPT
 
THE IMPACT OF DIGITAL COMMUNICATION ON SOCIAL NETWORK
THE IMPACT OF DIGITAL COMMUNICATION ON SOCIAL NETWORKTHE IMPACT OF DIGITAL COMMUNICATION ON SOCIAL NETWORK
THE IMPACT OF DIGITAL COMMUNICATION ON SOCIAL NETWORK
 
Web Scraping With Python
Web Scraping With PythonWeb Scraping With Python
Web Scraping With Python
 
AI, Machine Learning, and Data Science Concepts
AI, Machine Learning, and Data Science ConceptsAI, Machine Learning, and Data Science Concepts
AI, Machine Learning, and Data Science Concepts
 
Chatbots - The Business Opportunity
Chatbots - The Business OpportunityChatbots - The Business Opportunity
Chatbots - The Business Opportunity
 
Chat GPT - A Game Changer in Education
Chat GPT - A Game Changer in EducationChat GPT - A Game Changer in Education
Chat GPT - A Game Changer in Education
 
Blueprint ChatGPT Lunch & Learn
Blueprint ChatGPT Lunch & LearnBlueprint ChatGPT Lunch & Learn
Blueprint ChatGPT Lunch & Learn
 
Artificial Intelligence and Machine Learning
Artificial Intelligence and Machine LearningArtificial Intelligence and Machine Learning
Artificial Intelligence and Machine Learning
 
Lecture1 AI1 Introduction to artificial intelligence
Lecture1 AI1 Introduction to artificial intelligenceLecture1 AI1 Introduction to artificial intelligence
Lecture1 AI1 Introduction to artificial intelligence
 
Tutorial of Sentiment Analysis
Tutorial of Sentiment AnalysisTutorial of Sentiment Analysis
Tutorial of Sentiment Analysis
 
Deep web Seminar
Deep web Seminar Deep web Seminar
Deep web Seminar
 
advantages and disadvantages of internet
advantages and disadvantages of internetadvantages and disadvantages of internet
advantages and disadvantages of internet
 
Internet addiction in brain
Internet addiction in brainInternet addiction in brain
Internet addiction in brain
 

Destacado

Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)Bhaskar Mitra
 
Introduction to Information Retrieval & Models
Introduction to Information Retrieval & ModelsIntroduction to Information Retrieval & Models
Introduction to Information Retrieval & ModelsMounia Lalmas-Roelleke
 
Mechanical Turk is Not Anonymous
Mechanical Turk is Not AnonymousMechanical Turk is Not Anonymous
Mechanical Turk is Not AnonymousMatthew Lease
 
Discovering and Navigating Memes in Social Media
Discovering and Navigating Memes in Social MediaDiscovering and Navigating Memes in Social Media
Discovering and Navigating Memes in Social MediaMatthew Lease
 
Crowd Computing: Opportunities & Challenges (IJCNLP 2011 Keynote)
Crowd Computing: Opportunities & Challenges (IJCNLP 2011 Keynote)Crowd Computing: Opportunities & Challenges (IJCNLP 2011 Keynote)
Crowd Computing: Opportunities & Challenges (IJCNLP 2011 Keynote)Matthew Lease
 
On implicatures, pragmatic enrichment and explicatures
On implicatures, pragmatic enrichment and explicaturesOn implicatures, pragmatic enrichment and explicatures
On implicatures, pragmatic enrichment and explicaturesLouis de Saussure
 
Statistical Information Retrieval Modelling: from the Probability Ranking Pr...
Statistical Information Retrieval Modelling:  from the Probability Ranking Pr...Statistical Information Retrieval Modelling:  from the Probability Ranking Pr...
Statistical Information Retrieval Modelling: from the Probability Ranking Pr...Jun Wang
 
Information Retrieval Models Part I
Information Retrieval Models Part IInformation Retrieval Models Part I
Information Retrieval Models Part IIngo Frommholz
 
Dynamic Information Retrieval Tutorial - SIGIR 2015
Dynamic Information Retrieval Tutorial - SIGIR 2015Dynamic Information Retrieval Tutorial - SIGIR 2015
Dynamic Information Retrieval Tutorial - SIGIR 2015Marc Sloan
 
Humanizing The Machine
Humanizing The MachineHumanizing The Machine
Humanizing The MachineCrowdFlower
 
Crowdsourcing For Research and Engineering (Tutorial given at CrowdConf 2011)
Crowdsourcing For Research and Engineering (Tutorial given at CrowdConf 2011)Crowdsourcing For Research and Engineering (Tutorial given at CrowdConf 2011)
Crowdsourcing For Research and Engineering (Tutorial given at CrowdConf 2011)Matthew Lease
 
Relevant information and decision making
Relevant information and decision makingRelevant information and decision making
Relevant information and decision makingAliza Racelis
 
10 tips for Successful Crowdsourcing
10 tips for Successful Crowdsourcing10 tips for Successful Crowdsourcing
10 tips for Successful CrowdsourcingJW Alphenaar
 
Preparing #EdCampSantiago 2013: Looking @ Dialogue
Preparing #EdCampSantiago 2013: Looking @ Dialogue Preparing #EdCampSantiago 2013: Looking @ Dialogue
Preparing #EdCampSantiago 2013: Looking @ Dialogue Baker Publishing Company
 
Building crowdsourcing applications
Building crowdsourcing applicationsBuilding crowdsourcing applications
Building crowdsourcing applicationsSimon Willison
 
Web Information Extraction Learning based on Probabilistic Graphical Models
Web Information Extraction Learning based on Probabilistic Graphical ModelsWeb Information Extraction Learning based on Probabilistic Graphical Models
Web Information Extraction Learning based on Probabilistic Graphical ModelsGUANBO
 
IRE- Algorithm Name Detection in Research Papers
IRE- Algorithm Name Detection in Research PapersIRE- Algorithm Name Detection in Research Papers
IRE- Algorithm Name Detection in Research PapersSriTeja Allaparthi
 
Mining Product Synonyms - Slides
Mining Product Synonyms - SlidesMining Product Synonyms - Slides
Mining Product Synonyms - SlidesAnkush Jain
 

Destacado (20)

Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)
 
Introduction to Information Retrieval & Models
Introduction to Information Retrieval & ModelsIntroduction to Information Retrieval & Models
Introduction to Information Retrieval & Models
 
Mechanical Turk is Not Anonymous
Mechanical Turk is Not AnonymousMechanical Turk is Not Anonymous
Mechanical Turk is Not Anonymous
 
Discovering and Navigating Memes in Social Media
Discovering and Navigating Memes in Social MediaDiscovering and Navigating Memes in Social Media
Discovering and Navigating Memes in Social Media
 
Crowd Computing: Opportunities & Challenges (IJCNLP 2011 Keynote)
Crowd Computing: Opportunities & Challenges (IJCNLP 2011 Keynote)Crowd Computing: Opportunities & Challenges (IJCNLP 2011 Keynote)
Crowd Computing: Opportunities & Challenges (IJCNLP 2011 Keynote)
 
On implicatures, pragmatic enrichment and explicatures
On implicatures, pragmatic enrichment and explicaturesOn implicatures, pragmatic enrichment and explicatures
On implicatures, pragmatic enrichment and explicatures
 
Naomi's Final PPT
Naomi's Final PPTNaomi's Final PPT
Naomi's Final PPT
 
Statistical Information Retrieval Modelling: from the Probability Ranking Pr...
Statistical Information Retrieval Modelling:  from the Probability Ranking Pr...Statistical Information Retrieval Modelling:  from the Probability Ranking Pr...
Statistical Information Retrieval Modelling: from the Probability Ranking Pr...
 
FLIRT crowdsourcing
FLIRT crowdsourcingFLIRT crowdsourcing
FLIRT crowdsourcing
 
Information Retrieval Models Part I
Information Retrieval Models Part IInformation Retrieval Models Part I
Information Retrieval Models Part I
 
Dynamic Information Retrieval Tutorial - SIGIR 2015
Dynamic Information Retrieval Tutorial - SIGIR 2015Dynamic Information Retrieval Tutorial - SIGIR 2015
Dynamic Information Retrieval Tutorial - SIGIR 2015
 
Humanizing The Machine
Humanizing The MachineHumanizing The Machine
Humanizing The Machine
 
Crowdsourcing For Research and Engineering (Tutorial given at CrowdConf 2011)
Crowdsourcing For Research and Engineering (Tutorial given at CrowdConf 2011)Crowdsourcing For Research and Engineering (Tutorial given at CrowdConf 2011)
Crowdsourcing For Research and Engineering (Tutorial given at CrowdConf 2011)
 
Relevant information and decision making
Relevant information and decision makingRelevant information and decision making
Relevant information and decision making
 
10 tips for Successful Crowdsourcing
10 tips for Successful Crowdsourcing10 tips for Successful Crowdsourcing
10 tips for Successful Crowdsourcing
 
Preparing #EdCampSantiago 2013: Looking @ Dialogue
Preparing #EdCampSantiago 2013: Looking @ Dialogue Preparing #EdCampSantiago 2013: Looking @ Dialogue
Preparing #EdCampSantiago 2013: Looking @ Dialogue
 
Building crowdsourcing applications
Building crowdsourcing applicationsBuilding crowdsourcing applications
Building crowdsourcing applications
 
Web Information Extraction Learning based on Probabilistic Graphical Models
Web Information Extraction Learning based on Probabilistic Graphical ModelsWeb Information Extraction Learning based on Probabilistic Graphical Models
Web Information Extraction Learning based on Probabilistic Graphical Models
 
IRE- Algorithm Name Detection in Research Papers
IRE- Algorithm Name Detection in Research PapersIRE- Algorithm Name Detection in Research Papers
IRE- Algorithm Name Detection in Research Papers
 
Mining Product Synonyms - Slides
Mining Product Synonyms - SlidesMining Product Synonyms - Slides
Mining Product Synonyms - Slides
 

Similar a Crowdsourcing for Information Retrieval: Principles, Methods, and Applications

Crowdsourcing & Human Computation Labeling Data & Building Hybrid Systems
Crowdsourcing & Human Computation Labeling Data & Building Hybrid SystemsCrowdsourcing & Human Computation Labeling Data & Building Hybrid Systems
Crowdsourcing & Human Computation Labeling Data & Building Hybrid SystemsMatthew Lease
 
Scaling Training Data for AI Applications
Scaling Training Data for AI ApplicationsScaling Training Data for AI Applications
Scaling Training Data for AI ApplicationsApplause
 
Introduction Data Science.pptx
Introduction Data Science.pptxIntroduction Data Science.pptx
Introduction Data Science.pptxAkhirulAminulloh2
 
BDVe Webinar Series - QROWD: The Human Factor in Big Data
BDVe Webinar Series - QROWD: The Human Factor in Big DataBDVe Webinar Series - QROWD: The Human Factor in Big Data
BDVe Webinar Series - QROWD: The Human Factor in Big DataBig Data Value Association
 
BDVe Webinar Series - QROWD: The Human Factor in Big Data
BDVe Webinar Series - QROWD: The Human Factor in Big DataBDVe Webinar Series - QROWD: The Human Factor in Big Data
BDVe Webinar Series - QROWD: The Human Factor in Big DataBig Data Value Association
 
Machine Learning: Applications, Process and Techniques
Machine Learning: Applications, Process and TechniquesMachine Learning: Applications, Process and Techniques
Machine Learning: Applications, Process and TechniquesRui Pedro Paiva
 
De liddo & Buckingham Shum jurix2012
De liddo & Buckingham Shum jurix2012De liddo & Buckingham Shum jurix2012
De liddo & Buckingham Shum jurix2012Anna De Liddo
 
Reflections from a realist evaluation in progress: Scaling ladders and stitch...
Reflections from a realist evaluation in progress: Scaling ladders and stitch...Reflections from a realist evaluation in progress: Scaling ladders and stitch...
Reflections from a realist evaluation in progress: Scaling ladders and stitch...Debbie_at_IDS
 
Human factor in big data qrowd bdve
Human factor in big data qrowd bdveHuman factor in big data qrowd bdve
Human factor in big data qrowd bdveLuis Daniel Ibáñez
 
Philips john huffman
Philips john huffmanPhilips john huffman
Philips john huffmanBigDataExpo
 
Data_Science_Applications_&_Use_Cases.pdf
Data_Science_Applications_&_Use_Cases.pdfData_Science_Applications_&_Use_Cases.pdf
Data_Science_Applications_&_Use_Cases.pdfvishal choudhary
 
The Transpose Technique On Number Of Transactions Of...
The Transpose Technique On Number Of Transactions Of...The Transpose Technique On Number Of Transactions Of...
The Transpose Technique On Number Of Transactions Of...Amanda Brady
 
datamining_Lecture_1(introduction).pptx
datamining_Lecture_1(introduction).pptxdatamining_Lecture_1(introduction).pptx
datamining_Lecture_1(introduction).pptxHASHEMHASH
 
Crowd sourced intelligence built into search over hadoop
Crowd sourced intelligence built into search over hadoopCrowd sourced intelligence built into search over hadoop
Crowd sourced intelligence built into search over hadooplucenerevolution
 
Data_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptxData_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptxssuser1a4f0f
 
Data_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptxData_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptxwahiba ben abdessalem
 

Similar a Crowdsourcing for Information Retrieval: Principles, Methods, and Applications (20)

Crowdsourcing & Human Computation Labeling Data & Building Hybrid Systems
Crowdsourcing & Human Computation Labeling Data & Building Hybrid SystemsCrowdsourcing & Human Computation Labeling Data & Building Hybrid Systems
Crowdsourcing & Human Computation Labeling Data & Building Hybrid Systems
 
Scaling Training Data for AI Applications
Scaling Training Data for AI ApplicationsScaling Training Data for AI Applications
Scaling Training Data for AI Applications
 
Introduction Data Science.pptx
Introduction Data Science.pptxIntroduction Data Science.pptx
Introduction Data Science.pptx
 
BDVe Webinar Series - QROWD: The Human Factor in Big Data
BDVe Webinar Series - QROWD: The Human Factor in Big DataBDVe Webinar Series - QROWD: The Human Factor in Big Data
BDVe Webinar Series - QROWD: The Human Factor in Big Data
 
BDVe Webinar Series - QROWD: The Human Factor in Big Data
BDVe Webinar Series - QROWD: The Human Factor in Big DataBDVe Webinar Series - QROWD: The Human Factor in Big Data
BDVe Webinar Series - QROWD: The Human Factor in Big Data
 
Machine Learning: Applications, Process and Techniques
Machine Learning: Applications, Process and TechniquesMachine Learning: Applications, Process and Techniques
Machine Learning: Applications, Process and Techniques
 
De liddo & Buckingham Shum jurix2012
De liddo & Buckingham Shum jurix2012De liddo & Buckingham Shum jurix2012
De liddo & Buckingham Shum jurix2012
 
Reflections from a realist evaluation in progress: Scaling ladders and stitch...
Reflections from a realist evaluation in progress: Scaling ladders and stitch...Reflections from a realist evaluation in progress: Scaling ladders and stitch...
Reflections from a realist evaluation in progress: Scaling ladders and stitch...
 
Human factor in big data qrowd bdve
Human factor in big data qrowd bdveHuman factor in big data qrowd bdve
Human factor in big data qrowd bdve
 
Philips john huffman
Philips john huffmanPhilips john huffman
Philips john huffman
 
Ba digital
Ba digitalBa digital
Ba digital
 
Webinar Next Week: Beyond Online Intake: Looking at Triage and Expert Systems
Webinar Next Week:  Beyond Online Intake: Looking at Triage and Expert SystemsWebinar Next Week:  Beyond Online Intake: Looking at Triage and Expert Systems
Webinar Next Week: Beyond Online Intake: Looking at Triage and Expert Systems
 
Data_Science_Applications_&_Use_Cases.pdf
Data_Science_Applications_&_Use_Cases.pdfData_Science_Applications_&_Use_Cases.pdf
Data_Science_Applications_&_Use_Cases.pdf
 
The Transpose Technique On Number Of Transactions Of...
The Transpose Technique On Number Of Transactions Of...The Transpose Technique On Number Of Transactions Of...
The Transpose Technique On Number Of Transactions Of...
 
datamining_Lecture_1(introduction).pptx
datamining_Lecture_1(introduction).pptxdatamining_Lecture_1(introduction).pptx
datamining_Lecture_1(introduction).pptx
 
Crowd sourced intelligence built into search over hadoop
Crowd sourced intelligence built into search over hadoopCrowd sourced intelligence built into search over hadoop
Crowd sourced intelligence built into search over hadoop
 
Machine learning in Banks
Machine learning in BanksMachine learning in Banks
Machine learning in Banks
 
Data_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptxData_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptx
 
Data_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptxData_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptx
 
KTDRR Measuring for Impact_Peter Levesque
KTDRR Measuring for Impact_Peter LevesqueKTDRR Measuring for Impact_Peter Levesque
KTDRR Measuring for Impact_Peter Levesque
 

Más de Matthew Lease

Automated Models for Quantifying Centrality of Survey Responses
Automated Models for Quantifying Centrality of Survey ResponsesAutomated Models for Quantifying Centrality of Survey Responses
Automated Models for Quantifying Centrality of Survey ResponsesMatthew Lease
 
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...Matthew Lease
 
Explainable Fact Checking with Humans in-the-loop
Explainable Fact Checking with Humans in-the-loopExplainable Fact Checking with Humans in-the-loop
Explainable Fact Checking with Humans in-the-loopMatthew Lease
 
Adventures in Crowdsourcing : Toward Safer Content Moderation & Better Suppor...
Adventures in Crowdsourcing : Toward Safer Content Moderation & Better Suppor...Adventures in Crowdsourcing : Toward Safer Content Moderation & Better Suppor...
Adventures in Crowdsourcing : Toward Safer Content Moderation & Better Suppor...Matthew Lease
 
AI & Work, with Transparency & the Crowd
AI & Work, with Transparency & the Crowd AI & Work, with Transparency & the Crowd
AI & Work, with Transparency & the Crowd Matthew Lease
 
Designing Human-AI Partnerships to Combat Misinfomation
Designing Human-AI Partnerships to Combat Misinfomation Designing Human-AI Partnerships to Combat Misinfomation
Designing Human-AI Partnerships to Combat Misinfomation Matthew Lease
 
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...Matthew Lease
 
But Who Protects the Moderators?
But Who Protects the Moderators?But Who Protects the Moderators?
But Who Protects the Moderators?Matthew Lease
 
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...Matthew Lease
 
Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collectio...
Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collectio...Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collectio...
Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collectio...Matthew Lease
 
Fact Checking & Information Retrieval
Fact Checking & Information RetrievalFact Checking & Information Retrieval
Fact Checking & Information RetrievalMatthew Lease
 
Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...
Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...
Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...Matthew Lease
 
What Can Machine Learning & Crowdsourcing Do for You? Exploring New Tools for...
What Can Machine Learning & Crowdsourcing Do for You? Exploring New Tools for...What Can Machine Learning & Crowdsourcing Do for You? Exploring New Tools for...
What Can Machine Learning & Crowdsourcing Do for You? Exploring New Tools for...Matthew Lease
 
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesDeep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesMatthew Lease
 
Systematic Review is e-Discovery in Doctor’s Clothing
Systematic Review is e-Discovery in Doctor’s ClothingSystematic Review is e-Discovery in Doctor’s Clothing
Systematic Review is e-Discovery in Doctor’s ClothingMatthew Lease
 
The Rise of Crowd Computing (July 7, 2016)
The Rise of Crowd Computing (July 7, 2016)The Rise of Crowd Computing (July 7, 2016)
The Rise of Crowd Computing (July 7, 2016)Matthew Lease
 
The Rise of Crowd Computing - 2016
The Rise of Crowd Computing - 2016The Rise of Crowd Computing - 2016
The Rise of Crowd Computing - 2016Matthew Lease
 
The Rise of Crowd Computing (December 2015)
The Rise of Crowd Computing (December 2015)The Rise of Crowd Computing (December 2015)
The Rise of Crowd Computing (December 2015)Matthew Lease
 
Toward Better Crowdsourcing Science
 Toward Better Crowdsourcing Science Toward Better Crowdsourcing Science
Toward Better Crowdsourcing ScienceMatthew Lease
 
Beyond Mechanical Turk: An Analysis of Paid Crowd Work Platforms
Beyond Mechanical Turk: An Analysis of Paid Crowd Work PlatformsBeyond Mechanical Turk: An Analysis of Paid Crowd Work Platforms
Beyond Mechanical Turk: An Analysis of Paid Crowd Work PlatformsMatthew Lease
 

Más de Matthew Lease (20)

Automated Models for Quantifying Centrality of Survey Responses
Automated Models for Quantifying Centrality of Survey ResponsesAutomated Models for Quantifying Centrality of Survey Responses
Automated Models for Quantifying Centrality of Survey Responses
 
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...
 
Explainable Fact Checking with Humans in-the-loop
Explainable Fact Checking with Humans in-the-loopExplainable Fact Checking with Humans in-the-loop
Explainable Fact Checking with Humans in-the-loop
 
Adventures in Crowdsourcing : Toward Safer Content Moderation & Better Suppor...
Adventures in Crowdsourcing : Toward Safer Content Moderation & Better Suppor...Adventures in Crowdsourcing : Toward Safer Content Moderation & Better Suppor...
Adventures in Crowdsourcing : Toward Safer Content Moderation & Better Suppor...
 
AI & Work, with Transparency & the Crowd
AI & Work, with Transparency & the Crowd AI & Work, with Transparency & the Crowd
AI & Work, with Transparency & the Crowd
 
Designing Human-AI Partnerships to Combat Misinfomation
Designing Human-AI Partnerships to Combat Misinfomation Designing Human-AI Partnerships to Combat Misinfomation
Designing Human-AI Partnerships to Combat Misinfomation
 
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...
 
But Who Protects the Moderators?
But Who Protects the Moderators?But Who Protects the Moderators?
But Who Protects the Moderators?
 
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...
 
Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collectio...
Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collectio...Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collectio...
Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collectio...
 
Fact Checking & Information Retrieval
Fact Checking & Information RetrievalFact Checking & Information Retrieval
Fact Checking & Information Retrieval
 
Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...
Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...
Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...
 
What Can Machine Learning & Crowdsourcing Do for You? Exploring New Tools for...
What Can Machine Learning & Crowdsourcing Do for You? Exploring New Tools for...What Can Machine Learning & Crowdsourcing Do for You? Exploring New Tools for...
What Can Machine Learning & Crowdsourcing Do for You? Exploring New Tools for...
 
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesDeep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
 
Systematic Review is e-Discovery in Doctor’s Clothing
Systematic Review is e-Discovery in Doctor’s ClothingSystematic Review is e-Discovery in Doctor’s Clothing
Systematic Review is e-Discovery in Doctor’s Clothing
 
The Rise of Crowd Computing (July 7, 2016)
The Rise of Crowd Computing (July 7, 2016)The Rise of Crowd Computing (July 7, 2016)
The Rise of Crowd Computing (July 7, 2016)
 
The Rise of Crowd Computing - 2016
The Rise of Crowd Computing - 2016The Rise of Crowd Computing - 2016
The Rise of Crowd Computing - 2016
 
The Rise of Crowd Computing (December 2015)
The Rise of Crowd Computing (December 2015)The Rise of Crowd Computing (December 2015)
The Rise of Crowd Computing (December 2015)
 
Toward Better Crowdsourcing Science
 Toward Better Crowdsourcing Science Toward Better Crowdsourcing Science
Toward Better Crowdsourcing Science
 
Beyond Mechanical Turk: An Analysis of Paid Crowd Work Platforms
Beyond Mechanical Turk: An Analysis of Paid Crowd Work PlatformsBeyond Mechanical Turk: An Analysis of Paid Crowd Work Platforms
Beyond Mechanical Turk: An Analysis of Paid Crowd Work Platforms
 

Último

Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dashnarutouzumaki53779
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 

Último (20)

Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dash
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 

Crowdsourcing for Information Retrieval: Principles, Methods, and Applications

  • 12. This isn’t just a lab toy… http://www.mturk-tracker.com (P. Ipeirotis’10) From 1/09 – 4/10, 7M HITs from 10K requestors worth $500,000 USD (significant under-estimate) July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 12
  • 13. Why Crowdsourcing for IR? • Easy, cheap and fast labeling • Ready-to-use infrastructure – MTurk payments, workforce, interface widgets – CrowdFlower quality control mechanisms, etc. • Allows early, iterative, frequent experiments – Iteratively prototype and test new ideas – Try new tasks, test when you want & as you go • Proven in major IR shared task evaluations – CLEF image, TREC, INEX, WWW/Yahoo SemSearch July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 13
  • 14. Legal Disclaimer: Caution Tape and Silver Bullets • Often still involves more art than science • Not a magic panacea, but another alternative – one more data point for analysis, complements other methods • Quality may be sacrificed for time/cost/effort • Hard work & experimental design still required! July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 14
  • 15. Hello World Demo • We’ll show a simple, short demo of MTurk • This is a teaser highlighting things we’ll discuss – Don’t worry about details; we’ll revisit them • Specific task unimportant • Big idea: easy, fast, cheap to label with MTurk! July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 15
  • 16. Jane saw the man with the binoculars July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 16
  • 17. DEMO July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 17
  • 18. Traditional Annotation / Data Collection • Setup data collection software / harness • Recruit volunteers (often undergrads) • Pay a flat fee for experiment or hourly wage • Characteristics – Slow – Expensive – Tedious – Sample Bias July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 18
  • 19. How about some real examples? • Let’s see examples of MTurk’s use in prior studies (many areas!) – e.g. IR, NLP, computer vision, user studies, usability testing, psychological studies, surveys, … • Check bibliography at end for more references July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 19
  • 20. NLP Example – Dialect Identification July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 20
  • 21. NLP Example – Spelling correction July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 21
  • 22. NLP Example – Machine Translation • Manual evaluation on translation quality is slow and expensive • High agreement between non-experts and experts • $0.10 to translate a sentence C. Callison-Burch. “Fast, Cheap, and Creative: Evaluating Translation Quality Using Amazon’s Mechanical Turk”, EMNLP 2009. B. Bederson et al. Translation by Iterative Collaboration between Monolingual Users, GI 2010 July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 22
  • 23. Snow et al. (2008). EMNLP • 5 Tasks – Affect recognition – Word similarity – Recognizing textual entailment – Event temporal ordering – Word sense disambiguation • high agreement between crowd labels and expert “gold” labels – assumes training data for worker bias correction • 22K labels for $26 ! July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 23
  • 24. CV Example – Painting Similarity Kovashka & Lease, CrowdConf’10 July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 24
  • 25. IR Example – Relevance and ads July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 25
  • 26. Okay, okay! I’m a believer! How can I get started with MTurk? • You have an idea (e.g. novel IR technique) • Hiring editors too difficult / expensive / slow • You don’t have a large traffic query log Can you test your idea via crowdsourcing? • Is my idea crowdsourcable? • How do I start? • What do I need? July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 26
  • 27. II AMAZON MECHANICAL TURK July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 27
  • 28. The Requester • Sign up with your Amazon account • Amazon payments • Purchase prepaid HITs • There is no minimum or up-front fee • MTurk collects a 10% commission • The minimum commission charge is $0.005 per HIT July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 28
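Before publishing a batch, it helps to sanity-check the budget. The sketch below is a minimal back-of-the-envelope calculation in Python, assuming the fee structure quoted on this slide (10% commission with a $0.005 minimum, applied per paid assignment); actual fees and the exact basis of the minimum charge may differ, so treat the numbers as illustrative.

    # Rough batch cost estimate for an MTurk experiment (illustrative only).
    # Assumes the fee structure quoted on this slide: 10% commission,
    # with a minimum commission of $0.005 per paid assignment.

    def estimate_batch_cost(num_hits, assignments_per_hit, reward_per_assignment):
        """Return (worker_pay, commission, total) in dollars."""
        assignments = num_hits * assignments_per_hit
        worker_pay = assignments * reward_per_assignment
        commission = assignments * max(0.10 * reward_per_assignment, 0.005)
        return worker_pay, commission, worker_pay + commission

    if __name__ == "__main__":
        # Example: 1,000 query/document pairs, 5 workers each, $0.02 per judgment.
        pay, fee, total = estimate_batch_cost(1000, 5, 0.02)
        print("worker pay: $%.2f, commission: $%.2f, total: $%.2f" % (pay, fee, total))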
  • 29. MTurk Dashboard • Three tabs – Design – Publish – Manage • Design – HIT Template • Publish – Make work available • Manage – Monitor progress July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 29
  • 30. Dashboard - II July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 30
  • 31. API • Amazon Web Services API • Rich set of services • Command line tools • More flexibility than dashboard July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 31
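To give a flavor of the programmatic route, here is a minimal sketch of publishing a HIT through the Boto Python library mentioned later on slide 99. Class and parameter names follow the old boto.mturk bindings as best recalled and should be verified against the SDK you actually use; the credentials, URL, and parameter values are placeholders.

    # Minimal sketch of creating a HIT via the (old) boto.mturk bindings.
    # Names/signatures are assumptions to be checked against the SDK docs.
    from boto.mturk.connection import MTurkConnection
    from boto.mturk.question import ExternalQuestion

    mtc = MTurkConnection(aws_access_key_id="...",         # your credentials
                          aws_secret_access_key="...",
                          host="mechanicalturk.sandbox.amazonaws.com")  # sandbox first!

    # The task UI lives on your own server and is shown to workers in an iframe.
    question = ExternalQuestion(external_url="https://example.org/relevance_task",
                                frame_height=600)

    mtc.create_hit(question=question,
                   title="Judge the relevance of a web page",
                   description="Read a query and a page, answer one yes/no question",
                   keywords="relevance, search, judgment",
                   reward=0.02,             # dollars per assignment
                   max_assignments=5,       # redundant judgments per HIT
                   duration=600,            # seconds a worker has to finish
                   lifetime=3 * 24 * 3600)  # seconds the HIT stays available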
  • 32. Dashboard vs. API • Dashboard – Easy to prototype – Setup and launch an experiment in a few minutes • API – Ability to integrate AMT as part of a system – Ideal if you want to run experiments regularly – Schedule tasks July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 32
  • 33. But where do my labels come from? • An all powerful black box? • A magical, faraway land? July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 33
  • 34. Nope, MTurk has actual workers too! • Sign up with your Amazon account • Tabs – Account: work approved/rejected – HIT: browse and search for work – Qualifications: browse & search qualifications • Start turking! July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 34
  • 35. Doing some work • Strongly recommended • Do some work before you create work July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 35
  • 36. But who are my workers? • A. Baio, November 2008. The Faces of Mechanical Turk. • P. Ipeirotis. March 2010. The New Demographics of Mechanical Turk • J. Ross, et al. Who are the Crowdworkers?... CHI 2010. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 36
  • 37. Worker Demographics • 2008-2009 studies found the worker pool less global and diverse than previously thought – US – Female – Educated – Bored – Money is secondary July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 37
  • 38. 2010 shows increasing diversity 47% US, 34% India, 19% other (P. Ipeirotis. March 2010) July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 38
  • 39. Is MTurk my only choice? No, see below. • Crowdflower (since 2007, www.crowdflower.com) • CloudCrowd • DoMyStuff • Livework • Clickworker • SmartSheet • uTest • Elance • oDesk • vWorker (was rent-a-coder) July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 39
  • 40. CrowdFlower (since 2007) • Labor on-demand • Channels • Quality control features • Sponsor: CSE’10, CSDM’11, CIR’11, TREC’11 Crowd Track July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 40
  • 41. High-level Issues in Crowdsourcing • Process – Experimental design, annotation guidelines, iteration • Choose crowdsourcing platform (or roll your own) • Human factors – Payment / incentives, interface and interaction design, communication, reputation, recruitment, retention • Quality Control / Data Quality – Trust, reliability, spam detection, consensus labeling July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 41
  • 42. III RELEVANCE JUDGING & CROWDSOURCING July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 42
  • 43. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 43
  • 44. Relevance and IR • What is relevance? – Multidimensional – Dynamic – Complex but systematic and measurable • Relevance in Information Retrieval • Frameworks • Types – System or algorithmic – Topical – Pertinence – Situational – Motivational July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 44
  • 45. Evaluation • Relevance is hard to evaluate – Highly subjective – Expensive to measure • Click data • Professional editorial work • Verticals July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 45
  • 46. Crowdsourcing and Relevance Evaluation • For relevance, it combines two main approaches – Explicit judgments – Automated metrics • Other features – Large scale – Inexpensive – Diversity July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 46
  • 47. User Studies • Investigate attitudes about saving, sharing, publishing, and removing online photos • Survey – A scenario-based probe of respondent attitudes, designed to yield quantitative data – A set of questions (closed and open-ended) – Importance of recent activity – 41 questions – 7-point scale • 250 respondents C. Marshall and F. Shipman. “The Ownership and Reuse of Visual Media”, JCDL 2011. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 47
  • 48. Elicitation Criteria • Relevance in a vertical like e-commerce • Are classical criteria right for e-commerce? • Classical criteria (Barry and Schamber) – Accuracy & validity, consensus within the field, content novelty, depth & scope, presentation, recency, reliability, verifiability • E-commerce criteria – Brand name, product name, price/value (cheap, affordable, expensive, not suspiciously cheap), availability, ratings & user reviews, latest model/version, personal aspects, perceived value, genre & age • Experiment – Select e-C and non e-C queries – Each worker judges 1 query/need (e-C or non e-C) – 7 workers per HIT O. Alonso and S. Mizzaro. “Relevance criteria for e-commerce: a crowdsourcing-based experimental analysis”, SIGIR 2009. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 48
  • 49. IR Example – Product Search July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 49
  • 50. IR Example – Snippet Evaluation • Study on summary lengths • Determine preferred result length • Asked workers to categorize web queries • Asked workers to evaluate snippet quality • Payment between $0.01 and $0.05 per HIT M. Kaisser, M. Hearst, and L. Lowe. “Improving Search Results Quality by Customizing Summary Lengths”, ACL/HLT, 2008. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 50
  • 51. IR Example – Relevance Assessment • Replace TREC-like relevance assessors with MTurk? • Selected topic “space program” (011) • Modified original 4-page instructions from TREC • Workers more accurate than original assessors! • 40% provided justification for each answer O. Alonso and S. Mizzaro. “Can we get rid of TREC assessors? Using Mechanical Turk for relevance assessment”, SIGIR Workshop on the Future of IR Evaluation, 2009. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 51
  • 52. IR Example – Timeline Annotation • Workers annotate timeline on politics, sports, culture • Given a timex (1970s, 1982, etc.) suggest something • Given an event (Vietnam, World cup, etc.) suggest a timex K. Berberich, S. Bedathur, O. Alonso, G. Weikum “A Language Modeling Approach for Temporal Information Needs”. ECIR 2010 July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 52
  • 53. IR Example – Is Tweet Interesting? • Detecting uninteresting content in text streams – Alonso et al. SIGIR 2010 CSE Workshop. • Is this tweet interesting to the author and friends only? • Workers classify tweets • 5 tweets per HIT, 5 workers, $0.02 • 57% of tweets are categorically not interesting July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 53
  • 54. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 54
  • 55. Started with a joke … July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 55
  • 56. Results for {idiot} at WSDM February 2011: 5/7 (R), 2/7 (NR) – Most of the time those TV reality stars have absolutely no talent. They do whatever they can to make a quick dollar. Most of the time the reality tv stars don not have a mind of their own. R – Most are just celebrity wannabees. Many have little or no talent, they just want fame. R – I can see this one going both ways. A particular sort of reality star comes to mind, though, one who was voted off Survivor because he chose not to use his immunity necklace. Sometimes the label fits, but sometimes it might be unfair. R – Just because someone else thinks they are an "idiot", doesn't mean that is what the word means. I don't like to think that any one person's photo would be used to describe a certain term. NR – While some reality-television stars are genuinely stupid (or cultivate an image of stupidity), that does not mean they can or should be classified as "idiots." Some simply act that way to increase their TV exposure and potential earnings. Other reality-television stars are really intelligent people, and may be considered as idiots by people who don't like them or agree with them. It is too subjective an issue to be a good result for a search engine. NR – Have you seen the knuckledraggers on reality television? They should be required to change their names to idiot after appearing on the show. You could put numbers after the word idiot so we can tell them apart. R – Although I have not followed too many of these shows, those that I have encountered have for a great part a very common property. That property is that most of the participants involved exhibit a shallow self-serving personality that borders on social pathological behavior. To perform or act in such an abysmal way could only be an act of an idiot. R July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 56
  • 57. Two Simple Examples of MTurk 1. Ask workers to classify a query 2. Ask workers to judge document relevance Steps • Define high-level task • Design & implement interface & backend • Launch, monitor progress, and assess work • Iterate design July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 57
  • 58. Query Classification Task • Ask the user to classify a query • Show a form that contains a few categories • Upload a few queries (~20) • Use 3 workers July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 58
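When the Dashboard template route is used, the queries are typically uploaded as a CSV file whose column headers match the ${variable} placeholders in the HIT template. A minimal sketch for preparing that input, assuming the template uses a ${query} placeholder (the queries listed are placeholders for the ~20 in the real run):

    # Write the input file for a Dashboard HIT template that contains a
    # ${query} placeholder (one HIT will be generated per CSV row).
    import csv

    queries = ["britney spears", "salad recipes", "space program",
               "cheap flights to paris"]  # ~20 queries in the real run

    with open("query_classification_input.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["query"])   # header must match the template variable
        for q in queries:
            writer.writerow([q])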
  • 59. DEMO July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 59
  • 60. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 60
  • 61. Relevance Evaluation Task • Relevance assessment task • Use a few documents from TREC • Ask user to perform binary evaluation • Modification: graded evaluation • Use 5 workers July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 61
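Once the workers' judgments have been aggregated (more on that in the quality control section), it is convenient to write them out in TREC qrels format so standard tools such as trec_eval can consume them. A minimal sketch; the topic and document identifiers below are illustrative, and the judgments dict stands in for your aggregation step:

    # Dump aggregated crowd judgments as TREC-style qrels lines:
    # "<topic> 0 <docno> <relevance>"  (0 is the unused iteration field).
    judgments = {          # (topic, docno) -> aggregated label (example values)
        ("011", "LA010189-0001"): 1,
        ("011", "LA010189-0042"): 0,
    }

    with open("crowd.qrels", "w") as out:
        for (topic, docno), rel in sorted(judgments.items()):
            out.write("%s 0 %s %d\n" % (topic, docno, rel))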
  • 62. DEMO July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 62
  • 63. Typical Workflow • Define and design what to test • Sample data • Design the experiment • Run experiment • Collect data and analyze results • Quality control July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 63
  • 64. Crowdsourcing in Major IR Evaluations • CLEF image • Nowak and Ruger, MIR’10 • TREC blog • McCreadie et al., CSE’10, CSDM’11 • INEX book • Kazai et al., SIGIR’11 • SemSearch • Blanco et al., SIGIR’11 July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 64
  • 65. BREAK July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 65
  • 66. IV DESIGN OF EXPERIMENTS July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 66
  • 67. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 67
  • 68. Survey Design • One of the most important parts • Part art, part science • Instructions are key • Prepare to iterate July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 68
  • 69. Questionnaire Design • Ask the right questions • Workers may not be IR experts so don’t assume the same understanding in terms of terminology • Show examples • Hire a technical writer – Engineer writes the specification – Writer communicates July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 69
  • 70. UX Design • Time to apply all those usability concepts • Generic tips – Experiment should be self-contained. – Keep it short and simple. Brief and concise. – Be very clear with the relevance task. – Engage with the worker. Avoid boring stuff. – Always ask for feedback (open-ended question) in an input box. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 70
  • 71. UX Design - II • Presentation • Document design • Highlight important concepts • Colors and fonts • Need to grab attention • Localization July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 71
  • 72. Examples - I • Asking too much, task not clear, “do NOT/reject” • Worker has to do a lot of stuff July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 72
  • 73. Example - II • Lot of work for a few cents • Go here, go there, copy, enter, count … July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 73
  • 74. A Better Example • All information is available – What to do – Search result – Question to answer July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 74
  • 75. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 75
  • 76. Form and Metadata • Form with a closed question (binary relevance) and open-ended question (user feedback) • Clear title, useful keywords • Workers need to find your task July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 76
  • 77. TREC Assessment – Example I July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 77
  • 78. TREC Assessment – Example II July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 78
  • 79. How Much to Pay? • Price commensurate with task effort – Ex: $0.02 for yes/no answer + $0.02 bonus for optional feedback • Ethics & market-factors: W. Mason and S. Suri, 2010. – e.g. non-profit SamaSource contracts workers in refugee camps – Predict right price given market & task: Wang et al. CSDM’11 • Uptake & time-to-completion vs. Cost & Quality – Too little $$, no interest or slow – too much $$, attract spammers – Real problem is lack of reliable QA substrate • Accuracy & quantity – More pay = more work, not better (W. Mason and D. Watts, 2009) • Heuristics: start small, watch uptake and bargaining feedback • Worker retention (“anchoring”) See also: L.B. Chilton et al. KDD-HCOMP 2010. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 79
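A simple companion heuristic to the points above is to convert the per-HIT reward and observed completion times into an implied hourly rate before deciding whether the price is reasonable. The sketch below is plain arithmetic, not a pricing recommendation; the example numbers are made up.

    # Implied hourly wage from observed work times (illustrative heuristic).
    def implied_hourly_wage(reward_per_hit, median_seconds_per_hit):
        return reward_per_hit * 3600.0 / median_seconds_per_hit

    # Example: $0.02 per yes/no judgment, median 45 seconds of work per HIT.
    print("implied wage: $%.2f / hour" % implied_hourly_wage(0.02, 45))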
  • 80. Development Framework • Incremental approach • Measure, evaluate, and adjust as you go • Suitable for repeatable tasks July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 80
  • 81. Implementation • Similar to a UX • Build a mock up and test it with your team – Yes, you need to judge some tasks • Incorporate feedback and run a test on MTurk with a very small data set – Time the experiment – Do people understand the task? • Analyze results – Look for spammers – Check completion times • Iterate and modify accordingly July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 81
  • 82. Implementation – II • Introduce quality control – Qualification test – Gold answers (honey pots) • Adjust passing grade and worker approval rate • Run experiment with new settings & same data • Scale on data • Scale on workers July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 82
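One way to operationalize the "gold answers (honey pots)" bullet above: score each worker on the trap questions they answered and hold back, for manual review, anyone who falls below a passing grade. A minimal sketch; the threshold and minimum number of gold questions are assumptions you would tune.

    # Flag workers whose accuracy on gold (honey-pot) questions is too low.
    from collections import defaultdict

    def screen_workers(answers, gold, passing_grade=0.7, min_gold_seen=5):
        """answers: list of (worker_id, question_id, label); gold: question_id -> label."""
        correct = defaultdict(int)
        seen = defaultdict(int)
        for worker, qid, label in answers:
            if qid in gold:
                seen[worker] += 1
                correct[worker] += int(label == gold[qid])
        flagged = []
        for worker, n in seen.items():
            if n >= min_gold_seen and correct[worker] / n < passing_grade:
                flagged.append((worker, correct[worker] / n))
        return flagged  # candidates for manual review, not automatic rejection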
  • 83. Experiment in Production • Lots of tasks on MTurk at any moment • Need to grab attention • Importance of experiment metadata • When to schedule – Split a large task into batches and have 1 single batch in the system – Always review feedback from batch n before uploading n+1 July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 83
  • 84. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 84
  • 85. Quality Control • Extremely important part of the experiment • Approach as “overall” quality; not just for workers • Bi-directional channel – You may think the worker is doing a bad job. – The same worker may think you are a lousy requester. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 85
  • 86. Quality Control - II • Approval rate: easy to use, & just as easily defeated – P. Ipeirotis. Be a Top Mechanical Turk Worker: You Need $5 and 5 Minutes. Oct. 2010 • Mechanical Turk Masters (June 23, 2011) – Very recent addition, amount of benefit uncertain • Qualification test – Pre-screen workers’ ability to do the task (accurately) – Example and pros/cons in next slides • Assess worker quality as you go – Trap questions with known answers (“honey pots”) – Measure inter-annotator agreement between workers • No guarantees July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 86
  • 87. A qualification test snippet
    <Question>
      <QuestionIdentifier>question1</QuestionIdentifier>
      <QuestionContent>
        <Text>Carbon monoxide poisoning is</Text>
      </QuestionContent>
      <AnswerSpecification>
        <SelectionAnswer>
          <StyleSuggestion>radiobutton</StyleSuggestion>
          <Selections>
            <Selection>
              <SelectionIdentifier>1</SelectionIdentifier>
              <Text>A chemical technique</Text>
            </Selection>
            <Selection>
              <SelectionIdentifier>2</SelectionIdentifier>
              <Text>A green energy treatment</Text>
            </Selection>
            <Selection>
              <SelectionIdentifier>3</SelectionIdentifier>
              <Text>A phenomenon associated with sports</Text>
            </Selection>
            <Selection>
              <SelectionIdentifier>4</SelectionIdentifier>
              <Text>None of the above</Text>
            </Selection>
          </Selections>
        </SelectionAnswer>
      </AnswerSpecification>
    </Question>
    July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 87
  • 88. Qualification tests: pros and cons • Advantages – Great tool for controlling quality – Adjust passing grade • Disadvantages – Extra cost to design and implement the test – May turn off workers, hurt completion time – Refresh the test on a regular basis – Hard to verify subjective tasks like judging relevance • Try creating task-related questions to get worker familiar with task before starting task in earnest July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 88
  • 89. Methods for measuring agreement • What to look for – Agreement, reliability, validity • Inter-agreement level – Agreement between judges – Agreement between judges and the gold set • Some statistics – Percentage agreement – Cohen’s kappa (2 raters) – Fleiss’ kappa (any number of raters) – Krippendorff’s alpha • With majority vote, what if 2 say relevant, 3 say not? – Use expert to break ties (Kochhar et al, HCOMP’10; GQR) – Collect more judgments as needed to reduce uncertainty July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 89
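For two raters, percentage agreement and Cohen's kappa are quick to compute directly from the label lists; the sketch below follows the standard definitions (kappa = (p_o - p_e) / (1 - p_e), with chance agreement p_e from the raters' marginal label distributions). The example labels are made up.

    # Percentage agreement and Cohen's kappa for two raters over the same items.
    from collections import Counter

    def percentage_agreement(a, b):
        return sum(x == y for x, y in zip(a, b)) / float(len(a))

    def cohens_kappa(a, b):
        n = len(a)
        p_o = percentage_agreement(a, b)
        ca, cb = Counter(a), Counter(b)
        # Expected chance agreement from each rater's marginal label distribution.
        p_e = sum((ca[l] / float(n)) * (cb[l] / float(n)) for l in set(a) | set(b))
        return (p_o - p_e) / (1.0 - p_e)

    r1 = ["rel", "rel", "not", "rel", "not"]
    r2 = ["rel", "not", "not", "rel", "not"]
    print(percentage_agreement(r1, r2), cohens_kappa(r1, r2))  # 0.8, ~0.615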
  • 90. Inter-rater reliability • Lots of research • Statistics books cover most of the material • Three categories based on the goals – Consensus estimates – Consistency estimates – Measurement estimates July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 90
  • 91. Quality control on relevance assessments • INEX 2008 Book track • Home grown system (no MTurk) • Propose a game for collecting assessments • CRA Method G. Kazai, N. Milic-Frayling, and J. Costello. “Towards Methods for the Collective Gathering and Quality Control of Relevance Assessments”, SIGIR 2009. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 91
  • 92. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 92
  • 93. Quality Control & Assurance • Filtering – Approval rate (built-in but defeatable) – Geographic restrictions (e.g. US only, built-in) – Worker blocking – Qualification test • Con: slows down experiment, difficult to “test” relevance • Solution: create questions to let user get familiar before the assessment – Does not guarantee success • Assessing quality – Interject verifiable/gold answers (trap questions, honey pots) • P. Ipeirotis. Worker Evaluation in Crowdsourcing: Gold Data or Multiple Workers? Sept. 2010. – 2-tier approach: Group 1 does task, Group 2 verifies • Quinn and B. Bederson’09, Bernstein et al.’10 • Identify workers that always disagree with the majority – Risk: masking cases of ambiguity or diversity, “tail” behaviors July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 93
  • 94. More on quality control & assurance • HR issues: recruiting, selection, & retention – e.g., post/tweet, design a better qualification test, bonuses, … • Collect more redundant judgments… – at some point defeats cost savings of crowdsourcing – 5 workers is often sufficient • Use better aggregation method – Voting – Consensus – Averaging July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 94
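The simplest of the aggregation methods listed above is a plain majority vote over the redundant labels per item; ties can be routed to an expert tie-breaker or to additional judgments, as noted on the previous slide. A minimal sketch with made-up labels:

    # Majority-vote aggregation of redundant worker labels per item.
    from collections import Counter, defaultdict

    def majority_vote(labels_by_item):
        """labels_by_item: item_id -> list of worker labels."""
        consensus, ties = {}, []
        for item, labels in labels_by_item.items():
            (top, top_n), *rest = Counter(labels).most_common()
            if rest and rest[0][1] == top_n:
                ties.append(item)      # send to an expert or collect more labels
            else:
                consensus[item] = top
        return consensus, ties

    votes = defaultdict(list)
    for item, label in [("d1", 1), ("d1", 1), ("d1", 0), ("d2", 0), ("d2", 1)]:
        votes[item].append(label)
    print(majority_vote(votes))   # ({'d1': 1}, ['d2'])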
  • 95. Data quality • Data quality via repeated labeling • Repeated labeling can improve label quality and model quality • When labels are noisy, repeated labeling can be preferable to a single labeling • Cost issues with labeling V. Sheng, F. Provost, P. Ipeirotis. “Get Another Label? Improving Data Quality and Data Mining Using Multiple, Noisy Labelers” KDD 2008. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 95
  • 96. Scales and labels • Binary – Yes, No • 5-point Likert – Strongly disagree, disagree, neutral, agree, strongly agree • Graded relevance: – DCG: Irrelevant, marginally, fairly, highly (Jarvelin, 2000) – TREC: Highly relevant, relevant, (related), not relevant – Yahoo/MS: Perfect, excellent, good, fair, bad (PEGFB) – The Google Quality Raters Handbook (March 2008) – 0 to 10 (0 = totally irrelevant, 10 = most relevant) • Usability factors – Provide clear, concise labels that use plain language – Avoid unfamiliar jargon and terminology July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 96
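Graded labels such as those above feed directly into graded-relevance metrics like DCG. The sketch below uses one common formulation (gain 2^rel - 1 with a log2 rank discount); note that Jarvelin & Kekalainen's original paper uses a slightly different discount, so adjust to whichever variant your evaluation requires. The example grades are made up.

    # Discounted Cumulative Gain over a ranked list of graded relevance labels.
    import math

    def dcg(graded_labels, k=None):
        """graded_labels: relevance grades in rank order, e.g. [3, 2, 0, 1]."""
        labels = graded_labels[:k] if k else graded_labels
        return sum((2 ** rel - 1) / math.log2(rank + 2)
                   for rank, rel in enumerate(labels))

    print(dcg([3, 2, 0, 1], k=4))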
  • 97. Was the task difficult? • Ask workers to rate the difficulty of a topic • 50 topics, TREC; 5 workers, $0.01 per task July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 97
  • 98. Other quality heuristics • Justification/feedback as quasi-captcha – Successfully used at TREC and INEX experiments – Should be optional – Automatically verifying feedback was written by a person may be difficult (classic spam detection task) • Broken URL/incorrect object – Leave an outlier in the data set – Workers will tell you – If somebody answers “excellent” on a graded relevance test for a broken URL => probably spammer July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 98
  • 99. MTurk QA: Tools and Packages • QA infrastructure layers atop MTurk promote useful separation-of-concerns from task – TurkIt • Quik Turkit provides nearly realtime services – Turkit-online (??) – Get Another Label (& qmturk) – Turk Surveyor – cv-web-annotation-toolkit (image labeling) – Soylent – Boto (python library) • Turkpipe: submit batches of jobs using the command line. • More needed… July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 99
  • 100. Dealing with bad workers • Pay for “bad” work instead of rejecting it? – Pro: preserve reputation, admit if poor design at fault – Con: promote fraud, undermine approval rating system • Use bonus as incentive – Pay the minimum $0.01 and $0.01 for bonus – Better than rejecting a $0.02 task • If spammer “caught”, block from future tasks – May be easier to always pay, then block as needed July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 100
  • 101. Worker feedback • Real feedback received via email after rejection • Worker XXX I did. If you read these articles most of them have nothing to do with space programs. I’m not an idiot. • Worker XXX As far as I remember there wasn't an explanation about what to do when there is no name in the text. I believe I did write a few comments on that, too. So I think you're being unfair rejecting my HITs. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 101
  • 102. Real email exchange with worker after rejection WORKER: this is not fair , you made me work for 10 cents and i lost my 30 minutes of time ,power and lot more and gave me 2 rejections atleast you may keep it pending. please show some respect to turkers REQUESTER: I'm sorry about the rejection. However, in the directions given in the hit, we have the following instructions: IN ORDER TO GET PAID, you must judge all 5 webpages below *AND* complete a minimum of three HITs. Unfortunately, because you only completed two hits, we had to reject those hits. We do this because we need a certain amount of data on which to make decisions about judgment quality. I'm sorry if this caused any distress. Feel free to contact me if you have any additional questions or concerns. WORKER: I understood the problems. At that time my kid was crying and i went to look after. that's why i responded like that. I was very much worried about a hit being rejected. The real fact is that i haven't seen that instructions of 5 web page and started doing as i do the dolores labs hit, then someone called me and i went to attend that call. sorry for that and thanks for your kind concern. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 102
  • 103. Exchange with worker • Worker XXX Thank you. I will post positive feedback for you at Turker Nation. Me: was this a sarcastic comment? • I took a chance by accepting some of your HITs to see if you were a trustworthy author. My experience with you has been favorable so I will put in a good word for you on that website. This will help you get higher quality applicants in the future, which will provide higher quality work, which might be worth more to you, which hopefully means higher HIT amounts in the future. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 103
  • 104. Build Your Reputation as a Requestor • Word of mouth effect – Workers trust the requester (pay on time, clear explanation if there is a rejection) – Experiments tend to go faster – Announce forthcoming tasks (e.g. tweet) • Disclose your real identity? July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 104
  • 105. Other practical tips • Sign up as worker and do some HITs • “Eat your own dog food” • Monitor discussion forums • Address feedback (e.g., poor guidelines, payments, passing grade, etc.) • Everything counts! – Overall design only as strong as weakest link July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 105
  • 106. Content quality • People like to work on things that they like • TREC ad-hoc vs. INEX – TREC experiments took twice as long to complete – INEX (Wikipedia), TREC (LA Times, FBIS) • Topics – INEX: Olympic games, movies, salad recipes, etc. – TREC: cosmic events, Schengen agreement, etc. • Content and judgments according to modern times – Airport security docs are pre 9/11 – Antarctic exploration (global warming) July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 106
  • 107. Content quality - II • Document length • Randomize content • Avoid worker fatigue – Judging 100 documents on the same subject can be tiring, leading to decreasing quality July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 107
  • 108. Presentation • People scan documents for relevance cues • Document design • Highlighting no more than 10% July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 108
  • 109. Presentation - II July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 109
  • 110. Relevance justification • Why settle for a label? • Let workers justify answers • INEX – 22% of assignments with comments • Must be optional • Let’s see how people justify July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 110
  • 111. “Relevant” answers [Salad Recipes] Doesn't mention the word 'salad', but the recipe is one that could be considered a salad, or a salad topping, or a sandwich spread. Egg salad recipe Egg salad recipe is discussed. History of salad cream is discussed. Includes salad recipe It has information about salad recipes. Potato Salad Potato salad recipes are listed. Recipe for a salad dressing. Salad Recipes are discussed. Salad cream is discussed. Salad info and recipe The article contains a salad recipe. The article discusses methods of making potato salad. The recipe is for a dressing for a salad, so the information is somewhat narrow for the topic but is still potentially relevant for a researcher. This article describes a specific salad. Although it does not list a specific recipe, it does contain information relevant to the search topic. gives a recipe for tuna salad relevant for tuna salad recipes relevant to salad recipes this is on-topic for salad recipes July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 111
  • 112. “Not relevant” answers [Salad Recipes] About gaming not salad recipes. Article is about Norway. Article is about Region Codes. Article is about forests. Article is about geography. Document is about forest and trees. Has nothing to do with salad or recipes. Not a salad recipe Not about recipes Not about salad recipes There is no recipe, just a comment on how salads fit into meal formats. There is nothing mentioned about salads. While dressings should be mentioned with salads, this is an article on one specific type of dressing, no recipe for salads. article about a swiss tv show completely off-topic for salad recipes not a salad recipe not about salad recipes totally off base July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 112
  • 113. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 113
  • 114. Feedback length • Workers will justify answers • Has to be optional for good feedback • In E51, mandatory comments – Length dropped – “Relevant” or “Not Relevant” July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 114
  • 115. Other design principles • Text alignment • Legibility • Reading level: complexity of words and sentences • Attractiveness (worker’s attention & enjoyment) • Multi-cultural / multi-lingual • Who is the audience (e.g. target worker community) – Special needs communities (e.g. simple color blindness) • Parsimony • Cognitive load: mental rigor needed to perform task • Exposure effect July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 115
  • 116. Platform alternatives • Why MTurk – Amazon brand, lots of research papers – Speed, price, diversity, payments • Why not – Crowdsourcing != MTurk – Spam, no analytics, must build tools for worker & task quality • How to build your own crowdsourcing platform – Back-end – Template language for creating experiments – Scheduler – Payments? July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 116
  • 117. The human side • As a worker – I hate when instructions are not clear – I’m not a spammer – I just don’t get what you want – Boring task – A good pay is ideal but not the only condition for engagement • As a requester – Attrition – Balancing act: a task that would produce the right results and is appealing to workers – I want your honest answer for the task – I want qualified workers; system should do some of that for me • Managing crowds and tasks is a daily activity – more difficult than managing computers July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 117
  • 118. Things that work • Qualification tests • Honey-pots • Good content and good presentation • Economy of attention • Things to improve – Manage workers at different levels of expertise, including spammers and borderline cases. – Mix different pools of workers based on different profiles and expertise levels. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 118
  • 119. Things that need work • UX and guidelines – Help the worker – Cost of interaction • Scheduling and refresh rate • Exposure effect • Sometimes we just don’t agree • How crowdsourcable is your task July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 119
  • 120. V FROM LABELING TO HUMAN COMPUTATION July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 120
  • 121. The Turing Test (Alan Turing, 1950) July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 121
  • 122. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 122
  • 123. The Turing Test (Alan Turing, 1950) July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 123
  • 124. What is a Computer? July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 124
  • 125. • What was old becomes new • “Crowdsourcing: A New Branch of Computer Science” (March 29, 2011) • D. A. Grier, When Computers Were Human, Princeton University Press, 2005 July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 125
  • 126. Davis et al. (2010) The HPU. HPU July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 126
  • 127. Human Computation Rebirth of people as ‘computists’; people do tasks computers cannot (do well) Stage 1: Detecting robots – CAPTCHA: Completely Automated Public Turing test to tell Computers and Humans Apart – No useful work produced; people just answer questions with known answers Stage 2: Labeling data (at scale) – E.g. ESP game, typical use of MTurk – Game changer for AI: starving for data Stage 3: General “human computation” (HPU) – people do arbitrarily sophisticated tasks (i.e. compute arbitrary functions) – HPU as core component in system architecture, many “HPC” invocations – blend HPU with automation for a new class of hybrid applications – New tradeoffs possible in latency/cost vs. functionality/accuracy L. von Ahn has pioneered the field. See bibliography for examples of his work. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 127
  • 128. Mobile Phone App: “Amazon Remembers” July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 128
  • 129. ReCaptcha L. von Ahn et al. (2008). In Science. Harnesses human work as invisible by-product. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 129
  • 130. CrowdSearch and mCrowd • T. Yan, MobiSys 2010 July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 130
  • 131. Soylent: A Word Processor with a Crowd Inside • Bernstein et al., UIST 2010 July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 131
  • 132. Translation by monolingual speakers • C. Hu, CHI 2009 July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 132
  • 133. fold.it • S. Cooper et al. (2010). July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 133
  • 134. VI WORKER INCENTIVES July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 134
  • 135. Worker Incentives • Pay ($$$) • Fun (or avoid boredom) • Socialize • Earn acclaim/prestige • Altruism • Learn something new (e.g. English) • Unintended by-product (e.g. re-Captcha) • Create self-serving resource (e.g. Wikipedia) Multiple incentives are typically at work in parallel July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 135
  • 136. Pay ($$$) P. Ipeirotis March 2010 July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 136
  • 137. Pay ($$$) • Pro – Ready marketplaces (e.g. MTurk, CrowdFlower, …) – Less need for creativity – Simple motivation knob • Con: quality control required – Can diminish intrinsic rewards that promote quality: fun/altruistic value of task, taking pride in doing quality work, self-assessment – Can attract workers only interested in the pay, fraud – $$$ (though other schemes cost indirectly) • How much to pay? – Mason & Watts 2009: more $ = more work, not better work – Wang et al. 2011: predict from market? – More later… • Zittrain 2010: if Encarta had paid for contributions, would we have Wikipedia? July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 137
  • 138. Fun (or avoid boredom) • Games with a Purpose (von Ahn) – Data is by-product – IR: Law et al. SearchWar. HCOMP 2009. • distinct from Serious Gaming / Edutainment – Player learning / training / education is by-product July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 138
  • 139. Learning to map from web pages to queries • Human computation game to elicit data • Home grown system (no AMT) • Try it! pagehunt.msrlivelabs.com See also: • H. Ma. et al. “Improving Search Engines Using Human Computation Games”, CIKM 2009. • Law et al. SearchWar. HCOMP 2009. • Bennett et al. Picture This. HCOMP 2009. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 139
  • 140. Fun (or avoid boredom) • Pro: – Enjoyable “work” people want to do (or at least better than anything else they have to do) – Scalability potential from involving non-workers • Con: – Need for design creativity • some would say this is a plus • better performance in game should produce better/more work – Some tasks more amenable than others • Annotating syntactic parse trees for fun? • Inferring syntax implicitly from a different activity? July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 140
  • 141. Socialization & Prestige July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 141
  • 142. Socialization & Prestige • Pro: – “free” – enjoyable for connecting with one another – can share infrastructure across tasks • Con: – need infrastructure beyond simple micro-task – need critical mass (for uptake and reward) – social engineering knob more complex than $$ July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 142
  • 143. Altruism • Contribute knowledge • Help others (who need knowledge) • Help workers (e.g. SamaSource) • Charity (e.g. http://www.freerice.com) July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 143
  • 144. Altruism • Pro – “free” – can motivate quality work for a cause • Con – Seemingly small workforce for pure altruism What if Mechanical Turk let you donate $$ per HIT? July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 144
  • 145. Unintended by-product • Pro – effortless (unnoticed) work – Scalability from involving non-workers • Con – Design challenge • Given existing activity, find useful work to harness from it • Given target work, find or create another activity for which target work is by-product? – Maybe too invisible (disclosure, manipulation) July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 145
  • 146. Multiple Incentives • Ideally maximize all • Wikipedia, cQA, Gwap – fun, socialization, prestige, altruism • Fun vs. Pay – gwap gives Amazon certificates – Workers may be paid in game currency – Pay tasks can also be fun themselves • Pay-based – Other rewards: e.g. learn something, socialization – altruism: worker (e.g. SamaSource) or task itself – social network integration could help everyone (currently separate and lacking structure) July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 146
  • 147. VII THE ROAD AHEAD July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 147
  • 148. Wisdom of Crowds (WoC) Requires • Diversity • Independence • Decentralization • Aggregation Input: large, diverse sample (to increase likelihood of overall pool quality) Output: consensus or selection (aggregation) July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 148
  • 149. WoC vs. Ensemble Learning • Combine multiple models to improve performance over any constituent model – Can use many weak learners to make a strong one – Compensate for poor models with extra computation • Works better with diverse, independent learners • cf. NIPS’10 Workshop – Computational Social Science & the Wisdom of Crowds • More investigation needed of traditional feature-based machine learning & ensemble methods for consensus labeling with crowdsourcing July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 149
  • 150. Unreasonable Effectiveness of Data • Massive free Web data changed how we train learning systems – Banko and Brill (2001). Human Language Tech. – Halevy et al. (2009). IEEE Intelligent Systems. • How might access to cheap & plentiful labeled data change the balance again? July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 150
  • 151. MapReduce with human computation • Commonalities – Large task divided into smaller sub-problems – Work distributed among worker nodes (workers) – Collect all answers and combine them – Varying performance of heterogeneous CPUs/HPUs • Variations – Human response latency / size of “cluster” – Some tasks are not suitable July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 151
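To make the analogy concrete, the pattern can be sketched as ordinary map/reduce code in which the "map" step is answered by a person rather than a CPU. Everything below (the ask_crowd stub and function names) is hypothetical glue for illustration, not a real API; in practice the map step would publish HITs and block, with human-scale latency, until enough answers arrive.

    # Conceptual sketch of MapReduce over human computation (HPU).
    # ask_crowd() is a hypothetical stand-in for "publish a HIT, wait, collect".
    from collections import Counter

    def ask_crowd(subtask):
        # A real system would publish a HIT here and wait (with high latency)
        # for worker responses; this stub just simulates a single answer.
        return {"document": subtask, "label": "relevant"}

    def hpu_map(task_items):
        # Map: distribute small, independent sub-problems to workers.
        return [ask_crowd(item) for item in task_items]

    def hpu_reduce(worker_outputs):
        # Reduce: combine the collected answers (here, a simple tally).
        return Counter(o["label"] for o in worker_outputs)

    documents = ["doc-%d" % i for i in range(10)]
    print(hpu_reduce(hpu_map(documents)))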
  • 152. CrowdForge: MapReduce for Automation + Human Computation Kittur et al., CHI 2011 July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 152
  • 153. Research problems – operational • Methodology – Budget, people, document, queries, presentation, incentives, etc. – Scheduling – Quality • What’s the best “mix” of HC for a task? • What are the tasks suitable for HC? • Can I crowdsource my task? – Eickhoff and de Vries, WSDM 2011 CSDM Workshop July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 153
  • 154. More problems • Human factors vs. outcomes • Editors vs. workers • Pricing tasks • Predicting worker quality from observable properties (e.g. task completion time) • HIT / Requestor ranking or recommendation • Expert search : who are the right workers given task nature and constraints • Ensemble methods for Crowd Wisdom consensus July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 154
  • 155. Problems: crowds, clouds and algorithms • Infrastructure – Current platforms are very rudimentary – No tools for data analysis • Dealing with uncertainty (propagate rather than mask) – Temporal and labeling uncertainty – Learning algorithms – Search evaluation – Active learning (which example is likely to be labeled correctly) • Combining CPU + HPU – Human Remote Call? – Procedural vs. declarative? – Integration points with enterprise systems July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 155
  • 156. Conclusions • Crowdsourcing for relevance evaluation works • Fast turnaround, easy to experiment, cheap • Still have to design the experiments carefully! • Usability considerations • Worker quality • User feedback extremely useful July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 156
  • 157. Conclusions - II • Crowdsourcing is here to stay • Lots of opportunities to improve current platforms • Integration with current systems • MTurk is a popular platform and others are emerging • Open research problems July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 157
  • 158. VIII RESOURCES AND REFERENCES July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 158
  • 159. Little written for the general public • July 2010, kindle-only • “This book introduces you to the top crowdsourcing sites and outlines step by step with photos the exact process to get started as a requester on Amazon Mechanical Turk.“ July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 159
  • 160. Crowdsourcing @ SIGIR’11  Workshop on Crowdsourcing for Information Retrieval  Roi Blanco, Harry Halpin, Daniel Herzig, Peter Mika, Jeffrey Pound, Henry Thompson, Thanh D. Tran. “Repeatable and Reliable Search System Evaluation using Crowd-Sourcing”.  Yen-Ta Huang, An-Jung Cheng, Liang-Chi Hsieh, Winston H. Hsu, Kuo-Wei Chang. “Region-Based Landmark Discovery by Crowdsourcing Geo-Referenced Photos.” Poster.  Gabriella Kazai, Jaap Kamps, Marijn Koolen, Natasa Milic-Frayling. “Crowdsourcing for Book Search Evaluation: Impact of Quality on Comparative System Ranking.”  Abhimanu Kumar, Matthew Lease . “Learning to Rank From a Noisy Crowd”. Poster.  Edith Law, Paul N. Bennett, and Eric Horvitz. “The Effects of Choice in Routing Relevance Judgments”. Poster. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 160
  • 161. 2011 Workshops & Conferences SIGIR-CIR: Workshop on Crowdsourcing for Information Retrieval (July 28) • WSDM-CSDM: Crowdsourcing for Search and Data Mining (Feb. 9) • CHI-CHC: Crowdsourcing and Human Computation (May 8) • Crowdsourcing: Improving … Scientific Data Through Social Networking (June 13) • Crowdsourcing Technologies for Language and Cognition Studies (July 27) • 2011 AAAI-HCOMP: 3rd Human Computation Workshop (Aug. 8) • UbiComp: 2nd Workshop on Ubiquitous Crowdsourcing (Sep. 18) • CIKM: BooksOnline (Oct. 24, “crowdsourcing … online books”) • CrowdConf 2011 -- 2nd Conf. on the Future of Distributed Work (Nov. 1-2) • TREC-Crowd: Crowdsourcing Track at TREC (Nov. 16-18) • ACIS: Crowdsourcing, Value Co-Creation, & Digital Economy Innovation (Nov. 30 – Dec. 2) July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 161
  • 162. 2011 Tutorials • SIGIR (yep, this is it!) • WSDM: Crowdsourcing 101: Putting the WSDM of Crowds to Work for You – Omar Alonso and Matthew Lease (Feb. 9) • WWW: Managing Crowdsourced Human Computation – Panos Ipeirotis and Praveen Paritosh (March 29) • HCIC: Quality Crowdsourcing for Human Computer Interaction Research – Ed Chi (June 14-18) – Also see Chi’s Crowdsourcing for HCI Research with Amazon Mechanical Turk • AAAI: Human Computation: Core Research Questions and State of the Art – Edith Law and Luis von Ahn (Aug. 7) • VLDB: Crowdsourcing Applications and Platforms – AnHai Doan, Michael Franklin, Donald Kossmann, and Tim Kraska (Aug. 29) • CrowdConf: Crowdsourcing for Fun and Profit – Omar Alonso and Matthew Lease (Nov. 1) July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 162
  • 163. Other Events & Resources ir.ischool.utexas.edu/crowd 2011 book: Omar Alonso, Gabriella Kazai, and Stefano Mizzaro. Crowdsourcing for Search Engine Evaluation: Why and How. Forthcoming special issue of Springer’s Information Retrieval journal on Crowdsourcing July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 163
  • 164. Thank You! For questions about tutorial or crowdsourcing, email: omar.alonso@microsoft.com ml@ischool.utexas.edu Cartoons by Mateo Burtch (buta@sonic.net) July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 164
  • 165. Crowdsourcing in IR: 2008-2010  2008  O. Alonso, D. Rose, and B. Stewart. “Crowdsourcing for relevance evaluation”, SIGIR Forum, Vol. 42, No. 2.  2009  O. Alonso and S. Mizzaro. “Can we get rid of TREC Assessors? Using Mechanical Turk for … Assessment”. SIGIR Workshop on the Future of IR Evaluation.  P.N. Bennett, D.M. Chickering, A. Mityagin. Learning Consensus Opinion: Mining Data from a Labeling Game. WWW.  G. Kazai, N. Milic-Frayling, and J. Costello. “Towards Methods for the Collective Gathering and Quality Control of Relevance Assessments”, SIGIR.  G. Kazai and N. Milic-Frayling. “… Quality of Relevance Assessments Collected through Crowdsourcing”. SIGIR Workshop on the Future of IR Evaluation.  Law et al. “SearchWar”. HCOMP.  H. Ma, R. Chandrasekar, C. Quirk, and A. Gupta. “Improving Search Engines Using Human Computation Games”, CIKM 2009.  2010  SIGIR Workshop on Crowdsourcing for Search Evaluation.  O. Alonso, R. Schenkel, and M. Theobald. “Crowdsourcing Assessments for XML Ranked Retrieval”, ECIR.  K. Berberich, S. Bedathur, O. Alonso, G. Weikum “A Language Modeling Approach for Temporal Information Needs”, ECIR.  C. Grady and M. Lease. “Crowdsourcing Document Relevance Assessment with Mechanical Turk”. NAACL HLT Workshop on … Amazon's Mechanical Turk.  Grace Hui Yang, Anton Mityagin, Krysta M. Svore, and Sergey Markov . “Collecting High Quality Overlapping Labels at Low Cost”. SIGIR.  G. Kazai. “An Exploration of the Influence that Task Parameters Have on the Performance of Crowds”. CrowdConf.  G. Kazai. “… Crowdsourcing in Building an Evaluation Platform for Searching Collections of Digitized Books”., Workshop on Very Large Digital Libraries (VLDL)  Stephanie Nowak and Stefan Ruger. How Reliable are Annotations via Crowdsourcing? MIR.  Jean-François Paiement, Dr. James G. Shanahan, and Remi Zajac. “Crowdsourcing Local Search Relevance”. CrowdConf.  Maria Stone and Omar Alonso. “A Comparison of On-Demand Workforce with Trained Judges for Web Search Relevance Evaluation”. CrowdConf.  T. Yan, V. Kumar, and D. Ganesan. CrowdSearch: exploiting crowds for accurate real-time image search on mobile phones. MobiSys pp. 77--90, 2010. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 165
  • 166. Crowdsourcing in IR: 2011  WSDM Workshop on Crowdsourcing for Search and Data Mining.  SIGIR Workshop on Crowdsourcing for Information Retrieval.  SIGIR papers/posters mentioned earlier  O. Alonso and R. Baeza-Yates. “Design and Implementation of Relevance Assessments using Crowdsourcing”, ECIR.  G. Kasneci, J. Van Gael, D. Stern, and T. Graepel, CoBayes: Bayesian Knowledge Corroboration with Assessors of Unknown Areas of Expertise, WSDM.  Hyun Joon Jung, Matthew Lease. “Improving Consensus Accuracy via Z-score and Weighted Voting”. HCOMP. Poster.  Gabriella Kazai. “In Search of Quality in Crowdsourcing for Search Engine Evaluation”, ECIR. July 24, 2011 Crowdsourcing for Information Retrieval: Principles, Methods, and Applications 166
• 167. Bibliography: General IR
  – M. Hearst. "Search User Interfaces", Cambridge University Press, 2009.
  – K. Järvelin and J. Kekäläinen. "IR evaluation methods for retrieving highly relevant documents". Proceedings of the 23rd Annual International ACM SIGIR Conference, pp. 41-48, 2000.
  – M. Kaisser, M. Hearst, and L. Lowe. "Improving Search Results Quality by Customizing Summary Lengths", ACL/HLT, 2008.
  – D. Kelly. "Methods for evaluating interactive information retrieval systems with users". Foundations and Trends in Information Retrieval, 3(1-2), 1-224, 2009.
  – S. Mizzaro. "Measuring the agreement among relevance judges", MIRA, 1999.
  – J. Tang and M. Sanderson. "Evaluation and User Preference Study on Spatial Diversity", ECIR, 2010.
• 168. Bibliography: Other
  – J. Barr and L. Cabrera. "AI Gets a Brain", ACM Queue, May 2006.
  – M. Bernstein et al. "Soylent: A Word Processor with a Crowd Inside". UIST 2010. Best Student Paper award.
  – B.B. Bederson, C. Hu, and P. Resnik. "Translation by Iterative Collaboration between Monolingual Users", Proceedings of Graphics Interface (GI 2010), 39-46.
  – N. Bradburn, S. Sudman, and B. Wansink. "Asking Questions: The Definitive Guide to Questionnaire Design", Jossey-Bass, 2004.
  – C. Callison-Burch. "Fast, Cheap, and Creative: Evaluating Translation Quality Using Amazon's Mechanical Turk", EMNLP 2009.
  – P. Dai, Mausam, and D. Weld. "Decision-Theoretic Control of Crowd-Sourced Workflows", AAAI, 2010.
  – J. Davis et al. "The HPU", IEEE Computer Vision and Pattern Recognition Workshop on Advancing Computer Vision with Humans in the Loop (ACVHL), June 2010.
  – M. Gashler, C. Giraud-Carrier, and T. Martinez. "Decision Tree Ensemble: Small Heterogeneous Is Better Than Large Homogeneous", ICMLA 2008.
  – D. A. Grier. "When Computers Were Human". Princeton University Press, 2005. ISBN 0691091579.
  – S. Hacker and L. von Ahn. "Matchin: Eliciting User Preferences with an Online Game", CHI 2009.
  – J. Heer and M. Bostock. "Crowdsourcing Graphical Perception: Using Mechanical Turk to Assess Visualization Design", CHI 2010.
  – P. Heymann and H. Garcia-Molina. "Human Processing", Technical Report, Stanford InfoLab, 2010.
  – J. Howe. "Crowdsourcing: Why the Power of the Crowd Is Driving the Future of Business". Crown Business, New York, 2008.
  – P. Hsueh, P. Melville, and V. Sindhwani. "Data Quality from Crowdsourcing: A Study of Annotation Selection Criteria". NAACL HLT Workshop on Active Learning and NLP, 2009.
  – B. Huberman, D. Romero, and F. Wu. "Crowdsourcing, attention and productivity". Journal of Information Science, 2009.
  – P.G. Ipeirotis. "The New Demographics of Mechanical Turk". March 9, 2010. PDF and spreadsheet.
  – P.G. Ipeirotis, R. Chandrasekar, and P. Bennett. "Report on the Human Computation Workshop". SIGKDD Explorations, Vol. 11, No. 2, pp. 80-83, 2010.
  – P.G. Ipeirotis. "Analyzing the Amazon Mechanical Turk Marketplace". CeDER-10-04 (Sept. 11, 2010).
• 169. Bibliography: Other (2)
  – A. Kittur, E. Chi, and B. Suh. "Crowdsourcing User Studies with Mechanical Turk", SIGCHI 2008.
  – Aniket Kittur, Boris Smus, and Robert E. Kraut. "CrowdForge: Crowdsourcing Complex Work". CHI 2011.
  – Adriana Kovashka and Matthew Lease. "Human and Machine Detection of … Similarity in Art". CrowdConf 2010.
  – K. Krippendorff. "Content Analysis", Sage Publications, 2003.
  – G. Little, L. Chilton, M. Goldman, and R. Miller. "TurKit: Tools for Iterative Tasks on Mechanical Turk", HCOMP 2009.
  – T. Malone, R. Laubacher, and C. Dellarocas. "Harnessing Crowds: Mapping the Genome of Collective Intelligence", 2009.
  – W. Mason and D. Watts. "Financial Incentives and the 'Performance of Crowds'", HCOMP Workshop at KDD 2009.
  – J. Nielsen. "Usability Engineering", Morgan Kaufmann, 1994.
  – A. Quinn and B. Bederson. "A Taxonomy of Distributed Human Computation", Technical Report HCIL-2009-23, 2009.
  – J. Ross, L. Irani, M. Six Silberman, A. Zaldivar, and B. Tomlinson. "Who are the Crowdworkers?: Shifting Demographics in Amazon Mechanical Turk". CHI 2010.
  – F. Scheuren. "What is a Survey?" (http://www.whatisasurvey.info), 2004.
  – R. Snow, B. O'Connor, D. Jurafsky, and A. Y. Ng. "Cheap and Fast - But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks". EMNLP 2008.
  – V. Sheng, F. Provost, and P. Ipeirotis. "Get Another Label? Improving Data Quality … Using Multiple, Noisy Labelers", KDD 2008.
  – S. Weber. "The Success of Open Source", Harvard University Press, 2004.
  – L. von Ahn. "Games with a Purpose". Computer, 39(6), 92-94, 2006.
  – L. von Ahn and L. Dabbish. "Designing Games with a Purpose". CACM, Vol. 51, No. 8, 2008.
• 170. Bibliography: Other (3)
  – C. Marshall and F. Shipman. "The Ownership and Reuse of Visual Media", JCDL, 2011.
  – AnHai Doan, Raghu Ramakrishnan, and Alon Y. Halevy. "Crowdsourcing Systems on the World-Wide Web". CACM, 2011.
  – Paul Heymann and Hector Garcia-Molina. "Turkalytics: Analytics for Human Computation". WWW 2011.
• 171. Other Resources
  Blogs
  – Behind Enemy Lines (P.G. Ipeirotis, NYU)
  – Deneme: a Mechanical Turk experiments blog (Greg Little, MIT)
  – CrowdFlower Blog
  – http://experimentalturk.wordpress.com
  – Jeff Howe
  Sites
  – The Crowdsortium
  – Crowdsourcing.org
  – CrowdsourceBase (for workers)
  – Daily Crowdsource
  MTurk Forums and Resources
  – Turker Nation: http://turkers.proboards.com
  – http://www.turkalert.com (and its blog)
  – Turkopticon: report/avoid shady requesters
  – Amazon Forum for MTurk