Crowd Computing:
Opportunities and Challenges

                         Matt Lease
                   School of Information
                 University of Texas at Austin
                   ml@ischool.utexas.edu
                         @mattlease
Amazon Mechanical Turk (MTurk)




• Marketplace for crowd labor (microtasks)
• Created in 2005 (still in “beta”)
• On-demand, scalable, 24/7 global workforce

Labeling Data (“Gold Rush”)




Snow et al. (EMNLP 2008)
• MTurk annotation for 5 Tasks
  – Affect recognition
  – Word similarity
  – Recognizing textual entailment
  – Event temporal ordering
  – Word sense disambiguation
• 22K labels for US $26
• High agreement between
  consensus labels and
  gold-standard labels
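A minimal sketch of the aggregation behind results like these: collect
redundant labels per item, take a majority vote, and measure agreement
against gold labels. Illustrative only (hypothetical data, not Snow et
al.'s exact pipeline):

    # Majority-vote consensus over redundant crowd labels (toy example).
    from collections import Counter

    def majority_vote(labels):
        """Return the most common label; ties broken arbitrarily."""
        return Counter(labels).most_common(1)[0][0]

    # worker_labels[item] = labels collected from different workers
    worker_labels = {
        "item1": ["pos", "pos", "neg"],
        "item2": ["neg", "neg", "neg"],
        "item3": ["pos", "neg", "pos"],
    }
    gold = {"item1": "pos", "item2": "neg", "item3": "neg"}  # expert labels

    consensus = {item: majority_vote(ls) for item, ls in worker_labels.items()}
    agreement = sum(consensus[i] == gold[i] for i in gold) / len(gold)
    print(f"Consensus/gold agreement: {agreement:.2f}")  # 0.67 on this toy data
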
Alonso et al. (SIGIR Forum 2008)
• MTurk for Information Retrieval (IR)
  – Judge relevance of search engine results
• Many follow-on studies (design, quality, cost)




Sorokin & Forsyth (CVPR 2008)
• MTurk for Computer Vision
• 4K labels for US $60




Studying People




Kittur, Chi, & Suh (CHI 2008)

• MTurk for User Studies

• “…make creating believable invalid responses as
  effortful as completing the task in good faith.”




Jane saw the man with the binoculars
(a classic example of syntactic ambiguity: who has the binoculars?)




Social & Behavioral Sciences
• A Guide to Behavioral Experiments
  on Mechanical Turk
   – W. Mason and S. Suri (2010). SSRN online.
• Crowdsourcing for Human Subjects Research
   – L. Schmidt (CrowdConf 2010)
• Crowdsourcing Content Analysis for Behavioral Research:
  Insights from Mechanical Turk
   – Conley & Tosti-Kharas (2010). Academy of Management
• Amazon's Mechanical Turk : A New Source of
  Inexpensive, Yet High-Quality, Data?
   – M. Buhrmester et al. (2011). Perspectives… 6(1):3-5.
   – see also: Amazon Mechanical Turk Guide for Social Scientists
Studying Interactive Systems




Remote Usability Testing
• D. Liu, R. Kuipers, M. Lease, and R. Bias (in preparation)
• On-site vs. crowdsourced usability testing
• Advantages
   –   More participants
   –   More diverse participants
   –   High speed
   –   Low cost
• Disadvantages
   –   Lower-quality feedback
   –   Less interaction
   –   Greater need for quality control
   –   Less focused user groups
Beyond MTurk




ESP Game (Games With a Purpose)
L. von Ahn and L. Dabbish (2004)




reCAPTCHA




L. von Ahn et al. (2008). In Science.
Human Sensing and Monitoring
• Sullivan et al. (2009). Biological Conservation 142(10)
• Keynote by Steve Kelling at ASIS&T 2011




Human Computation




The Mechanical Turk




The original, constructed and
unveiled in 1770 by Wolfgang
von Kempelen (1734–1804)

           J. Pontin. Artificial Intelligence, With Help From
           the Humans. New York Times (March 25, 2007)
The Turing Test (Alan Turing, 1950)




What is a Computer?




• What was old is new

• Crowdsourcing: A New
  Branch of Computer Science
  – D.A. Grier, March 29, 2011

• Tabulating the heavens:
  computing the Nautical
  Almanac in 18th-century
  England
  – M. Croarken (2003)

(Pictured: D. A. Grier, When Computers Were
Human. Princeton University Press, 2005)
The Human Processing Unit (HPU)
• Davis et al. (2010)




Human Computation
• Luis von Ahn (2005)
• Investigates using people to perform
  computations that remain difficult for current
  automated methods
• Explores the metaphor of computation for
  characterizing attributes, capabilities, and
  limitations of human performance in
  executing desired tasks
• Having people do stuff (instead of computers)
Blending Automation &
 Human Computation




“Amazon Remembers”




Ethics Checking: The Next Frontier?
• Mark Johnson’s address at ACL 2003
  – Transcript in Conduit 12(2) 2003


• Think how useful a little “ethics checker and
  corrector” program integrated into a word
  processor could be!



Soylent: A Word Processor with a Crowd Inside

 • Bernstein et al., UIST 2010




Translation by monolingual speakers
• C. Hu, CHI 2009




fold.it
S. Cooper et al. (2010)




Alice G. Walton. Online Gamers Help Solve Mystery of
Critical AIDS Virus Enzyme. The Atlantic, October 8, 2011.
CrowdSearch and mCrowd
• T. Yan, MobiSys 2010




VizWiz
Bigham et al. (UIST 2010)




Crowdsourcing
• Jeff Howe. Wired, June 2006.
• Take a job traditionally
  performed by a known agent
  (often an employee)
• Outsource it to an undefined,
  generally large group of
  people via an open call
• New application of principles
  from open source movement
What is Crowdsourcing?
• A collection of mechanisms and associated
  methodologies for scaling and directing crowd
  activities to achieve some goal(s)
• Enabled by Internet connectivity
• Many related areas
  – Human computation
  – Collective intelligence
  – Social computing
  – People services
Crowdsourcing Key Questions
• What are the goals?
  – Purposeful directing of human activity

• How can you incentivize participation?
  – Incentive engineering
  – Who is your intended crowd?

• Which model(s) are most appropriate?
  – How to adapt them to your context and goals?

What about sensitive data?
• Not all data can be publicly disclosed
  – User data (e.g. AOL query log, Netflix ratings)
  – Intellectual property
  – Legal confidentiality
• Need to restrict who is in your crowd
  – Separate channel (workforce) from technology
  – Hot question for adoption at enterprise level



What about the law?
• Wolfson & Lease (ASIS&T 2011)
• As usual, technology is ahead of the law
  – employment law
  – patent inventorship
  – data security and the Federal Trade Commission
  – copyright ownership
  – securities regulation of crowdfunding
• Take-away: don’t panic, but be mindful
  – Understand risks of “just-in-time compliance”

Nature of Micro-tasks
• Small, simple tasks which can be completed
  without extraneous detail or context
  – e.g. “Can you name who is in this photo?”




• A variety of current research investigates how
  to decompose complex tasks into simpler ones
  (a toy sketch follows below)
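A toy sketch of one such decomposition (hypothetical helper and field
names, not any particular system): split a long proofreading job into
small, self-contained passages that need no surrounding context.

    # Decompose a long document into independent proofreading microtasks.
    def decompose(document, max_words=50):
        tasks = []
        for i, paragraph in enumerate(document.split("\n\n")):
            words = paragraph.split()
            for j in range(0, len(words), max_words):
                tasks.append({
                    "task_id": f"para{i}-chunk{j // max_words}",
                    "instructions": "Fix any typos in this passage.",
                    "text": " ".join(words[j:j + max_words]),
                })
        return tasks

    doc = "First paragraph with a tpyo.\n\nSecond paragraph, also short."
    for t in decompose(doc):
        print(t["task_id"], "->", t["text"])
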
Context Matters




What about trust?
• Some reports of robot “workers” on MTurk
  – E.g. McCreadie et al. (2011)
  – Violates terms of service
• Why not just use a captcha?




Captcha Fraud




Requester Fraud on MTurk
“Do not do any HITs that involve: filling in
CAPTCHAs; secret shopping; test our web page;
test zip code; free trial; click my link; surveys or
quizzes (unless the requester is listed with a
smiley in the Hall of Fame/Shame); anything
that involves sending a text message; or
basically anything that asks for any personal
information at all—even your zip code. If you
feel in your gut it’s not on the level, IT’S NOT.
Why? Because they are scams...”
Who are the workers?


• A. Baio, November 2008. The Faces of Mechanical Turk.
• P. Ipeirotis. March 2010. The New Demographics of
  Mechanical Turk
• J. Ross, et al. Who are the Crowdworkers?... CHI 2010.
What about ethics?
• Silberman, Irani, and Ross (2010)
  – “How should we… conceptualize the role of these
    people who we ask to power our computing?”
  – Power dynamics between parties
     • What are the consequences for a worker
       when your actions harm their reputation?
  – “Abstraction hides detail”

• Fort, Adda, and Cohen (2011)
  – “…opportunities for our community to deliberately
    value ethics above cost savings.”
Davis et al. (2010) The HPU.




HPU: “Abstraction hides detail”
• Not just turning a mechanical crank




What about quality?




What about quality?
• Many papers on statistical methods for MTurk
  – Worker calibration, noise vs. bias, weighted voting
  – Checking consensus may discourage worker honesty
• Garbage in = garbage out
  – Only as strong as your weakest link (end-to-end)
  – Is your shiny statistical hammer what’s really needed?
  – Not all problems are technological
• Methods for consistent annotation still apply
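To make the weighted-voting idea above concrete, here is a minimal
sketch assuming per-worker accuracies are already known (one way to
estimate them is sketched a few slides later). The log-odds weights
follow from a Naive Bayes model of independent workers with symmetric
error rates; data and worker names are hypothetical:

    # Weighted voting: reliable workers count for more than noisy ones.
    import math
    from collections import defaultdict

    def weighted_vote(votes, accuracy):
        """votes: {worker: label}; accuracy: {worker: estimated accuracy}."""
        scores = defaultdict(float)
        for worker, label in votes.items():
            p = min(max(accuracy.get(worker, 0.5), 1e-6), 1 - 1e-6)
            scores[label] += math.log(p / (1 - p))  # log-odds weight
        return max(scores, key=scores.get)

    accuracy = {"w1": 0.95, "w2": 0.55, "w3": 0.55}
    votes = {"w1": "relevant", "w2": "not relevant", "w3": "not relevant"}
    # Unweighted majority says "not relevant"; weighting lets one highly
    # reliable worker outvote two near-random ones:
    print(weighted_vote(votes, accuracy))  # -> relevant
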
What about quality? (2)
• Human factors matter
  – Part of your experimental design (study & report)
  – Instructions, design, interface, interaction
  – Names, relationship, reputation (Klinger & Lease 2011)
  – Fair pay, hourly vs. per-task, recognition, advancement
  – For contrast, consider Kochhar (2010)
• How good is gold really?
  – Klebanov and Beigman (NAACL 2010)
  – Model uncertainty in ground truth
     • Training and evaluation with uncertain labels
      • Temporal and label uncertainty in active learning
Need for Benchmarks
• How well do different crowdsourcing methods
  perform on comparable data?
  – Shared datasets used to bring people together
• Common tasks to consider
  –   Translation
  –   Transcription
  –   Relevance Judging (search engines)
  –   Generation (resources, SEO, reputation)
  –   Verification (correct, copyright, appropriate)
• NIST TREC Crowdsourcing Track exploring this
  – Track will run for a 2nd year in 2012
The Road Ahead




Why Eytan Adar hates MTurk Research
      (CHI 2011 CHC Workshop)
• Overly-narrow focus on MTurk
  – Identify general vs. platform-specific problems
  – Academic vs. Industrial problems
• Inattention to prior work in other disciplines
  – Some problems well-studied in other areas
  – Human behavior hasn’t changed much
• Turks aren’t Martians
  – How many prior user studies need to be
    reproduced on MTurk before we believe it?
Many choices beyond MTurk
•   Clickworker
•   CloudCrowd
•   CloudFactory
•   CrowdSource
•   DoMyStuff
•   Humanoid
•   Microtask
•   MobileWorks
•   myGengo
•   SmartSheet
•   vWorker

Industry heavy-weights: Elance, Liveops, oDesk, uTest

And more: JobBoy, microWorkers, MiniFreelance, MiniJobz,
MinuteWorkers, MyEasyTask, OpTask, ShortTask, SimpleWorkers
Run-time automation for managing distributed crowd computing
• Kittur et al. (CHI 2011), CrowdForge




MapReduce for human computation?
•   Large task divided into smaller sub-problems
•   Job distributed among multiple workers
•   Collect all answers and combine them
•   Varying performance of heterogeneous
    CPUs/HPUs




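A minimal sketch of this pattern with humans standing in for CPUs.
Here post_microtask is a hypothetical placeholder for whatever platform
call posts a task and collects one answer; this is not CrowdForge's
actual API:

    # MapReduce with human workers: partition (machine), map (humans),
    # reduce (humans or machine).
    def post_microtask(instructions, data):
        # Placeholder: a real system would create a HIT and wait for
        # some worker to submit an answer.
        return f"<answer to {instructions!r} for {data!r}>"

    def human_map_reduce(task, partition, map_instructions, reduce_instructions):
        chunks = partition(task)                       # machine splits the job
        mapped = [post_microtask(map_instructions, c)  # one microtask per chunk,
                  for c in chunks]                     # runnable in parallel
        return post_microtask(reduce_instructions, mapped)  # combine answers

    article = "Long article text. More text. Even more text."
    summary = human_map_reduce(
        article,
        partition=lambda text: text.split(". "),       # naive chunking
        map_instructions="Summarize this passage in one sentence.",
        reduce_instructions="Merge these sentences into one coherent paragraph.",
    )
    print(summary)
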
Wisdom of Crowds Computing
Pre-conditions
• Diversity
• Independence
• Decentralization
• Aggregation

Input: large, diverse sample (increases
       likelihood of overall pool quality)
Output: consensus, selection, distribution

What about ensemble techniques and theory
we have for integrating multiple noisy models?
  Computational Social Science & the Wisdom of Crowds
  NIPS 2010-2011 workshops
• How to effectively compute in the crowd?
  – What are all the knobs and dials?
  – How to set them to navigate quality vs. cost vs. time?

• How do we design, implement, test, & maintain
  crowd computing systems?
• What new capabilities can such systems provide?
• How does cheaper / faster / easier / noisier data
  change the way we build intelligent systems?
  – Relative costs of all other activities increase
Unreasonable Effectiveness of Data
• Massive free Web data
  changed how we train
  learning systems
  – Banko and Brill (2001).
    Human Language Tech.
  – Halevy et al. (2009). IEEE
    Intelligent Systems.

 • How might access to cheap & plentiful
   labeled data change the balance again?
• Who is the right person for the job?
  – Requesting or inferring skills / experience
  – Interactive task selection or automatic routing
  – How to represent, measure, model, estimate, and
    utilize individual worker expertise/accuracy?
    (see the sketch below)
• How should crowd computing evolve from here?
  – What should next generation infrastructure provide?
  – What are the right programming constructs?
• With automation and human computation, who
  does what? Mixed-initiative thinking.

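One simple way to estimate per-worker accuracy, as promised above: seed
the task stream with items whose answers are already known, then score
each worker on those gold items. A toy sketch with hypothetical data;
the estimates can feed task routing or the weighted voting sketched
earlier:

    # Estimate worker accuracy from embedded gold questions.
    from collections import defaultdict

    gold = {"q7": "B", "q19": "A"}  # hidden gold items mixed into the stream

    # responses[(worker, item)] = label submitted by that worker
    responses = {
        ("w1", "q7"): "B", ("w1", "q19"): "A", ("w1", "q2"): "C",
        ("w2", "q7"): "A", ("w2", "q19"): "A", ("w2", "q5"): "B",
    }

    hits, seen = defaultdict(int), defaultdict(int)
    for (worker, item), label in responses.items():
        if item in gold:                    # only gold items are scored
            seen[worker] += 1
            hits[worker] += (label == gold[item])

    accuracy = {w: hits[w] / seen[w] for w in seen}
    print(accuracy)  # {'w1': 1.0, 'w2': 0.5}; feeds routing/weighted voting
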
Crowdsourcing in 2012
• Conferences and Workshops
   –   AAAI Symposium: Wisdom of the Crowd (March 26-28)
   –   Collective Intelligence (papers: Nov 18, date: April 18-20)
   –   Year 2 of TREC Crowdsourcing Track
   –   HComp and CrowdConf (details TBD)
• Journal Special Issues
   – Springer’s Information Retrieval:
     Crowdsourcing for Information Retrieval
   – Hindawi’s Advances in Multimedia Journal:
     Multimedia Semantics Analysis via Crowdsourcing Geocontext
   – IEEE Internet Computing: Crowdsourcing (Sept./Oct. 2012)
   – IEEE Transactions on Multimedia:
     Crowdsourcing in Multimedia (proposal in review)
• Places for News
   – Follow the Crowd
   – The Crowdsortium
2011 Tutorials and Keynotes
•   By Omar Alonso and/or Matthew Lease
     –   CLEF: Crowdsourcing for Information Retrieval Experimentation and Evaluation (Sep. 20, Omar only)
     –   CrowdConf (Nov. 1, this is it!)
     –   IJCNLP: Crowd Computing: Opportunities and Challenges (Nov. 10, Matt only)
     –   WSDM: Crowdsourcing 101: Putting the WSDM of Crowds to Work for You (Feb. 9)
     –   SIGIR: Crowdsourcing for Information Retrieval: Principles, Methods, and Applications (July 24)

•   AAAI: Human Computation: Core Research Questions and State of the Art
     –   Edith Law and Luis von Ahn, August 7
•   ASIS&T: How to Identify Ducks In Flight: A Crowdsourcing Approach to Biodiversity Research and
    Conservation
     –   Steve Kelling, October 10 (eBird)
•   EC: Conducting Behavioral Research Using Amazon's Mechanical Turk
     –   Winter Mason and Siddharth Suri, June 5
•   HCIC: Quality Crowdsourcing for Human Computer Interaction Research
     –   Ed Chi, June 14-18 (at HCIC)
     –   Also see his: Crowdsourcing for HCI Research with Amazon Mechanical Turk
•   Multimedia: Frontiers in Multimedia Search
     –   Alan Hanjalic and Martha Larson, Nov 28
•   VLDB: Crowdsourcing Applications and Platforms
     –   Anhai Doan, Michael Franklin, Donald Kossmann, and Tim Kraska
•   WWW: Managing Crowdsourced Human Computation
     –   Panos Ipeirotis and Praveen Paritosh

2011 Workshops & Conferences
•   AAAI-HCOMP: 3rd Human Computation Workshop (Aug. 8)
•   ACIS: Crowdsourcing, Value Co-Creation, & Digital Economy Innovation (Nov. 30 – Dec. 2)
•   Crowdsourcing Technologies for Language and Cognition Studies (July 27)
•   CHI-CHC: Crowdsourcing and Human Computation (May 8)
•   CIKM: BooksOnline (Oct. 24, “crowdsourcing … online books”)
•   CrowdConf 2011 -- 2nd Conf. on the Future of Distributed Work (Nov. 1-2)
•   Crowdsourcing: Improving … Scientific Data Through Social Networking (June 13)
•   EC: Workshop on Social Computing and User Generated Content (June 5)
•   ICWE: 2nd International Workshop on Enterprise Crowdsourcing (June 20)
•   Interspeech: Crowdsourcing for speech processing (August)
•   NIPS: Second Workshop on Computational Social Science and the Wisdom of Crowds (Dec. TBD)
•   SIGIR-CIR: Workshop on Crowdsourcing for Information Retrieval (July 28)
•   TREC-Crowd: Year 1 of TREC Crowdsourcing Track (Nov. 16-18)
•   UbiComp: 2nd Workshop on Ubiquitous Crowdsourcing (Sep. 18)
•   WSDM-CSDM: Crowdsourcing for Search and Data Mining (Feb. 9)
Recent Overview Papers
• Alex Quinn and Ben Bederson. Human Computation: A
  Survey and Taxonomy of a Growing Field. CHI 2011.
• Man-Ching Yuen, Irwin King, and Kwong-Sak Leung. A
  Survey of Crowdsourcing Systems. SocialCom 2011.
• Rajarshi Das and Maja Vukovic. Emerging theories and
  models of human computation systems: a brief survey.
  UbiCrowd 2011.
• A. Doan, R. Ramakrishnan, A. Halevy. Crowdsourcing
  Systems on the World-Wide Web. Communications of
  the ACM, 2011.


Books
• Omar Alonso, Gabriella Kazai, and Stefano
  Mizzaro. (2012). Crowdsourcing for Search
  Engine Evaluation: Why and How.

• Law and von Ahn (2011).
  Human Computation




More Books
July 2010, Kindle-only: “This book introduces you to the
top crowdsourcing sites and outlines step by step with
photos the exact process to get started as a requester on
Amazon Mechanical Turk.”




Thank You!
                 ir.ischool.utexas.edu/crowd
• Students
  –   Catherine Grady (iSchool)
  –   Hyunjoon Jung (ECE)
  –   Jorn Klinger (Linguistics)
  –   Adriana Kovashka (CS)
  –   Abhimanu Kumar (CS)
  –   Di Liu (iSchool)
  –   Hohyon Ryu (iSchool)
  –   William Tang (CS)
  –   Stephen Wolfson (iSchool)
• Omar Alonso, Microsoft Bing
• Support
  – John P. Commons

Matt Lease
ml@ischool.utexas.edu
@mattlease
Bibliography
   J. Barr and L. Cabrera. “AI gets a Brain”, ACM Queue, May 2006.
   Bernstein, M. et al. Soylent: A Word Processor with a Crowd Inside. UIST 2010. Best Student Paper award.
   Bederson, B.B., Hu, C., & Resnik, P. Translation by Interactive Collaboration between Monolingual Users, Proceedings of Graphics
    Interface (GI 2010), 39-46.
   N. Bradburn, S. Sudman, and B. Wansink. Asking Questions: The Definitive Guide to Questionnaire Design, Jossey-Bass, 2004.
   C. Callison-Burch. “Fast, Cheap, and Creative: Evaluating Translation Quality Using Amazon’s Mechanical Turk”, EMNLP 2009.
   P. Dai, Mausam, and D. Weld. “Decision-Theoretic Control of Crowd-Sourced Workflows”, AAAI, 2010.
   J. Davis et al. “The HPU”, IEEE Computer Vision and Pattern Recognition Workshop on Advancing Computer Vision with Human
    in the Loop (ACVHL), June 2010.
   M. Gashler, C. Giraud-Carrier, T. Martinez. Decision Tree Ensemble: Small Heterogeneous Is Better Than Large Homogeneous, ICMLA 2008.
   D. A. Grier. When Computers Were Human. Princeton University Press, 2005. ISBN 0691091579
   JS. Hacker and L. von Ahn. “Matchin: Eliciting User Preferences with an Online Game”, CHI 2009.
   J. Heer, M. Bostock. “Crowdsourcing Graphical Perception: Using Mechanical Turk to Assess Visualization Design”, CHI 2010.
   P. Heymann and H. Garcia-Molina. “Human Processing”, Technical Report, Stanford Info Lab, 2010.
   J. Howe. “Crowdsourcing: Why the Power of the Crowd Is Driving the Future of Business”. Crown Business, New York, 2008.
   P. Hsueh, P. Melville, V. Sindhwani. “Data Quality from Crowdsourcing: A Study of Annotation Selection Criteria”. NAACL HLT
    Workshop on Active Learning and NLP, 2009.
   B. Huberman, D. Romero, and F. Wu. “Crowdsourcing, attention and productivity”. Journal of Information Science, 2009.
   P.G. Ipeirotis. The New Demographics of Mechanical Turk. March 9, 2010. PDF and Spreadsheet.
   P.G. Ipeirotis, R. Chandrasekar and P. Bennett. Report on the human computation workshop. SIGKDD Explorations v11 no 2 pp. 80-83, 2010.
   P.G. Ipeirotis. Analyzing the Amazon Mechanical Turk Marketplace. CeDER-10-04 (Sept. 11, 2010)


Bibliography (2)
   A. Kittur, E. Chi, and B. Suh. “Crowdsourcing user studies with Mechanical Turk”, SIGCHI 2008.
   Aniket Kittur, Boris Smus, Robert E. Kraut. CrowdForge: Crowdsourcing Complex Work. CHI 2011
   Adriana Kovashka and Matthew Lease. “Human and Machine Detection of … Similarity in Art”. CrowdConf 2010.
   K. Krippendorff. "Content Analysis", Sage Publications, 2003
   G. Little, L. Chilton, M. Goldman, and R. Miller. “TurKit: Tools for Iterative Tasks on Mechanical Turk”, HCOMP 2009.
   T. Malone, R. Laubacher, and C. Dellarocas. Harnessing Crowds: Mapping the Genome of Collective Intelligence.
    2009.
   W. Mason and D. Watts. “Financial Incentives and the ’Performance of Crowds’”, HCOMP Workshop at KDD 2009.
   J. Nielsen. “Usability Engineering”, Morgan-Kaufman, 1994.
   A. Quinn and B. Bederson. “A Taxonomy of Distributed Human Computation”, Technical Report HCIL-2009-23, 2009
   J. Ross, L. Irani, M. Six Silberman, A. Zaldivar, and B. Tomlinson. “Who are the Crowdworkers?: Shifting
    Demographics in Amazon Mechanical Turk”. CHI 2010.
   F. Scheuren. “What is a Survey” (http://www.whatisasurvey.info) 2004.
   R. Snow, B. O’Connor, D. Jurafsky, and A. Y. Ng. “Cheap and Fast But is it Good? Evaluating Non-Expert Annotations
    for Natural Language Tasks”. EMNLP-2008.
   V. Sheng, F. Provost, P. Ipeirotis. “Get Another Label? Improving Data Quality … Using Multiple, Noisy Labelers”
    KDD 2008.
   S. Weber. “The Success of Open Source”, Harvard University Press, 2004.
   L. von Ahn. Games with a purpose. Computer, 39 (6), 92–94, 2006.
   L. von Ahn and L. Dabbish. “Designing Games with a purpose”. CACM, Vol. 51, No. 8, 2008.

Bibliography (3)
   Shuo Chen et al. What if the Irresponsible Teachers Are Dominating? A Method of Training on Samples and
    Clustering on Teachers. AAAI 2010.
   Paul Heymann, Hector Garcia-Molina: Turkalytics: analytics for human computation. WWW 2011.
   Florian Laws, Christian Scheible and Hinrich Schütze. Active Learning with Amazon Mechanical Turk.
    EMNLP 2011.
   C.Y. Lin. Rouge: A package for automatic evaluation of summaries. Proceedings of the workshop on text
    summarization branches out (WAS), 2004.
   C. Marshall and F. Shipman “The Ownership and Reuse of Visual Media”, JCDL, 2011.
   Hohyon Ryu and Matthew Lease. Crowdworker Filtering with Support Vector Machine. ASIS&T 2011.
   Wei Tang and Matthew Lease. Semi-Supervised Consensus Labeling for Crowdsourcing. ACM SIGIR
    Workshop on Crowdsourcing for Information Retrieval (CIR), 2011.
   S. Vijayanarasimhan and K. Grauman. Large-Scale Live Active Learning: Training Object Detectors with
    Crawled Data and Crowds. CVPR 2011.
   Stephen Wolfson and Matthew Lease. Look Before You Leap: Legal Pitfalls of Crowdsourcing. ASIS&T 2011.




Crowdsourcing in IR: 2008-2010
   2008
          O. Alonso, D. Rose, and B. Stewart. “Crowdsourcing for relevance evaluation”, SIGIR Forum, Vol. 42, No. 2.

   2009
          O. Alonso and S. Mizzaro. “Can we get rid of TREC Assessors? Using Mechanical Turk for … Assessment”. SIGIR Workshop on the Future of IR Evaluation.
          P.N. Bennett, D.M. Chickering, A. Mityagin. Learning Consensus Opinion: Mining Data from a Labeling Game. WWW.
          G. Kazai, N. Milic-Frayling, and J. Costello. “Towards Methods for the Collective Gathering and Quality Control of Relevance Assessments”, SIGIR.
          G. Kazai and N. Milic-Frayling. “… Quality of Relevance Assessments Collected through Crowdsourcing”. SIGIR Workshop on the Future of IR Evaluation.
          Law et al. “SearchWar”. HCOMP.
          H. Ma, R. Chandrasekar, C. Quirk, and A. Gupta. “Improving Search Engines Using Human Computation Games”, CIKM 2009.

   2010
          SIGIR Workshop on Crowdsourcing for Search Evaluation.
          O. Alonso, R. Schenkel, and M. Theobald. “Crowdsourcing Assessments for XML Ranked Retrieval”, ECIR.
          K. Berberich, S. Bedathur, O. Alonso, G. Weikum “A Language Modeling Approach for Temporal Information Needs”, ECIR.
          C. Grady and M. Lease. “Crowdsourcing Document Relevance Assessment with Mechanical Turk”. NAACL HLT Workshop on … Amazon's Mechanical Turk.
           Grace Hui Yang, Anton Mityagin, Krysta M. Svore, and Sergey Markov. “Collecting High Quality Overlapping Labels at Low Cost”. SIGIR.
          G. Kazai. “An Exploration of the Influence that Task Parameters Have on the Performance of Crowds”. CrowdConf.
          G. Kazai. “… Crowdsourcing in Building an Evaluation Platform for Searching Collections of Digitized Books”., Workshop on Very Large Digital Libraries (VLDL)
          Stephanie Nowak and Stefan Ruger. How Reliable are Annotations via Crowdsourcing? MIR.
          Jean-François Paiement, Dr. James G. Shanahan, and Remi Zajac. “Crowdsourcing Local Search Relevance”. CrowdConf.
          Maria Stone and Omar Alonso. “A Comparison of On-Demand Workforce with Trained Judges for Web Search Relevance Evaluation”. CrowdConf.
          T. Yan, V. Kumar, and D. Ganesan. CrowdSearch: exploiting crowds for accurate real-time image search on mobile phones. MobiSys pp. 77--90, 2010.




Crowdsourcing in IR: 2011
   WSDM Workshop on Crowdsourcing for Search and Data Mining.
   SIGIR Workshop on Crowdsourcing for Information Retrieval.


   O. Alonso and R. Baeza-Yates. “Design and Implementation of Relevance Assessments using Crowdsourcing”, ECIR 2011.
   Roi Blanco, Harry Halpin, Daniel Herzig, Peter Mika, Jeffrey Pound, Henry Thompson, Thanh D. Tran. “Repeatable and
    Reliable Search System Evaluation using Crowd-Sourcing”. SIGIR 2011.
   Yen-Ta Huang, An-Jung Cheng, Liang-Chi Hsieh, Winston H. Hsu, Kuo-Wei Chang. “Region-Based Landmark Discovery by
    Crowdsourcing Geo-Referenced Photos.” SIGIR 2011.
   Hyun Joon Jung, Matthew Lease. “Improving Consensus Accuracy via Z-score and Weighted Voting”. HCOMP 2011.
   G. Kasneci, J. Van Gael, D. Stern, and T. Graepel, CoBayes: Bayesian Knowledge Corroboration with Assessors of
    Unknown Areas of Expertise, WSDM 2011.
   Gabriella Kazai. “In Search of Quality in Crowdsourcing for Search Engine Evaluation”, ECIR 2011.
   Gabriella Kazai, Jaap Kamps, Marijn Koolen, Natasa Milic-Frayling. “Crowdsourcing for Book Search Evaluation: Impact of Quality
    on Comparative System Ranking.” SIGIR 2011.
   Abhimanu Kumar, Matthew Lease. “Learning to Rank From a Noisy Crowd”. SIGIR 2011.
   Edith Law, Paul N. Bennett, and Eric Horvitz. “The Effects of Choice in Routing Relevance Judgments”. SIGIR 2011.





Malad Call Girl in Services  9892124323 | ₹,4500 With Room Free DeliveryMalad Call Girl in Services  9892124323 | ₹,4500 With Room Free Delivery
Malad Call Girl in Services 9892124323 | ₹,4500 With Room Free Delivery
 
Commercial Bank Economic Capsule - April 2024
Commercial Bank Economic Capsule - April 2024Commercial Bank Economic Capsule - April 2024
Commercial Bank Economic Capsule - April 2024
 
Top Rated Pune Call Girls Viman Nagar ⟟ 6297143586 ⟟ Call Me For Genuine Sex...
Top Rated  Pune Call Girls Viman Nagar ⟟ 6297143586 ⟟ Call Me For Genuine Sex...Top Rated  Pune Call Girls Viman Nagar ⟟ 6297143586 ⟟ Call Me For Genuine Sex...
Top Rated Pune Call Girls Viman Nagar ⟟ 6297143586 ⟟ Call Me For Genuine Sex...
 
Best VIP Call Girls Noida Sector 18 Call Me: 8448380779
Best VIP Call Girls Noida Sector 18 Call Me: 8448380779Best VIP Call Girls Noida Sector 18 Call Me: 8448380779
Best VIP Call Girls Noida Sector 18 Call Me: 8448380779
 
20240429 Calibre April 2024 Investor Presentation.pdf
20240429 Calibre April 2024 Investor Presentation.pdf20240429 Calibre April 2024 Investor Presentation.pdf
20240429 Calibre April 2024 Investor Presentation.pdf
 
05_Annelore Lenoir_Docbyte_MeetupDora&Cybersecurity.pptx
05_Annelore Lenoir_Docbyte_MeetupDora&Cybersecurity.pptx05_Annelore Lenoir_Docbyte_MeetupDora&Cybersecurity.pptx
05_Annelore Lenoir_Docbyte_MeetupDora&Cybersecurity.pptx
 
Independent Call Girl Number in Kurla Mumbai📲 Pooja Nehwal 9892124323 💞 Full ...
Independent Call Girl Number in Kurla Mumbai📲 Pooja Nehwal 9892124323 💞 Full ...Independent Call Girl Number in Kurla Mumbai📲 Pooja Nehwal 9892124323 💞 Full ...
Independent Call Girl Number in Kurla Mumbai📲 Pooja Nehwal 9892124323 💞 Full ...
 
Vip Call US 📞 7738631006 ✅Call Girls In Sakinaka ( Mumbai )
Vip Call US 📞 7738631006 ✅Call Girls In Sakinaka ( Mumbai )Vip Call US 📞 7738631006 ✅Call Girls In Sakinaka ( Mumbai )
Vip Call US 📞 7738631006 ✅Call Girls In Sakinaka ( Mumbai )
 
The Economic History of the U.S. Lecture 22.pdf
The Economic History of the U.S. Lecture 22.pdfThe Economic History of the U.S. Lecture 22.pdf
The Economic History of the U.S. Lecture 22.pdf
 
(DIYA) Bhumkar Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(DIYA) Bhumkar Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(DIYA) Bhumkar Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(DIYA) Bhumkar Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 

Crowd Computing: Opportunities and Challenges

  • 1. Crowd Computing: Opportunities and Challenges Matt Lease School of Information University of Texas at Austin ml@ischool.utexas.edu @mattlease
  • 2. 2
  • 3. Amazon Mechanical Turk (MTurk) • Marketplace for crowd labor (microtasks) • Created in 2005 (still in “beta”) • On-demand, scalable, 24/7 global workforce 3
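Requesters drive this marketplace through an API as well as the web interface. As a concrete illustration (not part of the original slides), here is a minimal sketch of posting a relevance-judging microtask programmatically; it uses the boto3 MTurk client, which postdates this talk, and the task text, reward, and parameters are invented for the example:

    import boto3

    # Minimal sketch: posting a microtask (HIT) programmatically.
    # Assumes AWS credentials are configured; all task details below
    # (question text, reward, timing) are illustrative assumptions.
    mturk = boto3.client("mturk", region_name="us-east-1")

    # A trivial free-text question, wrapped in MTurk's QuestionForm XML schema.
    question_xml = """<QuestionForm xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd">
      <Question>
        <QuestionIdentifier>label</QuestionIdentifier>
        <QuestionContent><Text>Is this search result relevant? (yes/no)</Text></QuestionContent>
        <AnswerSpecification><FreeTextAnswer/></AnswerSpecification>
      </Question>
    </QuestionForm>"""

    hit = mturk.create_hit(
        Title="Judge search result relevance",
        Description="Read a query and a result; answer yes or no.",
        Reward="0.02",                    # USD per assignment
        MaxAssignments=5,                 # 5 workers per item, for consensus
        LifetimeInSeconds=86400,
        AssignmentDurationInSeconds=300,
        Question=question_xml,
    )
    print(hit["HIT"]["HITId"])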
  • 4. 4
  • 5. 5
  • 7. Snow et al. (EMNLP 2008) • MTurk annotation for 5 Tasks – Affect recognition – Word similarity – Recognizing textual entailment – Event temporal ordering – Word sense disambiguation • 22K labels for US $26 • High agreement between consensus labels and gold-standard labels 7
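The quality claim here rests on simple label aggregation: collect several redundant labels per item, take the most frequent one as the consensus, and measure agreement against expert gold labels. A minimal sketch of that calculation, with invented data:

    from collections import Counter

    # Toy worker labels per item: {item_id: [label, label, ...]}
    # (invented data for illustration)
    worker_labels = {
        "item1": ["pos", "pos", "neg", "pos", "pos"],
        "item2": ["neg", "neg", "pos", "neg", "neg"],
    }
    gold = {"item1": "pos", "item2": "neg"}  # expert labels

    def majority_vote(labels):
        """Consensus = most frequent label (ties broken arbitrarily)."""
        return Counter(labels).most_common(1)[0][0]

    consensus = {item: majority_vote(ls) for item, ls in worker_labels.items()}
    agreement = sum(consensus[i] == gold[i] for i in gold) / len(gold)
    print(f"consensus-gold agreement: {agreement:.2f}")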
  • 8. Alonso et al. (SIGIR Forum 2008) • MTurk for Information Retrieval (IR) – Judge relevance of search engine results • Many follow-on studies (design, quality, cost) 8
  • 9. Sorokin & Forsyth (CVPR 2008) • MTurk for Computer Vision • 4K labels for US $60 9
  • 11. Kittur, Chi, & Suh (CHI 2008) • MTurk for User Studies • “…make creating believable invalid responses as effortful as completing the task in good faith.” 11
  • 12. Jane saw the man with the binoculars 12
  • 13. Social & Behavioral Sciences • A Guide to Behavioral Experiments on Mechanical Turk – W. Mason and S. Suri (2010). SSRN online. • Crowdsourcing for Human Subjects Research – L. Schmidt (CrowdConf 2010) • Crowdsourcing Content Analysis for Behavioral Research: Insights from Mechanical Turk – Conley & Tosti-Kharas (2010). Academy of Management • Amazon's Mechanical Turk : A New Source of Inexpensive, Yet High-Quality, Data? – M. Buhrmester et al. (2011). Perspectives… 6(1):3-5. – see also: Amazon Mechanical Turk Guide for Social Scientists 13
  • 15. Remote Usability Testing • D. Liu, R. Kuipers, M. Lease, and R. Bias (in preparation) • On-site vs. crowdsourced usability testing • Advantages – More Participants – More Diverse Participants – High Speed – Low Cost • Disadvantages – Lower Quality Feedback – Less Interaction – Greater need for quality control – Less Focused User Groups 15
  • 17. ESP Game (Games With a Purpose) L. von Ahn and L. Dabbish (2004) 17
  • 18. 18
  • 19. reCaptcha L. von Ahn et al. (2008). In Science. 19
  • 20. Human Sensing and Monitoring • Sullivan et al. (2009). Bio. Conservation (142):10 • Keynote by Steve Kelling at ASIS&T 2011 20
  • 21. 21
  • 23. The Mechanical Turk The original, constructed and unveiled in 1770 by Wolfgang von Kempelen (1734–1804) J. Pontin. Artificial Intelligence, With Help From the Humans. New York Times (March 25, 2007) 23
  • 24. The Turing Test (Alan Turing, 1950) 24
  • 25. 25
  • 26. The Turing Test (Alan Turing, 1950) 26
  • 27. What is a Computer? 27
  • 28. • What was old is new • Crowdsourcing: A New Branch of Computer Science – D.A. Grier, March 29, 2011 • Tabulating the heavens: computing the Nautical Almanac in 18th-century England – M. Croarken (2003) • When Computers Were Human – D.A. Grier, Princeton University Press, 2005 28
  • 29. The Human Processing Unit (HPU) • Davis et al. (2010) 29
  • 30. Human Computation • Luis von Ahn (2005) • Investigates using people to execute computations that current automated methods still perform poorly or not at all • Explores the metaphor of computation for characterizing the attributes, capabilities, and limitations of human performance on desired tasks • In short: having people do the work (instead of computers) 30
  • 31. Blending Automation & Human Computation 31
  • 33. Ethics Checking: The Next Frontier? • Mark Johnson’s address at ACL 2003 – Transcript in Conduit 12(2) 2003 • Think how useful a little “ethics checker and corrector” program integrated into a word processor could be! 33
  • 34. Soylent: A Word Processor with a Crowd Inside • Bernstein et al., UIST 2010 34
  • 35. Translation by monolingual speakers • C. Hu, CHI 2009 35
  • 36. fold.it • S. Cooper et al. (2010) • Alice G. Walton. Online Gamers Help Solve Mystery of Critical AIDS Virus Enzyme. The Atlantic, October 8, 2011. 36
  • 37. CrowdSearch and mCrowd • T. Yan, MobiSys 2010 37
  • 38. VizWiz • Bigham et al. (UIST 2010) 38
  • 39. 39
  • 40. Crowdsourcing • Jeff Howe. Wired, June 2006. • Take a job traditionally performed by a known agent (often an employee) • Outsource it to an undefined, generally large group of people via an open call • New application of principles from the open source movement 40
  • 41. 41
  • 42. What is Crowdsourcing? • A collection of mechanisms and associated methodologies for scaling and directing crowd activities to achieve some goal(s) • Enabled by Internet connectivity • Many related areas – Human computation – Collective intelligence – Social computing – People services 42
  • 43. Crowdsourcing Key Questions • What are the goals? – Purposeful directing of human activity • How can you incentivize participation? – Incentive engineering – Who is your intended crowd? • Which model(s) are most appropriate? – How to adapt them to your context and goals? 43
  • 44. 44
  • 45. What about sensitive data? • Not all data can be publicly disclosed – User data (e.g. AOL query log, Netflix ratings) – Intellectual property – Legal confidentiality • Need to restrict who is in your crowd – Separate channel (workforce) from technology – Hot question for adoption at enterprise level 45
  • 46. What about the law? • Wolfson & Lease (ASIS&T 2011) • As usual, technology is ahead of the law – employment law – patent inventorship – data security and the Federal Trade Commission – copyright ownership – securities regulation of crowdfunding • Take-away: don't panic, but be mindful – Understand risks of "just-in-time compliance" 46
  • 47. Nature of Micro-tasks • Small, simple tasks which can be completed without extraneous detail or context – e.g. “Can you name who is in this photo?” • Variety of current research investigating how to decompose complex tasks into simpler ones 47
  • 49. What about trust? • Some reports of robot “workers” on MTurk – E.g. McCreadie et al. (2011) – Violates terms of service • Why not just use a captcha? 49
  • 51. Requester Fraud on MTurk “Do not do any HITs that involve: filling in CAPTCHAs; secret shopping; test our web page; test zip code; free trial; click my link; surveys or quizzes (unless the requester is listed with a smiley in the Hall of Fame/Shame); anything that involves sending a text message; or basically anything that asks for any personal information at all—even your zip code. If you feel in your gut it’s not on the level, IT’S NOT. Why? Because they are scams...” 51
  • 52. Who are the workers? • A. Baio, November 2008. The Faces of Mechanical Turk. • P. Ipeirotis. March 2010. The New Demographics of Mechanical Turk • J. Ross, et al. Who are the Crowdworkers?... CHI 2010. 52
  • 53. What about ethics? • Silberman, Irani, and Ross (2010) – “How should we… conceptualize the role of these people who we ask to power our computing?” – Power dynamics between parties • What are the consequences for a worker when your actions harm their reputation? – “Abstraction hides detail” • Fort, Adda, and Cohen (2011) – “…opportunities for our community to deliberately value ethics above cost savings.” 53
  • 54. Davis et al. (2010). The HPU. 54
  • 55. HPU: “Abstraction hides detail” • Not just turning a mechanical crank 55
  • 56. 56
  • 58. What about quality? • Many papers on statistical methods for MTurk – Worker calibration, noise vs. bias, weighted voting – Checking consensus may discourage worker honesty • Garbage in = garbage out – Only as strong as your weakest link (end-to-end) – Is your shiny statistical hammer what’s really needed? – Not all problems are technological • Methods for consistent annotation still apply 58
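To make the "weighted voting" idea concrete, the sketch below calibrates each worker's accuracy on a few gold-labeled items and then weights their votes accordingly. This is a simplified stand-in for the statistical methods this slide alludes to, not a reproduction of any particular one; all data are invented:

    from collections import defaultdict

    # Invented data: whether each worker answered each gold item correctly.
    # Keys are (worker, gold_item) pairs; values are correct/incorrect.
    gold_answers = {("w1", "g1"): True, ("w1", "g2"): True,
                    ("w2", "g1"): False, ("w2", "g2"): True}

    acc = defaultdict(lambda: [0, 0])          # worker -> [correct, total]
    for (worker, _), correct in gold_answers.items():
        acc[worker][0] += int(correct)
        acc[worker][1] += 1
    # Smoothed (add-one) accuracy estimate per worker, used as vote weight.
    weight = {w: (c + 1) / (n + 2) for w, (c, n) in acc.items()}

    votes = {"w1": "relevant", "w2": "not relevant"}  # votes on one unlabeled item
    tally = defaultdict(float)
    for worker, label in votes.items():
        tally[label] += weight[worker]
    print(max(tally, key=tally.get))           # weighted consensus label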
  • 59. What about quality? (2) • Human factors matter – Part of your experimental design (study & report) – Instructions, design, interface, interaction – Names, relationship, reputation (Klinger & Lease 2011) – Fair pay, hourly vs. per-task, recognition, advancement – For contrast, consider Kochhar (2010) • How good is gold really? – Klebanov and Beigman (NAACL 2010) – Model uncertainty in ground truth • Training and evaluation with uncertain labels • Temporal and label uncertainty in active learning 59
  • 60. Need for Benchmarks • How well do different crowdsourcing methods perform on comparable data? – Shared datasets used to bring people together • Common tasks to consider – Translation – Transcription – Relevance Judging (search engines) – Generation (resources, SEO, reputation) – Verification (correct, copyright, appropriate) • NIST TREC Crowdsourcing Track exploring this – Track will run for 2nd Year in 2012 60
  • 62. Why Eytan Adar hates MTurk Research (CHI 2011 CHC Workshop) • Overly narrow focus on MTurk – Identify general vs. platform-specific problems – Academic vs. industrial problems • Inattention to prior work in other disciplines – Some problems well-studied in other areas – Human behavior hasn't changed much • Turks aren't Martians – How many prior user studies need to be reproduced on MTurk before we believe it? 62
  • 63. Many choices beyond MTurk
    – Platforms: Clickworker, CloudCrowd, CloudFactory, CrowdSource, DoMyStuff, Humanoid, Microtask, MobileWorks, myGengo, SmartSheet, vWorker
    – Industry heavy-weights: Elance, Liveops, oDesk, uTest
    – And more! JobBoy, microWorkers, MiniFreelance, MiniJobz, MinuteWorkers, MyEasyTask, OpTask, ShortTask, SimpleWorkers
    63
  • 64. 64
  • 65. Run-time automation for managing distributed crowd computing • Kittur et al. (CHI 2011), CrowdForge 65
  • 66. MapReduce for human computation? • Large task divided into smaller sub-problems • Job distributed among multiple workers • Collect all answers and combine them • Varying performance of heterogeneous CPUs/HPUs 66
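A toy rendering of this analogy, with a purely hypothetical ask() standing in for a real crowd platform call:

    # Sketch of the MapReduce analogy: partition a task, farm the pieces
    # out to (simulated) workers, then combine answers. ask() is invented
    # for illustration and does not correspond to any real API.
    def ask(worker, subtask):
        return f"{worker}:answer({subtask})"     # a worker's answer (simulated)

    def crowd_map(subtasks, workers):
        # Each subtask goes to one worker; real systems would also
        # replicate subtasks across workers to cope with noisy HPUs.
        return [ask(w, t) for w, t in zip(workers, subtasks)]

    def crowd_reduce(answers):
        return " | ".join(answers)               # combine: here, concatenation

    paragraphs = ["p1", "p2", "p3"]              # partition: task -> subtasks
    result = crowd_reduce(crowd_map(paragraphs, ["w1", "w2", "w3"]))
    print(result)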
  • 67. Wisdom of Crowds Computing Pre-conditions • Diversity • Independence • Decentralization • Aggregation Input: large, diverse sample (increases likelihood of overall pool quality) Output: consensus, selection, distribution What about ensemble techniques and theory we have for integrating multiple noisy models? Computational Social Science & the Wisdom of Crowds NIPS 2010-2011 workshops 67
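The statistical intuition behind these preconditions is that independent, unbiased errors cancel under aggregation. A small simulation (numbers invented, loosely modeled on Galton's ox-weight anecdote):

    import random

    random.seed(0)
    truth = 1198.0                      # the quantity the crowd estimates

    def guess():                        # one independent, unbiased, noisy guess
        return truth + random.gauss(0, 150)

    for n in (1, 10, 100, 1000):
        crowd_mean = sum(guess() for _ in range(n)) / n
        print(n, round(abs(crowd_mean - truth), 1))
    # The error of the mean shrinks roughly as 1/sqrt(n) while any single
    # guess stays noisy: diversity + independence + aggregation at work.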
  • 68. • How to effectively compute in the crowd? – What are all the knobs and dials? – How to set them to navigate quality vs. cost vs. time? • How do we design, implement, test, & maintain crowd computing systems? • What new capabilities can such systems provide? • How does cheaper / faster / easier / noisier data change the way we build intelligent systems? – Relative costs of all other activities increase 68
  • 69. Unreasonable Effectiveness of Data • Massive free Web data changed how we train learning systems – Banko and Brill (2001). Human Language Tech. – Halevy et al. (2009). IEEE Intelligent Systems. • How might access to cheap & plentiful labeled data change the balance again? 69
  • 70. • Who is the right person for the job? – Requesting or inferring skills / experience – Interactive task selection or automatic routing – How to represent, measure, model, estimate, and utilize individual worker expertise/accuracy? • How should crowd computing evolve from here? – What should next generation infrastructure provide? – What are the right programming constructs? • With automation and human computation, who does what? Mixed-initiative thinking. 70
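One simple way to represent and utilize worker expertise for routing, sketched below on invented data: track each worker's correct/incorrect history, model accuracy with a Beta posterior, and send the next task to the worker with the highest posterior mean. An illustrative approach, not a method proposed in these slides:

    # Hypothetical sketch of accuracy-based task routing.
    history = {                  # worker -> (correct, incorrect), invented
        "w1": (18, 2),
        "w2": (40, 20),
        "w3": (3, 0),
    }

    def posterior_mean(correct, incorrect, a=1, b=1):
        """Mean of Beta(a+correct, b+incorrect); (a, b) is a uniform prior."""
        return (a + correct) / (a + correct + b + incorrect)

    best = max(history, key=lambda w: posterior_mean(*history[w]))
    print(best, round(posterior_mean(*history[best]), 3))
    # With little evidence (w3) the prior keeps the estimate conservative;
    # richer schemes also trade off exploration vs. exploitation.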
  • 71. Crowdsourcing in 2012
    • Conferences and Workshops
      – AAAI Symposium: Wisdom of the Crowd (March 26-28)
      – Collective Intelligence (papers: Nov 18, dates: April 18-20)
      – Year 2 of TREC Crowdsourcing Track
      – HComp and CrowdConf (details TBD)
    • Journal Special Issues
      – Springer's Information Retrieval: Crowdsourcing for Information Retrieval
      – Hindawi's Advances in Multimedia: Multimedia Semantics Analysis via Crowdsourcing Geocontext
      – IEEE Internet Computing: Crowdsourcing (Sept./Oct. 2012)
      – IEEE Transactions on Multimedia: Crowdsourcing in Multimedia (proposal in review)
    • Places for News
      – Follow the Crowd
      – The Crowdsortium
    71
  • 72. 2011 Tutorials and Keynotes
    • By Omar Alonso and/or Matthew Lease
      – CLEF: Crowdsourcing for Information Retrieval Experimentation and Evaluation (Sep. 20, Omar only)
      – CrowdConf (Nov. 1, this is it!)
      – IJCNLP: Crowd Computing: Opportunities and Challenges (Nov. 10, Matt only)
      – WSDM: Crowdsourcing 101: Putting the WSDM of Crowds to Work for You (Feb. 9)
      – SIGIR: Crowdsourcing for Information Retrieval: Principles, Methods, and Applications (July 24)
    • AAAI: Human Computation: Core Research Questions and State of the Art – Edith Law and Luis von Ahn, August 7
    • ASIS&T: How to Identify Ducks In Flight: A Crowdsourcing Approach to Biodiversity Research and Conservation – Steve Kelling, October 10, eBird
    • EC: Conducting Behavioral Research Using Amazon's Mechanical Turk – Winter Mason and Siddharth Suri, June 5
    • HCIC: Quality Crowdsourcing for Human Computer Interaction Research – Ed Chi, June 14-18
      – Also see his: Crowdsourcing for HCI Research with Amazon Mechanical Turk
    • Multimedia: Frontiers in Multimedia Search – Alan Hanjalic and Martha Larson, Nov 28
    • VLDB: Crowdsourcing Applications and Platforms – Anhai Doan, Michael Franklin, Donald Kossmann, and Tim Kraska
    • WWW: Managing Crowdsourced Human Computation – Panos Ipeirotis and Praveen Paritosh
    72
  • 73. 2011 Workshops & Conferences
    – AAAI-HCOMP: 3rd Human Computation Workshop (Aug. 8)
    – ACIS: Crowdsourcing, Value Co-Creation, & Digital Economy Innovation (Nov. 30 - Dec. 2)
    – Crowdsourcing Technologies for Language and Cognition Studies (July 27)
    – CHI-CHC: Crowdsourcing and Human Computation (May 8)
    – CIKM: BooksOnline (Oct. 24, "crowdsourcing … online books")
    – CrowdConf 2011: 2nd Conf. on the Future of Distributed Work (Nov. 1-2)
    – Crowdsourcing: Improving … Scientific Data Through Social Networking (June 13)
    – EC: Workshop on Social Computing and User Generated Content (June 5)
    – ICWE: 2nd International Workshop on Enterprise Crowdsourcing (June 20)
    – Interspeech: Crowdsourcing for speech processing (August)
    – NIPS: Second Workshop on Computational Social Science and the Wisdom of Crowds (Dec. TBD)
    – SIGIR-CIR: Workshop on Crowdsourcing for Information Retrieval (July 28)
    – TREC-Crowd: Year 1 of TREC Crowdsourcing Track (Nov. 16-18)
    – UbiComp: 2nd Workshop on Ubiquitous Crowdsourcing (Sep. 18)
    – WSDM-CSDM: Crowdsourcing for Search and Data Mining (Feb. 9)
    73
  • 74. Recent Overview Papers
    – Alex Quinn and Ben Bederson. Human Computation: A Survey and Taxonomy of a Growing Field. CHI 2011.
    – Man-Ching Yuen, Irwin King, and Kwong-Sak Leung. A Survey of Crowdsourcing Systems. SocialCom 2011.
    – Rajarshi Das and Maja Vukovic. Emerging theories and models of human computation systems: a brief survey. UbiCrowd 2011.
    – A. Doan, R. Ramakrishnan, and A. Halevy. Crowdsourcing Systems on the World-Wide Web. Communications of the ACM, 2011.
    74
  • 75. Books • Omar Alonso, Gabriella Kazai, and Stefano Mizzaro. (2012). Crowdsourcing for Search Engine Evaluation: Why and How. • Law and von Ahn (2011). Human Computation 75
  • 76. More Books July 2010, kindle-only: “This book introduces you to the top crowdsourcing sites and outlines step by step with photos the exact process to get started as a requester on Amazon Mechanical Turk.“ 76
  • 77. Thank You! ir.ischool.utexas.edu/crowd
    Matt Lease, ml@ischool.utexas.edu, @mattlease
    • Students
      – Catherine Grady (iSchool)
      – Hyunjoon Jung (ECE)
      – Jorn Klinger (Linguistics)
      – Adriana Kovashka (CS)
      – Abhimanu Kumar (CS)
      – Di Liu (iSchool)
      – Hohyon Ryu (iSchool)
      – William Tang (CS)
      – Stephen Wolfson (iSchool)
    • Omar Alonso, Microsoft Bing
    • Support – John P. Commons
    77
  • 78. Bibliography
    – J. Barr and L. Cabrera. "AI gets a Brain". ACM Queue, May 2006.
    – M. Bernstein et al. Soylent: A Word Processor with a Crowd Inside. UIST 2010. Best Student Paper award.
    – B.B. Bederson, C. Hu, and P. Resnik. Translation by Interactive Collaboration between Monolingual Users. Proceedings of Graphics Interface (GI 2010), 39-46.
    – N. Bradburn, S. Sudman, and B. Wansink. Asking Questions: The Definitive Guide to Questionnaire Design. Jossey-Bass, 2004.
    – C. Callison-Burch. "Fast, Cheap, and Creative: Evaluating Translation Quality Using Amazon's Mechanical Turk". EMNLP 2009.
    – P. Dai, Mausam, and D. Weld. "Decision-Theoretic Control of Crowd-Sourced Workflows". AAAI 2010.
    – J. Davis et al. "The HPU". IEEE Computer Vision and Pattern Recognition Workshop on Advancing Computer Vision with Human in the Loop (ACVHL), June 2010.
    – M. Gashler, C. Giraud-Carrier, and T. Martinez. Decision Tree Ensemble: Small Heterogeneous Is Better Than Large Homogeneous. ICMLA 2008.
    – D.A. Grier. When Computers Were Human. Princeton University Press, 2005. ISBN 0691091579.
    – S. Hacker and L. von Ahn. "Matchin: Eliciting User Preferences with an Online Game". CHI 2009.
    – J. Heer and M. Bostock. "Crowdsourcing Graphical Perception: Using Mechanical Turk to Assess Visualization Design". CHI 2010.
    – P. Heymann and H. Garcia-Molina. "Human Processing". Technical Report, Stanford Info Lab, 2010.
    – J. Howe. "Crowdsourcing: Why the Power of the Crowd Is Driving the Future of Business". Crown Business, New York, 2008.
    – P. Hsueh, P. Melville, and V. Sindhwani. "Data Quality from Crowdsourcing: A Study of Annotation Selection Criteria". NAACL HLT Workshop on Active Learning and NLP, 2009.
    – B. Huberman, D. Romero, and F. Wu. "Crowdsourcing, attention and productivity". Journal of Information Science, 2009.
    – P.G. Ipeirotis. The New Demographics of Mechanical Turk. March 9, 2010. PDF and spreadsheet.
    – P.G. Ipeirotis, R. Chandrasekar, and P. Bennett. Report on the human computation workshop. SIGKDD Explorations 11(2):80-83, 2010.
    – P.G. Ipeirotis. Analyzing the Amazon Mechanical Turk Marketplace. CeDER-10-04 (Sept. 11, 2010).
    78
  • 79. Bibliography (2)
    – A. Kittur, E. Chi, and B. Suh. "Crowdsourcing user studies with Mechanical Turk". SIGCHI 2008.
    – A. Kittur, B. Smus, and R.E. Kraut. CrowdForge: Crowdsourcing Complex Work. CHI 2011.
    – A. Kovashka and M. Lease. "Human and Machine Detection of … Similarity in Art". CrowdConf 2010.
    – K. Krippendorff. "Content Analysis". Sage Publications, 2003.
    – G. Little, L. Chilton, M. Goldman, and R. Miller. "TurKit: Tools for Iterative Tasks on Mechanical Turk". HCOMP 2009.
    – T. Malone, R. Laubacher, and C. Dellarocas. Harnessing Crowds: Mapping the Genome of Collective Intelligence. 2009.
    – W. Mason and D. Watts. "Financial Incentives and the 'Performance of Crowds'". HCOMP Workshop at KDD 2009.
    – J. Nielsen. "Usability Engineering". Morgan Kaufmann, 1994.
    – A. Quinn and B. Bederson. "A Taxonomy of Distributed Human Computation". Technical Report HCIL-2009-23, 2009.
    – J. Ross, L. Irani, M. Six Silberman, A. Zaldivar, and B. Tomlinson. "Who are the Crowdworkers?: Shifting Demographics in Amazon Mechanical Turk". CHI 2010.
    – F. Scheuren. "What is a Survey" (http://www.whatisasurvey.info), 2004.
    – R. Snow, B. O'Connor, D. Jurafsky, and A.Y. Ng. "Cheap and Fast But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks". EMNLP 2008.
    – V. Sheng, F. Provost, and P. Ipeirotis. "Get Another Label? Improving Data Quality … Using Multiple, Noisy Labelers". KDD 2008.
    – S. Weber. "The Success of Open Source". Harvard University Press, 2004.
    – L. von Ahn. Games with a purpose. Computer 39(6):92-94, 2006.
    – L. von Ahn and L. Dabbish. "Designing Games with a purpose". CACM 51(8), 2008.
    79
  • 80. Bibliography (3)
    – S. Chen et al. What if the Irresponsible Teachers Are Dominating? A Method of Training on Samples and Clustering on Teachers. AAAI 2010.
    – P. Heymann and H. Garcia-Molina. Turkalytics: analytics for human computation. WWW 2011.
    – F. Laws, C. Scheible, and H. Schütze. Active Learning with Amazon Mechanical Turk. EMNLP 2011.
    – C.Y. Lin. ROUGE: A package for automatic evaluation of summaries. Workshop on Text Summarization Branches Out (WAS), 2004.
    – C. Marshall and F. Shipman. "The Ownership and Reuse of Visual Media". JCDL 2011.
    – H. Ryu and M. Lease. Crowdworker Filtering with Support Vector Machine. ASIS&T 2011.
    – W. Tang and M. Lease. Semi-Supervised Consensus Labeling for Crowdsourcing. ACM SIGIR Workshop on Crowdsourcing for Information Retrieval (CIR), 2011.
    – S. Vijayanarasimhan and K. Grauman. Large-Scale Live Active Learning: Training Object Detectors with Crawled Data and Crowds. CVPR 2011.
    – S. Wolfson and M. Lease. Look Before You Leap: Legal Pitfalls of Crowdsourcing. ASIS&T 2011.
    80
  • 81. Crowdsourcing in IR: 2008-2010
    2008
    – O. Alonso, D. Rose, and B. Stewart. "Crowdsourcing for relevance evaluation". SIGIR Forum 42(2).
    2009
    – O. Alonso and S. Mizzaro. "Can we get rid of TREC Assessors? Using Mechanical Turk for … Assessment". SIGIR Workshop on the Future of IR Evaluation.
    – P.N. Bennett, D.M. Chickering, and A. Mityagin. Learning Consensus Opinion: Mining Data from a Labeling Game. WWW.
    – G. Kazai, N. Milic-Frayling, and J. Costello. "Towards Methods for the Collective Gathering and Quality Control of Relevance Assessments". SIGIR.
    – G. Kazai and N. Milic-Frayling. "… Quality of Relevance Assessments Collected through Crowdsourcing". SIGIR Workshop on the Future of IR Evaluation.
    – Law et al. "SearchWar". HCOMP.
    – H. Ma, R. Chandrasekar, C. Quirk, and A. Gupta. "Improving Search Engines Using Human Computation Games". CIKM.
    2010
    – SIGIR Workshop on Crowdsourcing for Search Evaluation.
    – O. Alonso, R. Schenkel, and M. Theobald. "Crowdsourcing Assessments for XML Ranked Retrieval". ECIR.
    – K. Berberich, S. Bedathur, O. Alonso, and G. Weikum. "A Language Modeling Approach for Temporal Information Needs". ECIR.
    – C. Grady and M. Lease. "Crowdsourcing Document Relevance Assessment with Mechanical Turk". NAACL HLT Workshop on … Amazon's Mechanical Turk.
    – G.H. Yang, A. Mityagin, K.M. Svore, and S. Markov. "Collecting High Quality Overlapping Labels at Low Cost". SIGIR.
    – G. Kazai. "An Exploration of the Influence that Task Parameters Have on the Performance of Crowds". CrowdConf.
    – G. Kazai. "… Crowdsourcing in Building an Evaluation Platform for Searching Collections of Digitized Books". Workshop on Very Large Digital Libraries (VLDL).
    – S. Nowak and S. Rüger. How Reliable are Annotations via Crowdsourcing? MIR.
    – J.-F. Paiement, J.G. Shanahan, and R. Zajac. "Crowdsourcing Local Search Relevance". CrowdConf.
    – M. Stone and O. Alonso. "A Comparison of On-Demand Workforce with Trained Judges for Web Search Relevance Evaluation". CrowdConf.
    – T. Yan, V. Kumar, and D. Ganesan. CrowdSearch: exploiting crowds for accurate real-time image search on mobile phones. MobiSys, pp. 77-90, 2010.
    81
  • 82. Crowdsourcing in IR: 2011
    – WSDM Workshop on Crowdsourcing for Search and Data Mining.
    – SIGIR Workshop on Crowdsourcing for Information Retrieval.
    – O. Alonso and R. Baeza-Yates. "Design and Implementation of Relevance Assessments using Crowdsourcing". ECIR 2011.
    – R. Blanco, H. Halpin, D. Herzig, P. Mika, J. Pound, H. Thompson, and T.D. Tran. "Repeatable and Reliable Search System Evaluation using Crowd-Sourcing". SIGIR 2011.
    – Y.-T. Huang, A.-J. Cheng, L.-C. Hsieh, W.H. Hsu, and K.-W. Chang. "Region-Based Landmark Discovery by Crowdsourcing Geo-Referenced Photos". SIGIR 2011.
    – H.J. Jung and M. Lease. "Improving Consensus Accuracy via Z-score and Weighted Voting". HCOMP 2011.
    – G. Kasneci, J. Van Gael, D. Stern, and T. Graepel. CoBayes: Bayesian Knowledge Corroboration with Assessors of Unknown Areas of Expertise. WSDM 2011.
    – G. Kazai. "In Search of Quality in Crowdsourcing for Search Engine Evaluation". ECIR 2011.
    – G. Kazai, J. Kamps, M. Koolen, and N. Milic-Frayling. "Crowdsourcing for Book Search Evaluation: Impact of Quality on Comparative System Ranking". SIGIR 2011.
    – A. Kumar and M. Lease. "Learning to Rank From a Noisy Crowd". SIGIR 2011.
    – E. Law, P.N. Bennett, and E. Horvitz. "The Effects of Choice in Routing Relevance Judgments". SIGIR 2011.
    82