
Interplay of Game Incentives, Player Profiles and Task Difficulty in Games with a Purpose

Presentation of my paper at EKAW 2018 in Nancy. How can multiple factors be taken into account when evaluating a Game with a Purpose? How is player behaviour or participation influenced by different incentives? How does player engagement impact their accuracy in solving tasks? In this paper, we present a detailed investigation of multiple factors affecting the evaluation of a GWAP and we show how they impact the achieved results. We inform our study with the experimental assessment of a GWAP designed to solve a multinomial classification task.


  1. Interplay of Game Incentives, Player Profiles and Task Difficulty in Games with a Purpose
     Gloria Re Calegari and Irene Celino
     Nancy, November 15th, 2018 – 21st International Conference on Knowledge Engineering and Knowledge Management (EKAW 2018)
  2. HUMAN-IN-THE-LOOP FOR KNOWLEDGE ACQUISITION
     • Machine learning approaches train automatic models on the basis of a training set, thus they require a partial gold standard, often also named “ground truth”
     • Ground truth requires putting the human back in the loop: building a training set for a machine learning pipeline means asking people to execute a set of tasks
     • This knowledge acquisition challenge is usually solved in one of the following ways:
       • Asking experts to put together the training set (but involving experts can be expensive!)
       • Adopting Crowdsourcing and Human Computation approaches, thus asking a distributed crowd to collect the required knowledge
  3. CROWDSOURCING & HUMAN COMPUTATION
     • Crowdsourcing is the process of outsourcing tasks to a “crowd” of distributed people (notable examples: Amazon Mechanical Turk, Figure Eight)
     • Human Computation is a computer science technique in which a computational process is performed by outsourcing certain steps to humans, usually when humans are very good at solving those tasks while computers are not (notable example: reCAPTCHA)
     • Games with a Purpose (GWAPs) are a Human Computation application that outsources tasks to humans in an entertaining way (notable example: the ESP game)
     • Crowdsourcing and Human Computation approaches have been largely adopted for several knowledge management tasks: collection, enrichment, validation, annotation, ranking, …
     • Those approaches differ in the engagement and reward schemes offered to human participants (premium access, money prizes, knowledge, recognition, fun, enjoyment)
     • What are the conditions that make it worth adopting a GWAP approach?
     • When and how are GWAPs effective in achieving their goal?
  4. USE CASE: THE NIGHT KNIGHTS GWAP (http://nightknights.eu) – DATA COLLECTION & VALIDATION
     • Input: set of pictures and classification categories
     • Goal: associate a category to each picture by assigning a score σ to each picture-category pair
     • The score σ of each picture-category association is updated on the basis of players’ choices
     • When the score of a picture-category pair exceeds the threshold (σ ≥ t), the association is considered “true” and the picture is removed from the game (a minimal aggregation sketch follows this slide)
     • Purpose: identify pictures of cities taken from above among those shot on board the ISS (the pictures are then used in a scientific process in light pollution research)
     • Pure GWAP with a not-so-hidden purpose (but playable by anybody); points, badges and a leaderboard as intrinsic reward; a player scores if he/she agrees with another player; “bonus” intrinsic reward with NASA pictures!
     • Gloria Re Calegari, Gioele Nasi, Irene Celino: Human Computation vs. Machine Learning: an Experimental Comparison for Image Classification. Human Computation Journal, vol. 5, issue 1, 2018.
     • Gloria Re Calegari, Andrea Fiano and Irene Celino: A Framework to build Games with a Purpose for Linked Data Refinement, in proceedings of ISWC 2018, LNCS Volume 11137, pp. 154-169.
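The exact score-update rule is not spelled out on this slide, so the following is only a minimal sketch of the threshold-based aggregation described above; the `Contribution` record, the +1-per-agreeing-choice update and the default threshold of 4 players (the design minimum mentioned later in the talk) are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Contribution:
    """One player's choice for one picture (hypothetical record layout)."""
    picture_id: str
    category: str


def aggregate(contributions, threshold=4):
    """Threshold-based aggregation: the score sigma of each picture-category
    pair grows with agreeing player choices; once sigma >= threshold the
    association is accepted as "true" and the picture leaves the game.
    The +1-per-choice update is an assumption, not the authors' exact rule."""
    sigma = defaultdict(int)   # (picture, category) -> score
    solved = {}                # picture -> accepted category
    for c in contributions:
        if c.picture_id in solved:          # already removed from the game
            continue
        sigma[(c.picture_id, c.category)] += 1
        if sigma[(c.picture_id, c.category)] >= threshold:
            solved[c.picture_id] = c.category
    return solved


# Example: four players agree that picture "iss_001" shows a city
plays = [Contribution("iss_001", "city")] * 4 + [Contribution("iss_002", "stars")]
print(aggregate(plays))   # {'iss_001': 'city'}
```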
  5. NIGHT KNIGHTS: DATA AND EVALUATION
     • Reference observation period: 9 months (February-October 2017)
     • 1 month of competition with a tangible reward (joining the 2017 Summer Expedition to observe the Solar Eclipse in the USA) in June-July 2017
     • 4 months from the game launch to the competition start + 4 months after the competition
     • Data available at https://github.com/STARS4ALL/Night-Knights-dataset
       • ~650 players and ~28,000 classified pictures
       • Released under a Creative Commons 4.0 license
     • Investigation to analyse participation and find profile patterns
       • Standard GWAP metrics
       • Citizen Science metrics
       • Influence of different factors, including incentives, playing style, task difficulty, …
  6. [Q1] HOW DO PARTICIPATION AND RESULTS CHANGE WITH INCENTIVES? [Q2] DO THE EXTRINSIC REWARD EFFECTS LAST OVER TIME?
     [Q1]
     • A tangible reward has a clear effect on participation
     • There is a statistically significant difference between competition and non-competition periods in all evaluation metrics (throughput, average life play, expected contribution; see the sketch after this slide)
     [Q2]
     • The incentive effect doesn’t seem to last: there is no statistically significant difference between the pre-competition and the post-competition periods
     • The overlaps between the sets of players in the different periods are very limited (<10%)
     • Summary per period (Before / During / After):
       • Time span (months): 4 / 1 / 4
       • Classified images: 1,830 / 24,600 / 1,300
       • Contributions: 13,000 / 187,600 / 3,600
       • Users: 285 / 174 / 174
       • Total play time (hours): 65 / 471 / 29
       • Throughput (tasks/hour): 69 / 212 / 113
       • ALP (mins/player): 5.5 / 65 / 4
       • EC (tasks/user): 6.4 / 141 / 7.5
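The three GWAP metrics in the summary above can be computed from simple per-round play logs. The sketch below uses assumed definitions (throughput = tasks per hour of play, ALP = total play time per player, EC = tasks per player) and a hypothetical `Round` record layout; it is not the paper's evaluation code.

```python
from dataclasses import dataclass


@dataclass
class Round:
    """One game round by one player (hypothetical record layout)."""
    player: str
    tasks: int        # contributions given in the round
    minutes: float    # duration of the round


def gwap_metrics(rounds):
    """Throughput (tasks/hour of play), ALP (average life play, minutes per
    player) and EC (expected contribution, tasks per player) for one period,
    under the assumed textbook definitions."""
    players = {r.player for r in rounds}
    total_tasks = sum(r.tasks for r in rounds)
    total_minutes = sum(r.minutes for r in rounds)
    throughput = total_tasks / (total_minutes / 60)
    alp = total_minutes / len(players)
    ec = total_tasks / len(players)
    return throughput, alp, ec


# Toy usage: two players, three one-minute rounds
rounds = [Round("anna", 15, 1.0), Round("anna", 12, 1.0), Round("bob", 8, 1.0)]
print(gwap_metrics(rounds))   # (700.0, 1.5, 17.5)
```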
  7. [Q3] DOES PLAYING STYLE CHANGE WITH THE INCENTIVE?
     • Contribution speed = number of images played in each game round (sketched in code below)
     • Estimation: 3-5 seconds/photo, 1-minute rounds → ~15 images/round
     • During the competition (extrinsic motivation):
       • Normal distribution centred around 15 pictures/round
       • Players tried to classify as many pictures as possible
     • Before and after the competition (intrinsic motivation):
       • Almost flat distribution with median < 10 images/round
       • Players adopted a more “relaxed” playing style
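A minimal sketch of how the contribution-speed comparison above could be reproduced: group rounds by period and compare the median number of images per round. The `(period, images)` pair layout and the toy numbers are assumptions for illustration.

```python
from statistics import median


def speed_by_period(rounds):
    """Median contribution speed (images per round) per period.
    `rounds` is a list of (period, images_in_round) pairs, a hypothetical
    layout used only to illustrate the comparison on the slide."""
    per_period = {}
    for period, images in rounds:
        per_period.setdefault(period, []).append(images)
    return {p: median(v) for p, v in per_period.items()}


# Toy data mimicking the reported pattern: ~15 images/round during the
# competition, noticeably fewer before and after it
rounds = [("during", n) for n in (14, 15, 16, 15)] \
       + [("before", n) for n in (3, 8, 12)] \
       + [("after", n) for n in (2, 9, 7)]
print(speed_by_period(rounds))   # {'during': 15.0, 'before': 8, 'after': 7}
```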
  8. [Q4] HOW DO GWAPS COMPARE TO TRADITIONAL CITIZEN SCIENCE? [Q5] WHAT DOES PLAYER BEHAVIOUR TELL ABOUT THE GAME NATURE?
     [Q4]
     • Engagement metrics from the Citizen Science literature: activity ratio (AR, % of active days), daily devoted time (DDT, in hours), relative active duration (RAD, wrt the reference period), variation in periodicity (VIP, std of intervals between active days); see the sketch after this slide
     • Players show very different behaviour:
       • 2-3 times higher AR, consistently higher DDT and RAD
       • Significantly lower VIP
     • Clustering leads to a 90% group of hardworkers (high AR and low VIP); the other Citizen Science behaviours were not observed
     [Q5]
     • Casual game, judging from total active time (last – first round):
       • 75% of players played for less than 5 minutes
       • 10% of players played for more than 1 day
     • Comparison with Citizen Science campaigns (NK global / NK compet. / MW* / GZ* / WI**):
       • AR: 0.96 / 0.95 / 0.40 / 0.33 / 0.32
       • DDT: 0.68 / 1.80 / 0.44 / 0.32 / -
       • RAD: - / 0.54 / 0.20 / 0.23 / 0.43
       • VIP: 14.53 / 2.53 / 18.27 / 25.23 / 5.11
     • Citizen Science campaigns from reference literature:
       * Ponciano, L., Brasileiro, F.: Finding volunteers’ engagement profiles in human computation for citizen science projects. Human Computation Journal, 2015
       ** Aristeidou, M., Scanlon, E., Sharples, M.: Profiles of engagement in online communities of citizen science participation. Computers in Human Behavior, 2017
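A rough sketch of the four engagement metrics for a single player follows, under assumed definitions (the paper's exact formulas may differ, e.g. in how the reference period is bounded); the function name and input layout are hypothetical.

```python
from datetime import date
from statistics import pstdev


def engagement_metrics(active_days, hours_per_day, period_start, period_end):
    """Citizen Science engagement metrics for one player, assumed definitions:
    AR  = active days / days spanned between first and last activity
    DDT = mean hours played per active day
    RAD = activity span / length of the reference period
    VIP = standard deviation of the gaps (in days) between active days"""
    days = sorted(active_days)
    span = (days[-1] - days[0]).days + 1
    ar = len(days) / span
    ddt = sum(hours_per_day) / len(days)
    rad = span / ((period_end - period_start).days + 1)
    gaps = [(b - a).days for a, b in zip(days, days[1:])]
    vip = pstdev(gaps) if gaps else 0.0
    return ar, ddt, rad, vip


# Toy player active on three days of a one-month reference period
days = [date(2017, 6, 1), date(2017, 6, 2), date(2017, 6, 5)]
print(engagement_metrics(days, [0.5, 1.0, 0.2], date(2017, 6, 1), date(2017, 6, 30)))
```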
  9. [Q6] WHAT KIND OF GWAP PLAYER PROFILES CAN BE IDENTIFIED?
     • Player accuracy = how many tasks each player correctly solved over the total number of tasks they played (correct wrt the aggregated solution)
     • Player participation = total number of contributions given by the player
     • Threshold on the accuracy axis → accurate / inaccurate player distinction
     • Threshold on the participation axis → casual / frequent player distinction
     • Four different player profiles (a quadrant-rule sketch follows this slide):
       • Beginners (low participation, low accuracy)
       • Snipers (low participation, high accuracy)
       • Champions (high participation, high accuracy)
       • Trolls (high participation, low accuracy)
     • Distribution of contributions across profiles: Beginners 0.7%, Snipers 0.4%, Champions 95.9%, Trolls 3.0%
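The two-threshold profiling described above amounts to a simple quadrant rule; the threshold values used here are placeholders, since the paper's actual cut-offs are not restated on this slide.

```python
def player_profile(accuracy, participation,
                   accuracy_threshold=0.8, participation_threshold=50):
    """Assign one of the four profiles from the slide by thresholding the two
    axes (accuracy and participation). Threshold values are placeholders."""
    frequent = participation >= participation_threshold
    accurate = accuracy >= accuracy_threshold
    if frequent and accurate:
        return "champion"
    if frequent:
        return "troll"
    if accurate:
        return "sniper"
    return "beginner"


print(player_profile(accuracy=0.92, participation=300))   # champion
print(player_profile(accuracy=0.55, participation=12))    # beginner
```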
  10. [Q7] DOES PLAYER BEHAVIOUR CHANGE WITH DIFFERENT INCENTIVES?
     • During the competition (extrinsic motivation period):
       • Majority of champions (high participation, high accuracy) → maybe a learning effect?
       • Higher average accuracy (statistically significant difference; see the example test after this slide) for both casual and frequent players (7% improvement in both cases) → higher attention brings higher quality
     • Before/after the competition (intrinsic motivation period):
       • (Relative) majority of beginners (low participation, low accuracy) → maybe due to curiosity or a “first try”
       • Higher variability of accuracy values (height of boxplots)
     • In all periods: limited number of trolls, and always a majority of accurate players (snipers + champions, 64%)
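The slide reports statistically significant accuracy differences but does not name the test used; a non-parametric Mann-Whitney U test on per-player accuracies, sketched below with toy numbers, is one plausible way to run such a comparison.

```python
from scipy.stats import mannwhitneyu

# Toy per-player accuracy samples for the two motivation periods
accuracy_competition = [0.91, 0.88, 0.95, 0.90, 0.87, 0.93]
accuracy_other_periods = [0.80, 0.85, 0.78, 0.92, 0.70, 0.84]

stat, p_value = mannwhitneyu(accuracy_competition, accuracy_other_periods,
                             alternative="two-sided")
print(f"U={stat:.1f}, p={p_value:.3f}")   # a small p-value indicates a significant difference
```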
  11. [Q8] DOES PLAYER BEHAVIOUR CHANGE WITH TASK DIFFICULTY? [Q9] DOES PLAYER BEHAVIOUR CHANGE WITH TASK VARIETY?
     • Task difficulty = number of different users needed to solve the task, i.e. to reach an agreement by aggregating user contributions (see the sketch after this slide):
       • Easy tasks: 4 users (minimum by design), 58% of all tasks
       • Difficult tasks: 5 to 17 users
     • Accuracy variability with task difficulty:
       • No difference between casual and frequent players on easy tasks
       • Statistically significant difference between casual and frequent players on difficult tasks → learning effect (the more they play, the higher the accuracy)
     • Accuracy variability with task variety (different classes):
       • Some classes are indeed “more difficult” than others
       • No difference between casual and frequent players across classes → indeed anybody can be a classifier (no expert knowledge required)
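A small sketch of the task-difficulty measure defined above: count the distinct users whose contributions a task needed, then split tasks at the 4-user design minimum. The input layout (task id mapped to the list of contributing user ids up to the moment of agreement) is hypothetical.

```python
def task_difficulty(task_contributions):
    """Task difficulty: number of distinct users that contributed to a task
    before it was solved (hypothetical input layout)."""
    return {task: len(set(users)) for task, users in task_contributions.items()}


def split_by_difficulty(difficulty, easy_max=4):
    """Easy tasks were solved with the design minimum of 4 users; difficult
    tasks needed 5 or more, as reported on the slide."""
    easy = [t for t, d in difficulty.items() if d <= easy_max]
    difficult = [t for t, d in difficulty.items() if d > easy_max]
    return easy, difficult


contribs = {"img_1": ["u1", "u2", "u3", "u4"],
            "img_2": ["u1", "u2", "u3", "u4", "u5", "u6", "u7"]}
difficulty = task_difficulty(contribs)
print(difficulty)                       # {'img_1': 4, 'img_2': 7}
print(split_by_difficulty(difficulty))  # (['img_1'], ['img_2'])
```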
  12. CONCLUSIONS
     • GWAPs are an effective “human in the loop” method to engage a target community in a process of knowledge management (e.g. to collect a large enough training set for machine learning)
       • Still, they remain among the less explored and less evaluated Human Computation approaches
     • Investigation of the interplay of different factors in GWAP evaluation: game incentives, player participation profiles, task difficulty, …
     • A framework to analyse a GWAP and assess how effectively your target community is involved in knowledge acquisition and management:
       • Quantitative results are specific to the analysed game, but the approach is completely replicable
       • A method to identify the strengths and weaknesses of a GWAP and to plan improvements
  13. Interplay of Game Incentives, Player Profiles and Task Difficulty in Games with a Purpose – Gloria Re Calegari and Irene Celino
     Contact: Irene Celino, Head of Knowledge Technologies Group, Cefriel - Politecnico di Milano, irene.celino@cefriel.com, iricelino.org
     Cefriel.com – MILANO: viale Sarca 226, 20126, Milano, Italy | LONDON: 4th floor, 57 Rathbone Place, London W1T 1JU, UK | NEW YORK: One Liberty Plaza, 165 Broadway, 23rd Floor, New York City, New York, 10006, USA
     This work was partially supported by the STARS4ALL project (H2020-688135) co-funded by the European Commission
     Icons made by Eucalyp from www.flaticon.com
