This document summarizes a tutorial on designing effective incentive systems through experiments and mechanism design. It covers analyzing the domain and user preferences, formalizing the existing reward system, deriving simple testable hypotheses, and fine-tuning the system through lab and field experiments. In an experiment with students tagging images, a "winner takes all" incentive (a single cash prize) produced about 32% more tags per person at a lower cost than per-tag compensation, with comparable quality. Next steps involve making the task more realistic and useful for participants to further refine the incentive system.
1. Hands-on experiences with incentives and mechanism design
Roberta Cuel, University of Trento, IT; Markus Rohde, University of Siegen, DE; Germán Toro del Valle, Telefonica I+D, ES
ISWC 2010
www.insemtives.eu
2. How to design effective incentives/rules
1. Analyze the domain
• What?
• Working environment
• Job descriptions
• Organization (tasks, hierarchy, compensation, social relations, communication)
• How?
• Qualitative face-to-face interviews and questionnaires
• Observations with selected individuals
• Quantitative analysis (data collection)
3. How to design effective incentives/rules (2)
2. Identify the preferences and motivations that drive users
Concentrate on the every-day activities of those specific users
3. Formalize the existing reward system
Find yourself in the matrix
4. Design the simplest possible solution that can effectively support those uses
Translate it into a small number of alternative testable hypotheses
5. Fine-tune the rewarding system
4. Fine-tuning incentives with mechanism design: a step-by-step procedure
• Mimic the situation in the lab
• Set up the experiment as close to the real-life situation as possible
• Run the experiment with volunteer subjects, randomly allocated to treatments
• Test alternative hypotheses regarding the effect of incentive schemes on behavior
• Check for differences in outcomes
If happy, go to the next slide; otherwise, re-design the hypotheses and run a new trial.
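The random-allocation step above can be sketched in code; the subject identifiers, treatment names, and fixed seed here are illustrative, not from the slides.

```python
import random

def assign_treatments(subjects, treatments, seed=42):
    """Randomly split subjects evenly across treatments (round-robin after shuffle)."""
    rng = random.Random(seed)  # fixed seed so the allocation is reproducible
    shuffled = subjects[:]
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, s in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(s)
    return groups

# Illustrative: 36 subjects and two treatments, as in the experiment later on.
groups = assign_treatments([f"S{i:02d}" for i in range(36)],
                           ["pay-per-tag", "winner-takes-all"])
```

A round-robin split yields equal groups (18/18); the actual session ended up 19/17, which randomization without balancing can also produce.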
5. Fine-tuning, part II
• Start adding realism components:
– Move to real subjects (field test)
– Move to real tasks (with real subjects)
– Move to real subjects handling real tasks
– Move to a real situation (field experiment)
• During the process you:
– Lose control over the ability to manipulate variables
– Gain awareness of the interactions between variables
• Let’s look at what we are doing with case studies!
6. Telefonica I+D case study
• Corporate portal
• What is the most obvious incentive from an economic point of view?
• What can we do with the small budget dedicated to incentivizing users?
• How do we know which system is best for our setting?
7. Basic experiment
• Test two reward/incentive systems
• Pay per tag:
– €0.03 per tag added (up to a €3 maximum)
• Winner-takes-all model:
– The person who adds the highest number of tags/annotations wins €20
What would you choose?
(Participation fee: €5)
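The two schemes can be written as payoff functions; this is a minimal sketch of the rules as stated on the slide (a €0.03 per-tag rate capped at €3, versus €20 to the single top tagger), ignoring the fixed €5 participation fee. The tag counts are illustrative.

```python
def pay_per_tag(tags, rate=0.03, cap=3.0):
    """Per-tag payment, capped at a fixed maximum."""
    return min(tags * rate, cap)

def winner_takes_all(tags, all_tags, prize=20.0):
    """The highest tagger takes the whole prize; everyone else gets 0."""
    return prize if tags == max(all_tags) else 0.0

counts = [78, 47, 30]  # illustrative tag counts for three subjects
payouts_ppt = [pay_per_tag(c) for c in counts]
payouts_wta = [winner_takes_all(c, counts) for c in counts]
```

Note how the pay-per-tag cap flattens incentives once a subject reaches 100 tags, while winner-takes-all rewards only relative performance.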
8. The experiment (setting)
• 36 students
– Random assignment to the two “treatments”
• Individual task: annotation of images
• A clear set of instructions
• A guided training session to give a basic understanding of the annotation tool
• An 8-minute timed session (time pressure)
• Goal: produce the maximum number of tags on a random set of images in the allotted time
16. Number of tags
Pay per tag (N = 19)
– Total tags: 901
– Max tags per person: 78
– Avg. tags per person: 47.42
– Avg. € per person: 6.66
– € per tag: 0.1404
– Total: €126.50 (of which €31.50 flexible compensation)
Winner-takes-all model (N = 17)
– Total tags: 1067
– Max tags per person: 96
– Avg. tags per person: 62.76 (a 32% increase!)
– Avg. € per person: 6.18
– € per tag: 0.0984
– Total: €105 (of which €20 flexible compensation)
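The per-person and per-tag figures above follow from the totals; a quick arithmetic check (the €126.50 and €105 totals include the €5 participation fee per subject):

```python
FEE = 5.0  # fixed participation fee per subject

def summarize(n, total_tags, flexible):
    """Derive the slide's per-person and per-tag figures from the raw totals."""
    total_cost = n * FEE + flexible
    return {
        "avg_tags": total_tags / n,
        "avg_eur_per_person": total_cost / n,
        "eur_per_tag": total_cost / total_tags,
    }

ppt = summarize(n=19, total_tags=901, flexible=31.5)   # pay per tag
wta = summarize(n=17, total_tags=1067, flexible=20.0)  # winner takes all
increase = wta["avg_tags"] / ppt["avg_tags"] - 1       # the ~32% on the slide
```

Winner-takes-all delivers more tags per person for a smaller total budget, which is the core result of the experiment.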
18. Tag distribution: interface matters!
Pay per tag
• Tag “nature” 24 times
• “snow” 22 times
• “green” 20 times
• …
• 134 tags repeated only 2 times
• 437 unique tags
Winner takes all
• Tag “green” 18 times
• “snow” 14 times
• “butterfly” 13 times
• …
• 118 tags repeated only 2 times
• 390 unique tags
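Frequency statistics like the ones above can be computed with a counter; the tag list here is illustrative, not the experimental data.

```python
from collections import Counter

tags = ["nature", "snow", "nature", "green", "snow", "butterfly", "sky"]  # illustrative
freq = Counter(tags)

top = freq.most_common(3)                          # most frequent tags first
singletons = sum(1 for c in freq.values() if c == 1)  # tags used exactly once
```

The same counting over each treatment's tag lists yields the "repeated" and "unique" figures reported on the slide.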
19. Some biases
• Students are
– Volunteers who are used to participating in experiments
– Heavy web users and game players
– Paid to show up
• Quality of the tags
– Tag quality has been controlled for: no obvious ‘mistakes’ or ‘cheating’
20. Summary of results &amp; next lab steps
• Basic hypothesis confirmed
• More work needed:
– Effort was directed at producing a good (tags) that subjects did not themselves consume (use to achieve other goals); change the structure of the game to let users exploit tagging to achieve results (a treasure hunt!)
– Re-run the experiment with the new structure: users now produce tags both to earn money and to use the tags to perform further tasks
21. Next steps: Telefonica I+D
• Replicate the experiment with real users
– Main change 1: the task becomes relevant in terms of practical usefulness for users
– Main change 2: the task has social implications
– Main change 3: expectations change dramatically (workers vs. students: €5 to participate???)
• Add realism
– Mimic the social structure of the company:
• Run the experiment with teammates
• Use real tasks
• Try alternative pay-for-performance schemes
Editor’s notes
What is the most obvious incentive for people from an economic point of view? – MONEY!
How much money does an average semantic-annotation project normally have to dedicate to incentivizing users? – Very little.
What can we do with a little budget dedicated to incentivizing users? – Give a little bit to every user, or keep the whole sum and award it to the most productive user, among other options.
How do we know which system is best for us? – We need to test it.
In the context of the Telefonica use case, a corporate environment, we test it with the help of an experiment.
We start with a laboratory experiment with university students.
Doing the experiment in an experimental laboratory requires us to provide incentives for students to participate. They are not friends doing us a favor by testing the software, or students in our course earning course credit – that is not what we want. We want subjects who are neutral to us and to the task and who will respond only to the incentive structure we provide.
There is a fee we need to pay the students no matter what, just to maintain the reputation of the laboratory and to make sure students keep coming to experiments organized by other researchers as well. You can think of environments where you can run the test without paying a participation fee, offering only the flexible part (Mechanical Turk?).
Don’t look at the €5; concentrate on the flexible part of the payment.
36 students – randomly assigned, no previous experience required, no knowledge of the tool.
We used the Telefonica annotation application to annotate images.
Notice, however, that for the first test of our incentive system we do not really need the real tool. We can also do it with some other task that can be perceived as similar in terms of effort.
For payment requirements we had to round rewards in the pay-per-tag treatment up to the nearest 50 cents (without rounding, the budget would have been €27).
With a budget of €20 we get a 32% increase in the average number of annotations per person compared to a €31.50 budget (don’t look at the totals! – the sessions had different numbers of subjects).
Notice that winner-takes-all is relatively scale-free.
We could have had 40–50 subjects in the session (we had only 17 due to constraints in the laboratory) and obtained similar results with the same budget, while in the pay-per-tag scenario, for that number of subjects, our budget would need to be at least €60–€75.
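The scale-free point can be made concrete. A sketch, assuming the observed flexible payout per subject in the pay-per-tag session (€31.50 over 19 subjects) carries over to larger groups:

```python
def flexible_budget(n_subjects, scheme):
    """Flexible (performance-contingent) budget needed for a session of n subjects."""
    if scheme == "winner-takes-all":
        return 20.0  # a single prize, independent of group size
    # pay-per-tag: scales with the group; ~€1.66/subject observed (31.5 / 19)
    return n_subjects * (31.5 / 19)

wta_50 = flexible_budget(50, "winner-takes-all")  # still €20
ppt_50 = flexible_budget(50, "pay-per-tag")       # roughly €80+
```

This matches the note above: winner-takes-all stays at €20 for 40–50 subjects, while pay-per-tag climbs into the €60–€85 range.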
Student’s t = 2.58, p-value = 0.0089, assuming unequal variances (Welch’s test).
F-test for the significance of the difference between the variances of the two samples: F = 2.34, p-value = 0.042.
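The Welch statistic can be reproduced from summary statistics alone. The means and sample sizes below are from the slides; the sample variances are illustrative values (not reported in the notes) chosen to be consistent with the reported F ≈ 2.34 and t ≈ 2.58.

```python
import math

def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Welch's t statistic for two samples with unequal variances."""
    se = math.sqrt(var1 / n1 + var2 / n2)
    return (mean2 - mean1) / se

# Means and Ns from the slides; variances are assumed for illustration.
t = welch_t(47.42, 186.0, 19, 62.76, 435.0, 17)
f = 435.0 / 186.0  # ratio of larger to smaller variance (the F statistic)
```

With these assumed variances the computed t lands near the reported 2.58, and the variance ratio near the reported 2.34.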
This experiment builds on Telefonica’s previous experience: failures.
The amount of money they gave to employees was ridiculous: €300 for blog entries?
More work: the task of the subjects in the experiment was just to annotate images. They didn’t care why, and they didn’t need the annotations for anything.
In the real Telefonica setting, annotations are needed by users for their own work and the work of their colleagues. We are in a public-good setting: when you upload and annotate a document (or finally find the document you were looking for), you need to spend time producing a good annotation – time you could spend doing the job you are actually paid for. People are happy if a document is annotated, because it saves them time when searching, but they don’t want to annotate it themselves, to save their own time.
The next experiment should incorporate these features of the problem: annotations should be useful to the user for performing the main task, where the main task is not annotation.
The corporate portal is a specific case. Here we are concerned not only with the fact that people need to produce annotations, but also with the fact that they shouldn’t dedicate too much time to annotating. The problem is to find the “right” set of incentives that makes users produce the “right” amount of annotations. Here mechanism design comes to our help: we model the problem and solve it to define the incentive structure. Then we test that structure in the laboratory with a task in which annotations help with the main task.