Ei09 Thousands Observers
1. Thousands of Online Observers is Just the Beginning
Nathan Moroney, HP Labs
Human Vision and Electronic Imaging XIV
Session 2: Social Software, Internet Experiments and New Paradigms for the Web
Monday, January 19, 2009, 1:00-1:30 PM
2. Outline
• Brief History of Crowd-Sourcing
• Online Experiments
− Unconstrained color naming
− Color name comparison
− Color difference description
− Image quality description
− World Wide Gamma
• Online Tools
− Color Thesaurus, Color Zeitgeist & Italian Color Thesaurus
• Eight considerations
1/27/2009 2
3. Brief History of Crowdsourcing: Part 1
“Since the beginning, it was just the same. The only difference, the crowds are bigger now.”
Elvis
4. Brief History of Crowdsourcing: Part 2
“The future belongs to crowds.”
Don DeLillo, Mao II
(Left as an exercise for the audience to do an Elvis-DeLillo mash-up)
5. Online Experiments
• Basic pieces
− Experimental design – unconstrained text
− Software, a server – JavaScript
− Communication network – World Wide Web
− Participants – volunteers
• Results
− Direct Data
− Usage Data
• Optional but useful – lab data for validation
6. Unconstrained Color Naming
• Seven colored patches
• Randomly selected
− 6x6x6 RGB sampling
• Text field for names
• Provide the “best” name
• Optional comments
• Started in 2002
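The patch selection above can be sketched as follows. This is a minimal illustration, not the experiment's actual code: the evenly spaced channel levels (0, 51, ..., 255), drawing without replacement, and the helper names `draw_patches` and `to_hex` are all assumptions.

```python
import itertools
import random

# Assumed lattice: six evenly spaced levels per channel (0, 51, ..., 255).
LEVELS = [0, 51, 102, 153, 204, 255]
LATTICE = list(itertools.product(LEVELS, repeat=3))  # 6 * 6 * 6 = 216 colors


def draw_patches(count=7, rng=None):
    """Return `count` distinct RGB triples drawn at random from the lattice."""
    rng = rng or random.Random()
    return rng.sample(LATTICE, count)


def to_hex(rgb):
    """Format an (r, g, b) triple as a CSS-style hex string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Seeded only so the illustration is repeatable.
patches = draw_patches(rng=random.Random(2002))
print([to_hex(p) for p in patches])
```

Each participant would see a fresh draw of seven patches, so coverage of the 216 lattice colors grows with the number of participants.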
7. On-Line vs. Berlin & Kay
[Figure: CIECAM02 hue angle, Berlin & Kay (vertical axis) vs. On-Line Web (horizontal axis), both axes 0 to 360. Linear fit: y = 0.9971x + 28.986, R² = 0.9859]
8. Color Name Comparison
• Text only
• Eleven color names
• Non-repeating random walk
• Eleven triads
− Which color is least like the other two?
• Collect additional demographic data
10. Color Difference Description
• Five pairs of colored patches.
• Best describe the difference
• Text field per pair
− Unconstrained description
• Randomly sample RGB cube
− Constrained RGB offsets
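One way to read "randomly sample RGB cube, constrained RGB offsets" is sketched below. The per-channel bound `MAX_OFFSET`, the clamping to 0..255, and the re-draw of identical pairs are assumptions made for illustration; the slides do not state the actual offset rule.

```python
import random

MAX_OFFSET = 40  # hypothetical bound on the per-channel offset


def clamp(value, low=0, high=255):
    """Keep a channel value inside the displayable 8-bit range."""
    return max(low, min(high, value))


def draw_pair(rng, max_offset=MAX_OFFSET):
    """Return (base, shifted): a random RGB color and a bounded offset of it."""
    base = tuple(rng.randrange(256) for _ in range(3))
    while True:
        shifted = tuple(
            clamp(c + rng.randint(-max_offset, max_offset)) for c in base
        )
        if shifted != base:  # re-draw the rare pair with no visible difference
            return base, shifted


rng = random.Random(10)  # seeded only so the illustration is repeatable
pairs = [draw_pair(rng) for _ in range(5)]  # five pairs per participant
for base, shifted in pairs:
    print(base, shifted)
```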
11. Frequencies of Words
• ‘More’ is six times as frequent as ‘less’
• ‘Darker’ is twice as frequent as ‘lighter’,
− same for ‘dark’ and ‘light’
• Lime and magenta are not in the top 100 terms –
− But they are in the top 10 of unconstrained naming.
0.048 right
0.045 more
0.031 left
0.028 one
0.018 color
0.017 green
0.017 darker
0.015 blue
0.012 than
0.012 saturated
0.011 patch
0.011 first
0.010 purple
0.009 lighter
0.009 second
0.008 dark
0.007 less
0.007 brown
0.007 red
0.006 different
0.006 yellow
0.006 difference
0.006 brighter
0.006 hue
0.005 pink
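Proportions like these can be obtained by tokenizing the free-text descriptions and dividing each word count by the total number of tokens. The sketch below assumes a simple lowercase word tokenizer and uses made-up responses; `token_proportions` is a hypothetical helper, not the analysis code behind the slide. The last two lines only redo the slide's own arithmetic.

```python
import re
from collections import Counter


def token_proportions(descriptions):
    """Map each lowercase word to its share of all word tokens."""
    tokens = []
    for text in descriptions:
        tokens.extend(re.findall(r"[a-z']+", text.lower()))
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


# Hypothetical responses, for illustration only (not data from the experiment).
sample = [
    "the right one is darker",
    "left is more saturated",
    "right patch is more green",
]
props = token_proportions(sample)
print(round(props["right"], 3), round(props["more"], 3))

# Ratios implied by the proportions listed on the slide.
print(round(0.045 / 0.007, 1))  # 'more' vs. 'less': 6.4
print(round(0.017 / 0.009, 1))  # 'darker' vs. 'lighter': 1.9
```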
12. Image Quality Description
• Overall and specific description of image quality
• Demographic questions
Proportion vs. Token
0.089 the
0.033 of
0.032 is
0.031 and
0.021 color(s)
0.017 to
0.016 good
0.014 on
0.014 a
0.013 in
13. Opt-In Demographics: n=338
[Figure: four pie charts of opt-in demographics]
Gender: Male, Female (44%, 56%)
English Proficiency: Non-Native, Native (35%, 65%)
Age (years): > 60 (1%); 40-60, < 20 (17%, 23%); 20-40 (59%)
Color Vision (self-described): Normal (89%); Don’t Know (9%); Maybe Color Blind (1%); Definitely Color Blind (1%)
14. World Wide Gamma
• Lightness partitioning task, benchmark to a nominal display and existing lightness scales, such as L*.
[Figure: Before and After panels]
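For reference, the L* benchmark mentioned above can be computed from gray levels as sketched below. The CIE 1976 L* formula is standard; the display model is an assumption (a pure power law with gamma 2.2, since the slide does not state the nominal display's gamma), and `display_luminance` is a hypothetical helper.

```python
def display_luminance(value, gamma=2.2):
    """Relative luminance (0..1) of an 8-bit gray level on a power-law display.

    The gamma of 2.2 is an assumed nominal value, not taken from the slides.
    """
    return (value / 255.0) ** gamma


def cie_lightness(y):
    """CIE 1976 L* for relative luminance y, with the white point at y = 1."""
    delta = 6.0 / 29.0
    if y > delta ** 3:
        f = y ** (1.0 / 3.0)
    else:
        f = y / (3.0 * delta ** 2) + 4.0 / 29.0
    return 116.0 * f - 16.0


for level in (0, 64, 128, 192, 255):
    print(level, round(cie_lightness(display_luminance(level)), 1))
```

Comparing where observers place equal lightness steps against this curve is one way to estimate how far a participant's display departs from the nominal one.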
15. World Wide Gamma
• Red is >600 participants
• Black is current results
• Specific experimental feedback
• Offset for darkest levels but quite linear
16. Online Color Thesaurus
• Interface to the underlying database of color names
• Largest number of users
17. Color Zeitgeist
• Usage data – tool use creates data, which in turn creates another tool
18. Italian Color Thesaurus
• Italian data < English data
• Adaptive tools
− Qualification through ratings
− Quantity through instance-based harvesting, collect new data only for missing colors
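"Collect new data only for missing colors" could look like the sketch below: count the names gathered so far per color and ask only about colors under a threshold. The 6x6x6 lattice, the `min_names` threshold, and the sample Italian names are assumptions for illustration, not details from the Italian Color Thesaurus.

```python
import itertools
from collections import Counter

LEVELS = [0, 51, 102, 153, 204, 255]  # assumed 6x6x6 lattice, as a stand-in


def missing_colors(named_samples, min_names=1):
    """Return lattice colors that have fewer than `min_names` collected names."""
    counts = Counter(rgb for rgb, _name in named_samples)
    return [
        rgb
        for rgb in itertools.product(LEVELS, repeat=3)
        if counts[rgb] < min_names
    ]


# Hypothetical collected data: (rgb, name) pairs.
collected = [((255, 0, 0), "rosso"), ((0, 0, 255), "blu"), ((255, 0, 0), "rosso")]
todo = missing_colors(collected)
print(len(todo))  # 214 of the 216 lattice colors still need a name
```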
19. Consideration 1: Scale
• Yes, online experiments mean bigger crowds
− Larger & more diverse pool of possible participants
− Logarithmic scale of participation
[Figure: logarithmic participation scale: 1, 10, 100, 1K, 10K, 100K, 1M, 10M, 100M. Population labels: HP Department, HP Labs, Stanford (under), Palo Alto, HP, San Jose, California. Tool labels: Lab Prototypes & Experiments, Web-based Color naming experiment, English Color Thesaurus, Application Based Color Picker, OS Based Color Picker]
20. Observers per Experiment by Year
[Figure: Log of the Number Observers (1 to 10000) vs. Experiment by Year (1990 to 2010). Annotation: "These should also have error bars and connecting lines…"]
21. Consideration 2: Distributed Design
• Minimize the effort from any single participant
− Increase volunteer participation rate?
− Minimize impact of any single, systematically disruptive participant
• A ‘knob’ that can be used to dial the target “time to completion” for any given web participant
• Applicable to even relatively complex tasks
− Triadic comparison
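The distributed design for triadic comparison can be sketched as follows: enumerate every triad once, then cut the list into small per-participant batches. The name list (Berlin & Kay's eleven basic color terms), the shuffle, and the `triad_batches` helper are assumptions; `batch_size` plays the role of the 'knob' for time to completion.

```python
import itertools
import random

# Assumed stimulus set: Berlin & Kay's eleven basic color terms.
NAMES = [
    "black", "white", "red", "green", "yellow", "blue",
    "brown", "purple", "pink", "orange", "gray",
]


def triad_batches(names, batch_size=11, seed=0):
    """Shuffle every 3-name combination and cut the list into batches."""
    triads = list(itertools.combinations(names, 3))
    random.Random(seed).shuffle(triads)
    return [triads[i:i + batch_size] for i in range(0, len(triads), batch_size)]


batches = triad_batches(NAMES)
print(len(batches), len(batches[0]))  # 15 batches of 11 triads each
```

With eleven names there are 165 distinct triads, so fifteen participants answering eleven triads each cover the full design once.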
22. Consideration 3: Ambiguity
• Lack of constraints is a trade-off
− May make the task more difficult for observers
− May enable a different set of questions
− General bias is towards unconstrained tasks
− Implicitly include real world variability
• Sources of variability are vast; robustness comes from scale and a focus on categories, not thresholds
“wasn’t sure whether you wanted accurate or poetic names.”
Anonymous Comment, June 8, 2002
23. Consideration 4: Hypotheses vs Training
• Thresholds versus Categories
• Individual performance versus collective capability
• Numbers versus Words
Pixel by pixel machine color naming: see ‘Lexical Image Processing’, CIC 16
24. Consideration 5: Simplicity
• In both tasks and tools
• The simpler the task – likely the less confusion over
instructions, higher the volunteer participation rate
• The simpler the tools – lowest common denominator
infrastructure, minimum number of versions over the
years, likely widest audience
25. Consideration 6: Global & Open-Ended
• Global scale for participation
• Effort is front loaded - once uploaded no real penalty to indefinite data collection
• Data ‘evolves’ as it changes scale
• Especially true for
− inter-related experiments,
− variations in experimental designs and
− results that are in pursuit of an aggregate property
− results that change over time
[Figure: Log of the Number Observers (1 to 10000) vs. Experiment by Year (1990 to 2010)]
26. Consideration 7: Usage as Data
• Any online interaction creates data
• The boundary between experiments and tools is potentially fuzzy
• Useful experiments can be formatted as a useful tool, and the more useful the tool the greater the potential data.
• An important implication and possible advantage is that a tool defines context for the task; the pragmatics are inherent.
27. Consideration 8: Mutual Bootstrapping
• Mutual bootstrapping – machine learning applied to training data gathered online, which in turn creates processed data which can enable human learning.
• Social data can be educational.
[Figure: color example labeled "Chartreuse"]
• Revisiting approaches to laboratory experiments – if the goals are simplicity, categorization, ambiguity, larger scale and so on, how are the designs different?
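As a toy illustration of machine learning applied to name data gathered online, the sketch below averages the RGB values collected for each name and labels a new color with the nearest average. Nearest-centroid classification in RGB is a stand-in chosen for brevity, not the method used in the work described here, and the training pairs are made up.

```python
from collections import defaultdict


def train_centroids(named_samples):
    """Average the RGB values collected for each color name."""
    sums = defaultdict(lambda: [0, 0, 0, 0])  # r, g, b, count
    for (r, g, b), name in named_samples:
        acc = sums[name]
        acc[0] += r
        acc[1] += g
        acc[2] += b
        acc[3] += 1
    return {
        name: (acc[0] / acc[3], acc[1] / acc[3], acc[2] / acc[3])
        for name, acc in sums.items()
    }


def name_color(rgb, centroids):
    """Return the name whose centroid is closest to `rgb` (squared distance)."""
    return min(
        centroids,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(rgb, centroids[name])),
    )


# Hypothetical training pairs, for illustration only.
samples = [
    ((255, 0, 0), "red"), ((204, 0, 0), "red"),
    ((0, 0, 255), "blue"), ((0, 51, 204), "blue"),
    ((127, 255, 0), "chartreuse"), ((153, 255, 51), "chartreuse"),
]
centroids = train_centroids(samples)
print(name_color((140, 250, 20), centroids))  # chartreuse
```

The processed output (a name for any color) is what could then be shown back to people, for example to teach what "chartreuse" looks like.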
28. Questions?
Elvis’s favorite color?
That would be blue.