Truth is a Lie
CrowdTruth: The 7 Myths of Human Annotation
Lora Aroyo
Take Home Message
Human annotation of semantic interpretation tasks as critical part of cognitive systems engineering
– standard practice based on antiquated ideal of a single correct truth
– 7 myths of human annotation
– new theory of truth: CrowdTruth
I amar prestar aen...
• amount of data & scale of computation available have increased by a previously inconceivable amount
• CS & AI moved out of thought problems to empirical science
• current methods pre-date this fundamental shift
• the ideal of “one truth” is a lie
• crowdsourcing & semantics together correct the fallacy and improve analytic systems
The world has changed: there is a need to form a new theory of truth - appropriate to cognitive systems
Semantic Interpretation
Semantic interpretation is needed in all sciences
– Data abstracted into categories
– Patterns, correlations, associations & implications are extracted
Cognitive Computing: providing some way of scalable semantic interpretation
Traditional Human Annotation
• Humans analyze examples: annotations for ground truth = the correct output for each example
• Machines learn from the examples
• Ground Truth Quality:
– measured by inter-annotator agreement
– founded on ideal for single, universally constant truth
– high agreement = high quality
– disagreement must be eliminated
Current gold standard acquisition & quality evaluation are outdated
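The slide says ground truth quality is measured by inter-annotator agreement but does not name a metric. As one common choice (an assumption here, not something the deck specifies), Cohen's kappa for two annotators can be sketched as follows. The label lists are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same examples."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of examples with identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's own label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical labels for five sentences (1 = relation expressed).
annotator_a = [1, 1, 0, 1, 0]
annotator_b = [1, 0, 0, 1, 0]
kappa = cohen_kappa(annotator_a, annotator_b)
```

A single score like this summarizes a whole data set, which is exactly the practice the deck questions: it says nothing about which individual examples caused the disagreement.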
Need for Change
• Cognitive Computing increases the need for machines to handle the scale of data
• Results in increasing need for new gold standards able to measure machine performance on tasks that require semantic interpretation
The New Ground Truth is CrowdTruth
7 Myths
• One truth: data collection efforts assume one correct interpretation for every example
• All examples are created equal: ground truth treats all examples the same - either match the correct result or not
• Detailed guidelines help: if examples cause disagreement - add instructions to limit interpretations
• Disagreement is bad: increase quality of annotation data by reducing disagreement among the annotators
• One is enough: most of the annotated examples are evaluated by one person
• Experts are better: annotators with domain knowledge provide better annotations
• Once done, forever valid: annotations are not updated; new data not aligned with previous
Myths directly influence the practice of collecting human annotated data; need to be revisited in the context of new changing world & in the face of a new theory of truth (CrowdTruth)
1. One Truth
What if there are MORE?
current ground truth collection efforts assume one correct interpretation for every example
the ideal of truth is a fallacy for semantic interpretation and needs to be changed
Which is the mood most appropriate for each song?
Choose one:
Cluster 1: passionate, rousing, confident, boisterous, rowdy
Cluster 2: rollicking, cheerful, fun, sweet, amiable, good-natured
Cluster 3: literate, poignant, wistful, bittersweet, autumnal, brooding
Cluster 4: humorous, silly, campy, quirky, whimsical, witty, wry
Cluster 5: aggressive, fiery, tense, anxious, intense, volatile, visceral
Other: does not fit into any of the 5 clusters
one truth?
Results in:
(Lee and Hu 2012)
2. All Examples Are Created Equal
What if they are DIFFERENT?
• typically annotators are asked whether a binary property holds for each example
• often not given a chance to say that the property may partially hold, or holds but is not clearly expressed
• mathematics of using ground truth treats every example the same - either match correct result or not
• poor quality examples tend to generate high disagreement
disagreement allows us to weight sentences = the ability to train & evaluate a machine more flexibly
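One way to read "disagreement allows us to weight sentences" is to let each example count in proportion to how strongly annotators agreed on it. The sketch below is an illustration under that assumption, not a formula taken from the deck; the function name `weighted_accuracy` and all data are hypothetical.

```python
def weighted_accuracy(predictions, gold, weights):
    """Accuracy where each example counts in proportion to its weight."""
    if not (len(predictions) == len(gold) == len(weights)):
        raise ValueError("inputs must have equal length")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not all be zero")
    # Sum the weights of the examples the machine got right.
    hit = sum(w for p, g, w in zip(predictions, gold, weights) if p == g)
    return hit / total


# Hypothetical data: majority label per sentence, and the fraction of
# annotators who chose that label (used here as the sentence weight).
gold = [1, 1, 0, 1]
agreement = [0.95, 0.5, 0.9, 0.6]
predictions = [1, 0, 0, 0]

plain = weighted_accuracy(predictions, gold, [1, 1, 1, 1])
weighted = weighted_accuracy(predictions, gold, agreement)
```

With uniform weights every miss costs the same. With agreement weights, a miss on a sentence that annotators themselves split on costs less than a miss on a clear one.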
Is TREAT relation expressed between the highlighted terms?
ANTIBIOTICS are the first line treatment for indications of TYPHUS.
→ clearly treats
With ANTIBIOTICS in short supply, DDT was used during World War II to control the insect vectors of TYPHUS.
→ less clear treats
equal training data?
disagreement can indicate vagueness & ambiguity of sentences
3. Detailed Guidelines Help
What if they HURT?
• Perfuming agreement scores by forcing annotators to make choices they may think are not valid
• Low annotator agreement is addressed by detailed guidelines for annotators to consistently handle the cases that generate disagreement
• Remove potential signal on examples that are ambiguous
precise annotation guidelines do eliminate disagreement but do not increase quality
Which mood cluster is most appropriate for a song?
Instructions
Your task is to listen to the following 30 second music clips and select the most appropriate mood cluster that represents the mood of the music. Try to think about the mood carried by the music and please try to ignore any lyrics. If you feel the music does not fit into any of the 5 clusters please select “Other”. The descriptions of the clusters are provided in the panel at the top of the page for your reference. Answer the questions carefully. Your work will not be accepted if your answers are inconsistent and/or incomplete.
restricting guidelines help?
disagreement can indicate problems with the task
(Lee and Hu 2012)
4. Disagreement is Bad
What if it is GOOD?
• rather than accepting disagreement as a natural property of semantic interpretation
• traditionally, disagreement is considered a measure of poor quality because:
– task is poorly defined or
– annotators lack training
this makes the elimination of disagreement the GOAL
Does each sentence express the TREAT relation?
ANTIBIOTICS are the first line treatment for indications of TYPHUS.
→ agreement 95%
Patients with TYPHUS who were given ANTIBIOTICS exhibited side-effects.
→ agreement 80%
With ANTIBIOTICS in short supply, DDT was used during WWII to control the insect vectors of TYPHUS.
→ agreement 50%
disagreement bad?
disagreement can reflect the degree of clarity in a sentence
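The percentages above can be thought of as per-sentence scores derived from many workers' votes. The sketch below reads "agreement" as the share of workers who selected the TREAT relation, which is an assumption; the deck does not define the measure. The vote counts are hypothetical and were chosen only so that the shares reproduce the slide's 95%, 80% and 50%.

```python
from collections import Counter


def label_distribution(votes):
    """Fraction of workers choosing each label for one sentence."""
    if not votes:
        raise ValueError("need at least one vote")
    counts = Counter(votes)
    return {label: c / len(votes) for label, c in counts.items()}


# Hypothetical votes from 20 workers per sentence (not data from the deck).
votes_by_sentence = {
    "first-line treatment": ["treat"] * 19 + ["none"] * 1,
    "side-effects": ["treat"] * 16 + ["none"] * 4,
    "DDT in WWII": ["treat"] * 10 + ["none"] * 10,
}

# Share of "treat" votes per sentence, kept as a graded score
# instead of being collapsed into a single yes/no label.
treat_share = {
    sentence: label_distribution(votes).get("treat", 0.0)
    for sentence, votes in votes_by_sentence.items()
}
```

Keeping the full distribution rather than the majority label preserves the information that the third sentence is much less clear than the first.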
5. One is Enough
What if it is NOT ENOUGH?
• over 90% of annotated examples - seen by 1-2 annotators
• small number overlap - to measure agreement
five or six popular interpretations can’t be captured by one or two people
One Quality?
accumulated results for each relation across all the sentences
20 workers/sentence (and higher) yields same relative disagreement
6. Experts Are Better
What if the CROWD IS BETTER?
• conventional wisdom: human annotators with domain knowledge provide better annotated data, e.g.
– medical texts should be annotated by medical experts
• but experts are expensive & don’t scale
multiple perspectives on data can be useful, beyond what experts believe is salient or correct
What is the (medical) relation between the highlighted (medical) terms?
• 91% of expert annotations covered by the crowd
• expert annotators reach agreement only in 30%
• most popular crowd vote covers 95% of this expert annotation agreement
experts better than crowd?
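A coverage figure such as "91% of expert annotations covered by the crowd" could be computed along the following lines. This is a sketch under the assumption that coverage means the share of expert (sentence, relation) pairs that at least one crowd worker also selected; the deck does not give the exact definition, and the annotations below are hypothetical.

```python
def coverage(expert, crowd):
    """Share of expert (sentence, relation) pairs that the crowd also chose."""
    expert_pairs = {(s, r) for s, rels in expert.items() for r in rels}
    if not expert_pairs:
        raise ValueError("no expert annotations")
    crowd_pairs = {(s, r) for s, rels in crowd.items() for r in rels}
    return len(expert_pairs & crowd_pairs) / len(expert_pairs)


# Hypothetical annotations: sentence id -> set of relations selected.
expert = {
    "s1": {"treats"},
    "s2": {"causes"},
    "s3": {"prevents", "treats"},
}
crowd = {
    "s1": {"treats", "prevents"},
    "s2": {"causes", "symptom"},
    "s3": {"treats"},
}
share = coverage(expert, crowd)
```

Here three of the four expert pairs also appear in the crowd's answers, and the crowd additionally contributes relations the experts did not select.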
7. Once Done, Forever Valid
What if VALIDITY CHANGES?
• perspectives change over time - old training data might contain examples that are not valid or only partially valid later
• continuous collection of training data over time allows the adaptation of gold standards to changing times
– popularity of music
– levels of education
Which are mentions of terrorists in this sentence?
OSAMA BIN LADEN used money from his own construction company to support the MUHAJADEEN in Afghanistan against Soviet forces.
forever valid?
1990: hero
2011: terrorist
both types should be valid
- two roles for same entity
- adaptation of gold standards to changing times
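One possible way to keep "two roles for same entity" is to store every annotation with the year it was collected instead of overwriting the earlier label. The sketch below is only an illustration: the `Annotation` record and the `roles_up_to` helper are hypothetical names, and the two entries simply restate the slide's example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    entity: str
    role: str
    year: int  # year the annotation was collected


def roles_up_to(annotations, entity, year):
    """All roles recorded for an entity in annotations collected by `year`."""
    return sorted(
        {a.role for a in annotations if a.entity == entity and a.year <= year}
    )


# The two readings from the slide, stored side by side.
store = [
    Annotation("OSAMA BIN LADEN", "hero", 1990),
    Annotation("OSAMA BIN LADEN", "terrorist", 2011),
]
early = roles_up_to(store, "OSAMA BIN LADEN", 1995)
later = roles_up_to(store, "OSAMA BIN LADEN", 2014)
```

Because nothing is deleted, a gold standard built from this store can be re-derived for any point in time as new annotations arrive.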
crowdtruth.org
Jean-Marc Côté, 1899
• annotator disagreement is signal, not noise.
• it is indicative of the variation in human semantic interpretation of signs
• it can indicate ambiguity, vagueness, similarity, over-generality, as well as quality
crowdtruth.org
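To treat disagreement as signal, worker choices can be kept as vectors and compared, rather than reduced to one label. The sketch below loosely follows the vector-and-cosine idea; the slides themselves do not spell out a formula, so the scoring, the relation set and the worker data here are all assumptions for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sentence_vector(worker_vectors):
    """Sum the per-worker binary choice vectors for one sentence."""
    return [sum(column) for column in zip(*worker_vectors)]


RELATIONS = ["treats", "causes", "none"]  # hypothetical relation set

# Hypothetical choices of four workers for one sentence (1 = selected).
workers = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
]
vec = sentence_vector(workers)

# Score each relation by the cosine between the sentence vector and
# that relation's unit vector: closer to 1.0 means a clearer expression.
unit = {r: [int(r == q) for q in RELATIONS] for r in RELATIONS}
clarity = {r: cosine(vec, unit[r]) for r in RELATIONS}
```

A sentence where all workers pick the same relation scores 1.0 for it, while an ambiguous sentence spreads its score over several relations.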
The Team 2013
http://crowd-watson.nl
The Crew 2014
The (almost complete) Team 2014
lora-aroyo.org 
slideshare.com/laroyo 
@laroyo 
crowdtruth.org

Truth is a Lie: 7 Myths about Human Annotation @CogComputing Forum 2014

NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxBoston Institute of Analytics
 
Soc 156 – Sociology of CommunicationReview Sheet – FinalShor.docx
Soc 156 – Sociology of CommunicationReview Sheet – FinalShor.docxSoc 156 – Sociology of CommunicationReview Sheet – FinalShor.docx
Soc 156 – Sociology of CommunicationReview Sheet – FinalShor.docxwhitneyleman54422
 
anchoring-heuristic Decision Making
anchoring-heuristic Decision Makinganchoring-heuristic Decision Making
anchoring-heuristic Decision MakingÖzkan Özer
 
Teaching lean startup capital enterprise
Teaching lean startup   capital enterpriseTeaching lean startup   capital enterprise
Teaching lean startup capital enterpriseFounder-Centric
 
HARE: Explainable Hate Speech Detection with Step-by-Step Reasoning
HARE: Explainable Hate Speech Detection with Step-by-Step ReasoningHARE: Explainable Hate Speech Detection with Step-by-Step Reasoning
HARE: Explainable Hate Speech Detection with Step-by-Step Reasoningdyyjkd
 
psychology in media by mostafa ewees
psychology in media by mostafa eweespsychology in media by mostafa ewees
psychology in media by mostafa eweesMostafa Ewees
 
Gender and language (linguistics, social network theory, Twitter!)
Gender and language (linguistics, social network theory, Twitter!)Gender and language (linguistics, social network theory, Twitter!)
Gender and language (linguistics, social network theory, Twitter!)Tyler Schnoebelen
 
Gender, language, and Twitter: Social theory and computational methods
Gender, language, and Twitter: Social theory and computational methodsGender, language, and Twitter: Social theory and computational methods
Gender, language, and Twitter: Social theory and computational methodsIdibon1
 
Revising lftvd
Revising lftvdRevising lftvd
Revising lftvdTomEccles4
 
Dialogue based Meaning Negotiation
Dialogue based Meaning NegotiationDialogue based Meaning Negotiation
Dialogue based Meaning NegotiationTerry Payne
 
#CrowdTruth: Linked Data for Information Extraction @ISWC2015
#CrowdTruth: Linked Data for Information Extraction @ISWC2015#CrowdTruth: Linked Data for Information Extraction @ISWC2015
#CrowdTruth: Linked Data for Information Extraction @ISWC2015Lora Aroyo
 
Connective Media Technologies - A Look Into Reddit's Star Dish
Connective Media Technologies - A Look Into Reddit's Star DishConnective Media Technologies - A Look Into Reddit's Star Dish
Connective Media Technologies - A Look Into Reddit's Star DishFrances Coronel
 
II-SDV 2016 Srinivasan Parthiban - KOL Analytics from Biomedical Literature
II-SDV 2016 Srinivasan Parthiban - KOL Analytics from Biomedical LiteratureII-SDV 2016 Srinivasan Parthiban - KOL Analytics from Biomedical Literature
II-SDV 2016 Srinivasan Parthiban - KOL Analytics from Biomedical LiteratureDr. Haxel Consult
 
Yenikod Yazılım Kursu - Kodlama Öğrenebilir Miyim? Kodlama Bana Göre Mi?
Yenikod Yazılım Kursu - Kodlama Öğrenebilir Miyim? Kodlama Bana Göre Mi?Yenikod Yazılım Kursu - Kodlama Öğrenebilir Miyim? Kodlama Bana Göre Mi?
Yenikod Yazılım Kursu - Kodlama Öğrenebilir Miyim? Kodlama Bana Göre Mi?Mustafa Ekim
 
ELEC 161 QUIZ 2 All the questions carry same weight .docx
ELEC 161 QUIZ 2  All the questions carry same weight .docxELEC 161 QUIZ 2  All the questions carry same weight .docx
ELEC 161 QUIZ 2 All the questions carry same weight .docxjack60216
 
Rinse and Repeat : The Spiral of Applied Machine Learning
Rinse and Repeat : The Spiral of Applied Machine LearningRinse and Repeat : The Spiral of Applied Machine Learning
Rinse and Repeat : The Spiral of Applied Machine LearningAnna Chaney
 

Similar a Truth is a Lie: 7 Myths about Human Annotation @CogComputing Forum 2014 (20)

NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
 
Soc 156 – Sociology of CommunicationReview Sheet – FinalShor.docx
Soc 156 – Sociology of CommunicationReview Sheet – FinalShor.docxSoc 156 – Sociology of CommunicationReview Sheet – FinalShor.docx
Soc 156 – Sociology of CommunicationReview Sheet – FinalShor.docx
 
anchoring-heuristic Decision Making
anchoring-heuristic Decision Makinganchoring-heuristic Decision Making
anchoring-heuristic Decision Making
 
Teaching lean startup capital enterprise
Teaching lean startup   capital enterpriseTeaching lean startup   capital enterprise
Teaching lean startup capital enterprise
 
HARE: Explainable Hate Speech Detection with Step-by-Step Reasoning
Truth is a Lie: 7 Myths about Human Annotation @CogComputing Forum 2014

  • 1. Truth is a Lie. CrowdTruth: The 7 Myths of Human Annotation. Lora Aroyo
  • 6. Take Home Message: Human annotation of semantic interpretation tasks as critical part of cognitive systems engineering – standard practice based on antiquated ideal of a single correct truth – 7 myths of human annotation – new theory of truth: CrowdTruth
  • 7. I amar prestar aen... • amount of data & scale of computation available have increased by a previously inconceivable amount • CS & AI moved out of thought problems to empirical science • current methods pre-date this fundamental shift • the ideal of “one truth” is a lie • crowdsourcing & semantics together correct the fallacy and improve analytic systems. The world has changed: there is a need to form a new theory of truth, appropriate to cognitive systems
  • 8. Semantic Interpretation: Semantic interpretation is needed in all sciences – Data abstracted into categories – Patterns, correlations, associations & implications are extracted. Cognitive Computing: providing some way of scalable semantic interpretation
  • 9. Traditional Human Annotation • Humans analyze examples: annotations for ground truth = the correct output for each example • Machines learn from the examples • Ground Truth Quality: – measured by inter-annotator agreement – founded on ideal for single, universally constant truth – high agreement = high quality – disagreement must be eliminated. Current gold standard acquisition & quality evaluation are outdated
  • 10. Need for Change • Cognitive Computing increases the need for machines to handle the scale of data • Results in increasing need for new gold standards able to measure machine performance on tasks that require semantic interpretation. The New Ground Truth is CrowdTruth
  • 11. 7 Myths • One truth: data collection efforts assume one correct interpretation for every example • All examples are created equal: ground truth treats all examples the same – either match the correct result or not • Detailed guidelines help: if examples cause disagreement, add instructions to limit interpretations • Disagreement is bad: increase quality of annotation data by reducing disagreement among the annotators • One is enough: most of the annotated examples are evaluated by one person • Experts are better: annotators with domain knowledge provide better annotations • Once done, forever valid: annotations are not updated; new data not aligned with previous. Myths directly influence the practice of collecting human annotated data; need to be revisited in the context of new changing world & in the face of a new theory of truth (CrowdTruth)
  • 12. 1. One Truth. What if there are MORE? Current ground truth collection efforts assume one correct interpretation for every example. The ideal of truth is a fallacy for semantic interpretation and needs to be changed
  • 13. Which is the mood most appropriate for each song? Choose one: Cluster 1 (passionate, rousing, confident, boisterous, rowdy); Cluster 2 (rollicking, cheerful, fun, sweet, amiable, good-natured); Cluster 3 (literate, poignant, wistful, bittersweet, autumnal, brooding); Cluster 4 (humorous, silly, campy, quirky, whimsical, witty, wry); Cluster 5 (aggressive, fiery, tense, anxious, intense, volatile, visceral); Other (does not fit into any of the 5 clusters). One truth? Results in: (Lee and Hu 2012)
  • 14. 2. All Examples Are Created Equal. What if they are DIFFERENT? • typically annotators are asked whether a binary property holds for each example • often not given a chance to say that the property may partially hold, or holds but is not clearly expressed • mathematics of using ground truth treats every example the same – either match correct result or not • poor quality examples tend to generate high disagreement. Disagreement allows us to weight sentences = the ability to train & evaluate a machine more flexibly
  • 15. Is TREAT relation expressed between the highlighted terms? “ANTIBIOTICS are the first line treatment for indications of TYPHUS.” (clearly treats) “With ANTIBIOTICS in short supply, DDT was used during World War II to control the insect vectors of TYPHUS.” (less clear treats) Equal training data? Disagreement can indicate vagueness & ambiguity of sentences
  • 16. 3. Detailed Guidelines Help. What if they HURT? • Perfuming agreement scores by forcing annotators to make choices they may think are not valid • Low annotator agreement is addressed by detailed guidelines for annotators to consistently handle the cases that generate disagreement • Remove potential signal on examples that are ambiguous. Precise annotation guidelines do eliminate disagreement but do not increase quality
  • 17. Which mood cluster is most appropriate for a song? Instructions: Your task is to listen to the following 30 second music clips and select the most appropriate mood cluster that represents the mood of the music. Try to think about the mood carried by the music and please try to ignore any lyrics. If you feel the music does not fit into any of the 5 clusters please select “Other”. The descriptions of the clusters are provided in the panel at the top of the page for your reference. Answer the questions carefully. Your work will not be accepted if your answers are inconsistent and/or incomplete. Restricting guidelines help? Disagreement can indicate problems with the task (Lee and Hu 2012)
  • 18. 4. Disagreement is Bad. What if it is GOOD? • rather than accepting disagreement as a natural property of semantic interpretation • traditionally, disagreement is considered a measure of poor quality because: – task is poorly defined or – annotators lack training. This makes the elimination of disagreement the GOAL
  • 19. Does each sentence express the TREAT relation? “ANTIBIOTICS are the first line treatment for indications of TYPHUS.” → agreement 95%. “Patients with TYPHUS who were given ANTIBIOTICS exhibited side-effects.” → agreement 80%. “With ANTIBIOTICS in short supply, DDT was used during WWII to control the insect vectors of TYPHUS.” → agreement 50%. Disagreement bad? Disagreement can reflect the degree of clarity in a sentence
  • 20. 5. One is Enough. What if it is NOT ENOUGH? • over 90% of annotated examples – seen by 1-2 annotators • small number overlap – to measure agreement. Five or six popular interpretations can’t be captured by one or two people
  • 21. One Quality? Accumulated results for each relation across all the sentences. 20 workers/sentence (and higher) yields same relative disagreement
  • 22. 6. Experts Are Better. What if the CROWD IS BETTER? • conventional wisdom: human annotators with domain knowledge provide better annotated data, e.g. – medical texts should be annotated by medical experts • but experts are expensive & don’t scale. Multiple perspectives on data can be useful, beyond what experts believe is salient or correct
  • 23. What is the (medical) relation between the highlighted (medical) terms? • 91% of expert annotations covered by the crowd • expert annotators reach agreement only in 30% • most popular crowd vote covers 95% of this expert annotation agreement. Experts better than crowd?
  • 24. 7. Once Done, Forever Valid. What if VALIDITY CHANGES? • perspectives change over time – old training data might contain examples that are not valid or only partially valid later • continuous collection of training data over time allows the adaptation of gold standards to changing times – popularity of music – levels of education
  • 25. Which are mentions of terrorists in this sentence? “OSAMA BIN LADEN used money from his own construction company to support the MUHAJADEEN in Afghanistan against Soviet forces.” Forever valid? 1990: hero; 2011: terrorist. Both types should be valid - two roles for same entity - adaptation of gold standards to changing times
  • 27. • annotator disagreement is signal, not noise. • it is indicative of the variation in human semantic interpretation of signs • it can indicate ambiguity, vagueness, similarity, over-generality, as well as quality crowdtruth.org
  • 30. The Team 2013 http://crowd-watson.nl
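
The weighting idea from slides 14, 15 and 19 can be sketched in a few lines of Python. This is a minimal illustration, not the CrowdTruth metrics themselves: it assumes 20 workers per sentence (the figure mentioned on slide 21), invents yes/no vote counts chosen only to reproduce the 95%, 80% and 50% agreement levels on slide 19, and uses the share of the majority answer as a stand-in agreement score. The names `majority_share`, `votes_per_sentence` and the sentence keys are hypothetical.

```python
from collections import Counter


def majority_share(votes):
    """Return (majority_label, fraction of workers who chose it)."""
    counts = Counter(votes)
    label, n = counts.most_common(1)[0]
    return label, n / len(votes)


# Hypothetical crowd votes on "does this sentence express TREAT?".
# Counts are made up to match the agreement levels quoted on slide 19.
votes_per_sentence = {
    "antibiotics_first_line": ["yes"] * 19 + ["no"] * 1,
    "antibiotics_side_effects": ["yes"] * 16 + ["no"] * 4,
    "ddt_insect_vectors": ["yes"] * 10 + ["no"] * 10,
}

# Keep the disagreement as a per-sentence weight instead of discarding it.
weights = {
    sentence: majority_share(votes)[1]
    for sentence, votes in votes_per_sentence.items()
}

for sentence, weight in weights.items():
    print(f"{sentence}: weight {weight:.2f}")
```

A learner or an evaluation script could then use each share as a per-example weight, so that the clear sentence counts for more than the ambiguous one, rather than treating all three as equally reliable training data.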