Amazon Mechanical Turk Hands-on Session
Maribel Acosta

MTurk Basic Concepts (1)
[Diagram: Requester and Worker]
Source: https://requester.mturk.com/tour/how_it_works
• Requester: creates and submits tasks to the platform.
• Worker: person who solves the tasks.
• Human Intelligence Task (HIT): work unit.

MTurk Basic Concepts (2)
[Diagram: a Project containing HIT 1, HIT 2, HIT 3, HIT 4]
• Project: HIT HTML + HIT metadata
• The elements that stay the same in every HIT are called the template
• The data that vary from HIT to HIT are specified via variables
• NOTE: If no variables are specified in the project, a single HIT is created
• Variables: allow creating several HITs in the project

MTurk Basic Concepts (3)
[Diagram: a Project with a Batch containing HIT 1, HIT 2, HIT 3, HIT 4]
• Batch: group of HITs created by instantiating the variable(s) of a project
• The values of the variables are specified in (CSV, TSV) files:
  – Each column corresponds to a variable
  – Each row is an instance -> HIT
  – Each file corresponds to a batch
• We can create several batches for the same project
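
To make the batch idea concrete, here is a minimal Python sketch of how an input file maps to HITs. The column names `s_name` and `link` and the two sample rows are hypothetical and only serve as illustration; MTurk itself performs this step when a batch file is uploaded.

```python
import csv
import io

# Hypothetical batch file: one column per variable, one row per HIT.
BATCH_CSV = """s_name,link
Berlin,http://www.berlin.de
Paris,http://www.paris.fr
"""

def read_batch(csv_text):
    """Return one dict of variable bindings per row, i.e. per HIT."""
    return list(csv.DictReader(io.StringIO(csv_text)))

hits = read_batch(BATCH_CSV)
print(len(hits))          # number of HITs in this batch
print(hits[0]["s_name"])  # value bound to the variable s_name in the first HIT
```

Reading a second file the same way would yield a second batch for the same project.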

MTurk Basic Concepts (4)
[Diagram: Project > Batch (HIT 1 to HIT 4) > HIT 1 with Questions Q1, Q2 > Assignments: HIT 1 Assig. 1, HIT 1 Assig. 2, HIT 1 Assig. 3]
• HIT: work unit. The same HIT can be solved by one or more workers (assignments)
• Assignment: how many workers should solve the exact same HIT
• Questions: a single HIT may contain one or several questions

MTurk Basic Concepts (5)
[Diagram: Project > Batch (HIT 1 to HIT 4) > HIT 1 with Questions Q1, Q2 > Assignments: HIT 1 Assig. 1, HIT 1 Assig. 2, HIT 1 Assig. 3]
Total cost of the project = No. of HITs x No. of Assignments x (Reward per HIT + Fee)
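
The cost formula can be checked with a few lines of Python. This is only a sketch: the reward and fee values below are hypothetical, and the actual fee depends on MTurk's pricing at the time of publication.

```python
from decimal import Decimal

def total_cost(num_hits, num_assignments, reward, fee):
    """Total cost = HITs x assignments x (reward + fee), as in the formula above."""
    return num_hits * num_assignments * (Decimal(reward) + Decimal(fee))

# Hypothetical numbers: 300 HITs, 3 assignments each,
# a reward of 0.05 and a fee of 0.01 per assignment.
cost = total_cost(300, 3, "0.05", "0.01")
print(cost)
```

`Decimal` is used instead of floats so that small per-assignment amounts add up without rounding noise.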

MTurk Basic Concepts (6)
Example of Human Intelligence Tasks (HITs)
• Projects can be broken into smaller tasks called HITs
• A HIT represents a single work unit
Project: Tagging (describing) 900 images
HIT: Create tags for image X1
HIT: Create tags for image X2
HIT: …
No. of HITs = 900

MTurk Basic Concepts (7)
Example of Human Intelligence Tasks (HITs)
• Projects can be broken into smaller tasks called HITs
• A HIT represents a single work unit
Project: Tagging (describing) 900 images
HIT: Create tags for images X1, X2, X3
HIT: Create tags for images X4, X5, X6
HIT: …
Several questions in a single HIT!
No. of HITs = 300
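
The effect of putting several questions into one HIT can be sketched as a simple grouping step. The helper name `group_into_hits` is hypothetical; the image names X1 to X900 follow the example above.

```python
def group_into_hits(items, questions_per_hit):
    """Split items into consecutive groups; each group becomes one HIT."""
    return [items[i:i + questions_per_hit]
            for i in range(0, len(items), questions_per_hit)]

images = [f"X{n}" for n in range(1, 901)]  # X1 ... X900

one_per_hit = group_into_hits(images, 1)
three_per_hit = group_into_hits(images, 3)

print(len(one_per_hit))    # 900
print(len(three_per_hit))  # 300
print(three_per_hit[1])    # ['X4', 'X5', 'X6']
```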

MTurk Basic Concepts (8)
When creating a project or individual HITs, the HIT properties must be specified:
• General information: includes the title and description of the HIT, as well as keywords, which workers use to search for HITs
• HIT duration time: time allotted to solve the HIT (before it is given to another worker)
• HIT lifetime: how long the HIT will be available on the platform
• # Assignments: number of different persons who will perform the exact same HIT
• Reward: payment for correctly solving each assignment
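
As an illustration of the properties listed above, the following Python sketch collects them in one record. The field names and sample values are hypothetical. The keys returned by `as_request` are assumed to match the parameter names of the `create_hit` operation in the boto3 MTurk client; that API is not part of the slides and should be verified against its current reference before use. No request is sent here.

```python
from dataclasses import dataclass

@dataclass
class HitProperties:
    """HIT properties from the list above (field names are illustrative)."""
    title: str
    description: str
    keywords: str            # comma-separated search keywords
    duration_seconds: int    # time allotted to a worker for one assignment
    lifetime_seconds: int    # how long the HIT stays available
    max_assignments: int     # number of different workers per HIT
    reward_usd: str          # payment per approved assignment

    def as_request(self):
        """Map the properties to request keys (key names are an assumption)."""
        return {
            "Title": self.title,
            "Description": self.description,
            "Keywords": self.keywords,
            "AssignmentDurationInSeconds": self.duration_seconds,
            "LifetimeInSeconds": self.lifetime_seconds,
            "MaxAssignments": self.max_assignments,
            "Reward": self.reward_usd,
        }

# Hypothetical values for the DBpedia link verification project.
props = HitProperties(
    title="Verify external links of DBpedia resources",
    description="Decide whether a web page is relevant to a given entity.",
    keywords="dbpedia, links, verification",
    duration_seconds=600,
    lifetime_seconds=7 * 24 * 3600,
    max_assignments=3,
    reward_usd="0.05",
)
print(props.as_request()["MaxAssignments"])
```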

MTurk Workflow for Requesters
[Workflow diagram: Project Creation & Design -> HIT Test (Sandbox) -> HIT Publication -> Workers solve the HITs -> Review of the results -> Completed project. Diagram labels: "Production site"; review outcomes "reject" and "all assignments accepted".]

MTurk Sandbox
The Sandbox is a simulated MTurk environment to test HITs.
• Log in as requester: preview and test the interface of your HITs
  – https://requestersandbox.mturk.com
• Log in as worker: solve your own HITs to test their functionality and result output
  – https://workersandbox.mturk.com
• Best practice: Always test your HITs (as requester and worker) before publishing them on the production site

Managing HITs in MTurk
There are three different mechanisms to manage your HITs in MTurk:
• API
• Command Line Tools
• Web Interface

MTURK WEB INTERFACE

Hands On!
• Project: Crowdsourcing DBpedia triples to verify the links to external web pages

prefix dbpedia-ont: <http://dbpedia.org/ontology/>
prefix foaf: <http://xmlns.com/foaf/0.1/>
SELECT * WHERE {
  ?s dbpedia-ont:wikiPageExternalLink ?o ;
     foaf:name ?s_name ;
     foaf:isPrimaryTopicOf ?s_wikipage .
} LIMIT 200

– Triple to crowdsource: ?s dbpedia-ont:wikiPageExternalLink ?o
– Triples to build the UI: foaf:name, foaf:isPrimaryTopicOf
– Endpoint: http://dbpedia.org/sparql
– Results file: MTurkDemo/data/sparql.csv
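
The query above could be sent to the endpoint from a script. The following Python sketch only builds the request URL and does not contact the server. The `format=text/csv` parameter is an assumption about what the DBpedia endpoint accepts, and `build_request_url` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlencode, urlparse

ENDPOINT = "http://dbpedia.org/sparql"  # endpoint named on the slide

QUERY = """
PREFIX dbpedia-ont: <http://dbpedia.org/ontology/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT * WHERE {
  ?s dbpedia-ont:wikiPageExternalLink ?o ;
     foaf:name ?s_name ;
     foaf:isPrimaryTopicOf ?s_wikipage .
} LIMIT 200
"""

def build_request_url(endpoint, query, result_format="text/csv"):
    """Return a GET URL for the query; the 'format' parameter is an assumption."""
    return endpoint + "?" + urlencode({"query": query, "format": result_format})

url = build_request_url(ENDPOINT, QUERY)

# Round-trip check: decoding the URL gives back the original parameters.
params = parse_qs(urlparse(url).query)
print(params["format"][0])
```

Fetching the URL, for instance with `urllib.request.urlopen`, and saving the response would produce an input file along the lines of the sparql.csv mentioned above.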

Hands On!
• Go to the MTurk Sandbox as a requester:
  – https://requestersandbox.mturk.com/
• Click on Sign In
  – Email address: own_tp@gmx.li
  – Password: sourcrowd
• Now we are at “home”

1. Creating a Project
Different predefined templates: select “other”

2. Setting up the HIT Properties (1)
[Screenshot: HIT description]

2. Setting up the HIT Properties (2)
[Screenshot: HIT properties]
Very IMPORTANT: Set up quality mechanisms. Masters are selected by default.

3. Selecting Qualifications
[Screenshot: HIT properties, Worker requirements (filters)]
Very IMPORTANT: Set up quality mechanisms. Masters are selected by default.
• Masters expect higher rewards
• MTurk charges 20% for Masters

4. Defining the Task
[Screenshot: WYSIWYG HTML editor]

4. Defining the Task (with Variables)
• Template: elements that stay the same in every HIT
• Variables: data that will vary from HIT to HIT. They are denoted as follows: ${var_name}
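
The `${var_name}` notation can be tried out locally, since Python's `string.Template` uses the same placeholder syntax. This is only a sketch that mimics the substitution; on MTurk the platform fills in the variables. The HTML fragment and the variable names `s_name` and `link` are hypothetical.

```python
from string import Template

# Hypothetical HIT template: the HTML stays the same in every HIT,
# while ${link} and ${s_name} vary from HIT to HIT.
HIT_TEMPLATE = Template(
    '<p>Is <a href="${link}">${link}</a> a relevant page about ${s_name}?</p>'
)

row = {"s_name": "Berlin", "link": "http://www.berlin.de"}
html = HIT_TEMPLATE.substitute(row)
print(html)
```

`substitute` raises a `KeyError` if a variable has no value in the row, which makes it a convenient check that the input file has a column for every variable in the template.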

5. Previewing the Template
This is what the workers will see. The variables will be replaced by the input data.

6. Creating Batches

7. Previewing the HITs
Variables are replaced with the data from the input file

8. Publishing the HITs
Summary of the project:
• # of HITs
• Rewards
• Total payment
• Account balance

9. Retrieving the Results

MTURK SUMMARY

Project/HIT Creation & Design (1)
• The requester is able to create projects or individual HITs
• Build user-friendly interfaces (using web technologies)
• Then, the HIT properties must be specified:
  – General information: includes the title and description of the HIT, as well as keywords, which workers use to search for HITs.
  – HIT duration time: time allotted to solve the HIT (before it is given to another worker).
  – HIT lifetime: how long the HIT will be available on the platform.
  – # Assignments: number of different persons who will perform the same HIT.
  – Reward: payment for correctly solving each assignment.

Project/HIT Creation & Design (2)
• Selection of MTurk quality control mechanisms (worker requirements):
  – High quality workers
    · Masters
    · Photo moderation masters
    · Categorization masters
  – System qualifications
    · Location by country
    · HIT submission rate (%)
    · HIT approval/rejection rate (%)
    · (Absolute) number of HITs approved
  – Qualification types
    · Simply granted or attributed via customized tests
• These filters are automatically performed by the platform
• Masters expect higher rewards
• MTurk charges 20% for Masters

HIT Test
• Best practice: Always test your HITs before publishing them
1. Perform technical tests (both as requester and worker) in the MTurk Sandbox environment.
   Source: https://requester.mturk.com/developer/sandbox
2. Publish a small subset of tasks on the production site to test usability and responsiveness.

Run live HITs
• HIT publication:
  – Make the HITs available to the workers
• Review the results:
  – Monitor the submitted assignments constantly
  – Download the results
  – Accept/reject assignments, provide feedback when rejecting
  – Block spammers (optional)
• Update HIT/Project:
  – Extend/expire HITs or modify other HIT properties
  – Add additional assignments

Lessons Learned
• Introduce yourself on Worker forums (regular requester)
• Be responsive to workers
  – Reply to emails with questions about tasks
• Use monitoring tools:
  – Forums
  – Turkopticon (Source: http://turkopticon.differenceengines.com/)

Choosing the Right Tool
Source: https://requestersandbox.mturk.com/tour/choose_the_right_tool

IS THERE MORE THAN MTURK?

CrowdFlower Platform
[Diagram: client, CrowdFlower, contributors]
• Client: creates and submits jobs (MTurk = requester)
• Contributor: person who solves the jobs (MTurk = worker)
• Job: work unit (MTurk = task)

Why CrowdFlower? (1)
Neat UI

Why CrowdFlower? (2)
Quality control mechanisms
Allows for easily creating a “gold standard”, which is further used to detect low quality workers
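
The idea behind a gold standard can be sketched in a few lines of Python: workers are scored on questions whose correct answer is known, and those below a threshold are flagged. This is an illustration under stated assumptions, not CrowdFlower's actual mechanism; the function names, the data layout and the 0.7 threshold are all hypothetical.

```python
def worker_accuracy(answers, gold):
    """Score each worker on the gold questions they answered.

    answers: {worker_id: {question_id: answer}}
    gold:    {question_id: correct_answer}
    """
    scores = {}
    for worker, given in answers.items():
        judged = [q for q in given if q in gold]
        if judged:  # workers without any gold question get no score
            correct = sum(given[q] == gold[q] for q in judged)
            scores[worker] = correct / len(judged)
    return scores

def low_quality_workers(answers, gold, threshold=0.7):
    """Return workers whose gold accuracy is below the (hypothetical) threshold."""
    scores = worker_accuracy(answers, gold)
    return sorted(w for w, acc in scores.items() if acc < threshold)

# Hypothetical data: q1 and q2 are gold questions, q3 is a regular question.
gold = {"q1": "yes", "q2": "no"}
answers = {
    "w1": {"q1": "yes", "q2": "no", "q3": "yes"},
    "w2": {"q1": "no", "q2": "no", "q3": "no"},
}
print(worker_accuracy(answers, gold))
print(low_quality_workers(answers, gold))
```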

Why CrowdFlower? (3)
Report generation and analytics

Why NOT CrowdFlower?
• At the beginning, clients must wait until their projects are approved by the CrowdFlower staff before publishing them
  – Wait time: from a couple of hours up to (5)* days
• Jobs must be specified in a non-standard language:
  – CML: CrowdFlower Markup Language
• There are certain configurations that cannot be executed on the platform
*Personal experience of the presenter


Verify DBpedia Links Using MTurk

  • 1. Amazon Mechanical Turk Hands-­‐on session Maribel Acosta
  • 2. Amazon Mechanical Turk hands-­‐on session 2
  • 3. MTurk Basic Concepts (1) Requester Worker Source: h?ps://requester.mturk.com/tour/how_it_works • Requester: creates and submits tasks to the pla9orm. • Worker: person who solves the tasks. • Human Intelligence Task (HIT): work unit. Amazon Mechanical Turk hands-­‐on session 3
  • 4. MTurk Basic Concepts HIT (2) Amazon Mechanical Turk hands-­‐on session Project • Project: HIT HTML + HIT metadata • The elements that stay the same in every HIT are denominated template • The data that will vary from HIT to HIT are specified via variables • NOTE: If no variables are specified in the project, we will create a single HIT • Variables: allow creaQng several HITs in the project 4 HIT 4 3 HIT 2 HIT 1
  • 5. MTurk Basic Concepts (3) HIT 4 HIT 3 HIT 2 HIT 1 Batch Amazon Mechanical Turk hands-­‐on session Project • Batch: Group of HITs created by instanQaQng the variable(s) of a project • The values of the variables are specified in (CSV, TSV) files: • Each column corresponds to a variable • Each row is an instance -­‐> HIT • Each file corresponds to a batch • We can create several batches for the same project 5
  • 6. MTurk Basic Concepts (4) HIT 4 HIT 3 HIT 2 HIT 1 Batch HIT 1 HIT HIT 1 Assig.1 HIT 1 Assig.2 Assignments Amazon Mechanical Turk hands-­‐on session Project Q1 Q2 QuesCons • HIT: HIT 1 Assig.3 Work unit. The same HIT can be solved by 1 or more workers (assignments) • Assignment: How many workers should solve one exact same HIT • QuesCons: A single HIT may contain one or several quesQons 6
  • 7. MTurk Basic Concepts (5) HIT 4 HIT 3 HIT 2 HIT 1 Batch HIT 1 HIT Amazon Mechanical Turk hands-­‐on session Project Q1 Q2 QuesCons HIT 1 Assig.1 HIT 1 Assig.2 HI T1 Assig.3 Assignments Total cost of the project = No. of HITs x No. of Assignments x (Reward per HIT + Fee) 7
  • 8. MTurk Basic Concepts (6) Example of Human Intelligence Tasks (HITs) • Projects can be broken into smaller tasks called HITs • A HIT represents a single work unit Tagging (describing) 900 images Amazon Mechanical Turk hands-­‐on session Project: Create tags for image X1 HIT: Create tags for image X2 HIT: … No. of HITS = 900 8
  • 9. MTurk Basic Concepts (7) Example of Human Intelligence Tasks (HITs) • Projects can be broken into smaller tasks called HITs • A HIT represents a single work unit Tagging (describing) 900 images Amazon Mechanical Turk hands-­‐on session Project: Create tags for images X1, X2, X3 HIT: Create tags for image X4, X5, X6 HIT: … Several quesQons in a single HIT! No. of HITS = 300 9
  • 10. MTurk Basic Concepts (8) When creaQng a project or individual HITs, the HIT properCes must be specified: • General informaQon: includes the Qtle and descripQon of the HIT, as well as keywords which are used by worker for searching HITs • HIT duraQon Qme: Qme allo?ed to solve the HIT (before it is given to another worker) • HIT life Qme: how long will the HIT be available on the pla9orm • # Assignments: number of different persons that will perform the exact same HIT • Reward: payment for correctly solving each assignment Amazon Mechanical Turk hands-­‐on session 10
  • 11. MTurk Workflow for Requesters Amazon Mechanical Turk hands-­‐on session Project CreaCon & Design HIT Test (Sandbox) HIT PublicaQon Workers solve the HITs Review of the results Completed project ProducCon site 11 reject all assignments accepted
  • 12. MTurk Sandbox The Sandbox is a simulated MTurk environment to test HITs. • Log in as requester: preview and test the interface of your HITs – h?ps://requestersandbox.mturk.com • Log in as worker: solve your own HITs to test their funcQonaliQes and result output – h?ps://workersandbox.mturk.com • Best pracCce: Always test your HITs (as requester and worker) before publishing them in the producQon site Amazon Mechanical Turk hands-­‐on session 12
  • 13. Managing HITs in MTurk There are three different mechanisms to manage your HITs in MTurk: • API • Command Line Tools • Web Interface
  • 14. MTURK WEB INTERFACE
  • 15. Hands On! • Project: Crowdsourcing DBpedia triples to verify the links to external web pages • SPARQL endpoint: http://dbpedia.org/sparql • Data file: MTurkDemo/data/sparql.csv • Query:
      PREFIX dbpedia-ont: <http://dbpedia.org/ontology/>
      PREFIX foaf: <http://xmlns.com/foaf/0.1/>
      SELECT * WHERE {
        ?s dbpedia-ont:wikiPageExternalLink ?o ;   # triple to crowdsource
           foaf:name ?s_name ;                     # triples to build the UI
           foaf:isPrimaryTopicOf ?s_wikipage .
      } LIMIT 200
  • 16. Hands On! • Go to the MTurk Sandbox as a requester: – https://requestersandbox.mturk.com/ • Click on Sign In – Email address: own_tp@gmx.li – Password: sourcrowd • Now we are at “home”
  • 17. 1. Creating a Project • Different predefined templates: select “other”
  • 18. 2. Setting up the HIT Properties (1) • HIT description
  • 19. 2. Setting up the HIT Properties (2) • HIT properties • Very IMPORTANT: Set up quality mechanisms • Masters are selected by default
  • 20. 3. Selecting Qualifications • HIT properties • Worker requirements (filters) • Very IMPORTANT: Set up quality mechanisms • Masters are selected by default • Masters expect higher rewards • MTurk charges 20% for Masters
  • 21. 4. Defining the Task • WYSIWYG HTML editor
  • 22. 4. Defining the Task (with Variables) • Template: elements that stay the same in every HIT • Variables: data that will vary from HIT to HIT, denoted as follows: ${var_name}
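The `${var_name}` notation happens to match the placeholder syntax of Python's standard `string.Template`, so the substitution step can be imitated locally. This is only a sketch of the idea: MTurk performs its own substitution, the HTML snippet is invented, and the row values are hypothetical. The variable names `o`, `s_name` and `s_wikipage` are taken from the SPARQL query of the hands-on project.

```python
from string import Template

# Template: the part that stays the same in every HIT.
hit_template = Template(
    '<p>Does <a href="${o}">${o}</a> belong to '
    '<a href="${s_wikipage}">${s_name}</a>?</p>'
)

# Variables: one set of values per HIT (hypothetical example row).
row = {
    "o": "http://www.berlin.de",
    "s_name": "Berlin",
    "s_wikipage": "http://en.wikipedia.org/wiki/Berlin",
}

# substitute() raises KeyError if a variable has no value in the row.
rendered = hit_template.substitute(row)
print(rendered)
```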
  • 23. 5. Previewing the Template • This is what the workers will see • The variables will be replaced by the input data
  • 24. 6. Creating Batches
  • 25. 7. Previewing the HITs • Variables are replaced with the data from the input file
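How an input file turns into a batch (each column is a variable, each row becomes one HIT) can be previewed offline with a few lines of Python. This is a minimal sketch, assuming a comma-separated file whose header matches the template variables. The function `render_batch`, the question text and the two data rows are hypothetical.

```python
import csv
import io
from string import Template


def render_batch(template_text, csv_text):
    """Return one rendered HIT per CSV row; the header row names the variables."""
    template = Template(template_text)
    reader = csv.DictReader(io.StringIO(csv_text))
    # substitute() raises KeyError when a template variable has no column.
    return [template.substitute(row) for row in reader]


csv_text = "\n".join([
    "s,o,s_name,s_wikipage",
    "http://dbpedia.org/resource/Berlin,http://www.berlin.de,Berlin,"
    "http://en.wikipedia.org/wiki/Berlin",
    "http://dbpedia.org/resource/Paris,http://www.paris.fr,Paris,"
    "http://en.wikipedia.org/wiki/Paris",
])

batch = render_batch("Is ${o} a valid external link for ${s_name}?", csv_text)
for hit in batch:
    print(hit)
```

Columns that the template does not use (here `s` and `s_wikipage`) are simply ignored, while a template variable without a matching column stops the rendering with an error.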
  • 26. 8. Publishing the HITs • Summary of the project: – # of HITs – Rewards – Total payment – Account balance
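The figures in the publishing summary can be estimated in advance. The sketch below assumes that the platform fee is a fixed percentage added on top of the rewards. The 20% rate is the Masters figure quoted on the slides, and the actual fee schedule should be checked on the MTurk site. All other numbers are hypothetical.

```python
from decimal import Decimal


def estimate_total_payment(num_hits, assignments_per_hit, reward, fee_rate):
    """Return (rewards, fees, total) for a batch, as Decimal amounts."""
    reward = Decimal(str(reward))
    fee_rate = Decimal(str(fee_rate))
    rewards = num_hits * assignments_per_hit * reward
    fees = rewards * fee_rate  # assumption: fee is a share of the rewards
    return rewards, fees, rewards + fees


# Hypothetical batch: 300 HITs, 3 assignments each, $0.05 per assignment.
rewards, fees, total = estimate_total_payment(300, 3, "0.05", "0.20")
print(f"rewards={rewards:.2f} fees={fees:.2f} total={total:.2f}")
```

Comparing the estimated total with the account balance before publishing avoids a batch that cannot be paid for.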
  • 27. 9. Retrieving the Results
  • 28. MTURK SUMMARY
  • 29. Project/HIT Creation & Design (1) • The requester is able to create projects or individual HITs • Build user-friendly interfaces (using web technologies) • Then, the HIT properties must be specified: – General information: the title and description of the HIT, as well as keywords that workers use to search for HITs. – HIT duration: time allotted to solve the HIT (before it is given to another worker). – HIT lifetime: how long the HIT will be available on the platform. – # Assignments: number of different people who will perform the same HIT. – Reward: payment for correctly solving each assignment.
  • 30. Project/HIT Creation & Design (2) • Selection of MTurk quality control mechanisms (worker requirements): • High quality workers: – Masters – Photo moderation masters – Categorization masters • System qualifications: – Location by country – HIT submission rate (%) – HIT approval/rejection rate (%) – (Absolute) number of HITs approved • Qualification types: – Simply granted or attributed via customized tests • These filters are automatically performed by the platform • Masters expect higher rewards • MTurk charges 20% for Masters
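The platform applies these filters itself, so the following is only an illustration of what a set of worker requirements expresses. The `Worker` record, its field names and the thresholds are hypothetical and do not mirror any MTurk data structure.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    worker_id: str
    country: str
    hits_submitted: int
    hits_approved: int

    @property
    def approval_rate(self):
        """Percentage of submitted HITs that were approved."""
        if self.hits_submitted == 0:
            return 0.0
        return 100.0 * self.hits_approved / self.hits_submitted


def meets_requirements(worker, allowed_countries,
                       min_approval_rate, min_hits_approved):
    """Check location, approval rate (%) and absolute number of approved HITs."""
    return (
        worker.country in allowed_countries
        and worker.approval_rate >= min_approval_rate
        and worker.hits_approved >= min_hits_approved
    )


workers = [
    Worker("w1", "US", hits_submitted=1000, hits_approved=980),
    Worker("w2", "US", hits_submitted=50, hits_approved=49),
    Worker("w3", "DE", hits_submitted=2000, hits_approved=1500),
]
eligible = [
    w.worker_id
    for w in workers
    if meets_requirements(w, {"US", "DE"}, 95.0, 100)
]
print(eligible)
```

In this made-up sample only the first worker passes: the second has too few approved HITs and the third has an approval rate below the threshold.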
  • 31. HIT Test • Best practice: Always test your HITs before publishing them 1. Perform technical tests (both as requester and worker) in the MTurk Sandbox environment. Source: https://requester.mturk.com/developer/sandbox 2. Publish a small subset of tasks on the production site to test usability and responsiveness.
  • 32. Run live HITs • HIT publication: – Make the HITs available to the workers • Review the results: – Monitor the submitted assignments constantly – Download the results – Accept/reject assignments, provide feedback when rejecting – Block spammers (optional) • Update HIT/Project: – Extend/expire HITs or modify other HIT properties – Add additional assignments
  • 33. Lessons Learned • Introduce yourself on Worker forums (regular requester) • Be responsive to workers – Reply to emails with questions about tasks • Use monitoring tools: – Forums – Turkopticon (Source: http://turkopticon.differenceengines.com/)
  • 34. Choosing the Right Tool Source: https://requestersandbox.mturk.com/tour/choose_the_right_tool
  • 35. IS THERE MORE THAN MTURK?
  • 37. CrowdFlower Platform • Client: creates and submits jobs (MTurk = requester) • Contributor: person who solves the jobs (MTurk = worker) • Job: unit of work (MTurk = task) • Diagram labels: client, CrowdFlower, contributors
  • 38. Why CrowdFlower? (1) • Neat UI
  • 39. Why CrowdFlower? (2) • Quality control mechanisms: makes it easy to create a “gold standard”, which is then used to detect low-quality workers
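The idea of a gold standard can be sketched in a few lines: some questions have known answers, and each worker is scored on those questions only. This is a generic illustration and not CrowdFlower's actual algorithm. The function names, the accuracy threshold and the sample data are assumptions.

```python
def worker_accuracy(answers, gold):
    """Score each worker on gold questions only.

    answers: iterable of (worker_id, question_id, answer)
    gold:    dict mapping question_id to the known correct answer
    """
    correct, total = {}, {}
    for worker_id, question_id, answer in answers:
        if question_id not in gold:
            continue  # regular question: no known answer to compare with
        total[worker_id] = total.get(worker_id, 0) + 1
        if answer == gold[question_id]:
            correct[worker_id] = correct.get(worker_id, 0) + 1
    return {w: correct.get(w, 0) / n for w, n in total.items()}


def low_quality_workers(answers, gold, min_accuracy):
    """Workers whose accuracy on gold questions is below the threshold."""
    scores = worker_accuracy(answers, gold)
    return sorted(w for w, acc in scores.items() if acc < min_accuracy)


gold = {"q1": "yes", "q2": "no"}        # hypothetical gold questions
answers = [
    ("w1", "q1", "yes"), ("w1", "q2", "no"), ("w1", "q3", "yes"),
    ("w2", "q1", "no"), ("w2", "q2", "yes"), ("w2", "q3", "yes"),
]
print(worker_accuracy(answers, gold))
print(low_quality_workers(answers, gold, min_accuracy=0.5))
```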
  • 40. Why CrowdFlower? (3) • Report generation and analytics
  • 41. Why NOT CrowdFlower? • At the beginning, clients must wait until their projects are approved by the CrowdFlower staff before publishing them – Wait time: from a couple of hours up to (5)* days • Jobs must be specified in a non-standard language: – CML: CrowdFlower Markup Language • There are certain configurations that cannot be executed on the platform • *Personal experience of the presenter
  • 42. References • AMT. Getting Started Guide. API Version 2012-03-25. http://s3.amazonaws.com/awsdocs/MechTurk/latest/amt-gsg.pdf • The Mechanical Turk Blog. http://mechanicalturk.typepad.com/ • MTurk Java API. http://people.csail.mit.edu/glittle/MTurkJavaAPI/ • CrowdFlower Platform. http://crowdflower.com