Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via a Human Computation Game
1. Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via a Human Computation Game
Irene Celino, Dario Cerizza, Simone Contessa,
Marta Corubolo, Daniele Dell’Aglio, Emanuele Della Valle,
Stefano Fumeo and Federico Piccinini
Semantic Web Challenge @ ISWC 2012 - 2012/11/14
3. What about those citizens?
Gareth 1953 http://www.flickr.com/photos/gareth1953/6786545520/
Boris van Hoytema http://www.flickr.com/photos/borisvanhoytema/685879933/
paul_houle http://www.flickr.com/photos/paul_houle/3301438074/
graziano88 http://www.flickr.com/photos/8482460@N06/6884509346/
4. Citizen Computation
Human Computation: exploiting human capabilities to solve computational tasks difficult for machines
Citizen Science: exploiting volunteers to collect scientific data or to conduct experiments "in the world"
Citizen Computation: exploiting human capabilities to contribute to a mixed computational system by living "in the world"
5. Human Computation Games with a Purpose
Purpose within the game:
Purpose outside the game: you help image search engines by manually tagging images
6. Citizen Computation Game with a Purpose
Purpose within the game: create your venues' portfolio and become the greatest landlord ever!
Purpose outside the game: collect and verify information about your city by playing with the neighborhood around you
http://bit.ly/urbanopoly
7. Urbanopoly – high-level view
Game purpose: check and correct geo-spatial data
from pre-existing sources + collect missing data
Sources: LinkedGeoData + Lombardia Open Data
(1) bootstrap of "venues" data
(2) data about venues as missions, delivered to players through a game to buy / sell venues with missions
(3) GWAP approach to consolidate data
(4) verified / improved data + new data
8. Urbanopoly Input Data
OpenStreetMap (OSM)
http://www.openstreetmap.org/
via LinkedGeoData (LGD)
http://linkedgeodata.org/
data as linked data, described by an ontology
Lombardia Open Data
https://dati.lombardia.it
data about "agriturismo" places as CSV converted to RDF
Urbanopoly data bootstrap: venues are "instances" of
selected LGD "classes" with their OSM tags as features,
thus Urbanopoly data are RDF statements of the form:
<venue> <feature> <value>
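The bootstrap described on the input-data slide can be illustrated with a minimal Python sketch that turns one venue's OSM tags into statements of the form <venue> <feature> <value>. The namespace `FEATURE_NS`, the function `venue_triples`, and the sample venue are hypothetical and chosen only for illustration; they are not the URIs or code actually used by Urbanopoly or LinkedGeoData.

```python
# Hypothetical namespace for features; the real data would use LGD/OSM vocabulary.
FEATURE_NS = "http://example.org/feature/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def venue_triples(venue_uri, lgd_class_uri, osm_tags):
    """Return (subject, predicate, object) tuples for one venue.

    The venue is typed as an instance of the selected class, and each
    OSM tag becomes one <venue> <feature> <value> statement.
    """
    triples = [(venue_uri, RDF_TYPE, lgd_class_uri)]
    for key, value in sorted(osm_tags.items()):
        triples.append((venue_uri, FEATURE_NS + key, value))
    return triples


# Illustrative input only; not taken from the Urbanopoly dataset.
example = venue_triples(
    "http://example.org/venue/1",
    "http://example.org/class/Restaurant",
    {"name": "Da Mario", "cuisine": "pizza"},
)
for triple in example:
    print(triple)
```

Plain tuples keep the sketch free of dependencies; an RDF library would be the natural choice when the statements need to be serialized or queried.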
10. Urbanopoly mini-games for Data Collection
data acquisition challenges as contributions to an advertising campaign (left: inserting a value, right: taking a picture)
data validation challenges to check pre-existing data or other players' contributions (left: answering a quiz, right: rating a poster)
11. Urbanopoly Data Consolidation
Each statement has a confidence score:
{ <venue> <feature> <value> . } <confidence>
which indicates the probability that the statement is true
Each player action is taken as evidence for the associated knowledge and alters the confidence score
A weighted majority voting algorithm aggregates the evidence, weighting each contribution by:
Player's reputation (e.g., number of errors)
Difficulty of acquiring the contribution (e.g., typing vs. check box)
Player's distance from the venue at contribution time (as sensed by the device)
When the confidence score exceeds a threshold, the triple <venue> <feature> <value> is consolidated
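The consolidation step can be sketched in Python as a weighted vote over the values proposed for one <venue> <feature> pair. This is only an illustration under stated assumptions: the slide names the three weighting factors but not how they are combined, so the product used in `weight`, the linear distance decay, the 500 m cut-off, and the 0.8 threshold are all hypothetical choices, not the authors' algorithm.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    value: str         # value the player supported for <venue> <feature>
    reputation: float  # in [0, 1]; lower for players with many past errors
    difficulty: float  # in [0, 1]; e.g. typing scores higher than a check box
    distance_m: float  # player's distance from the venue, in metres


def weight(e, max_distance_m=500.0):
    """Hypothetical weight: product of the three factors named on the slide."""
    proximity = max(0.0, 1.0 - e.distance_m / max_distance_m)
    return e.reputation * e.difficulty * proximity


def confidences(evidences):
    """Share of the total weight backing each candidate value."""
    totals = defaultdict(float)
    for e in evidences:
        totals[e.value] += weight(e)
    total = sum(totals.values())
    return {v: w / total for v, w in totals.items()} if total > 0 else {}


def consolidate(evidences, threshold=0.8):
    """Return the value whose confidence exceeds the threshold, else None."""
    scores = confidences(evidences)
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] > threshold else None


# Illustrative evidence only; not data from the game.
votes = [
    Evidence("pizzeria", reputation=0.9, difficulty=1.0, distance_m=50.0),
    Evidence("pizzeria", reputation=0.8, difficulty=0.5, distance_m=100.0),
    Evidence("pub", reputation=0.4, difficulty=0.5, distance_m=400.0),
]
print(consolidate(votes))
```

With these sample votes, "pizzeria" gathers most of the total weight and is consolidated; two equally weighted, conflicting votes would leave the statement unconsolidated.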
12. Urbanopoly Data Publication
True statements published as linked open data
If a statement's confidence exceeds the threshold, the statement is asserted: <venue> <feature> <value> (as in LGD/OSM)
But there's more interesting information to publish!
False statements, statements' confidence, provenance info, etc.
We published this further knowledge as annotations to the reified <venue> <feature> <value> statements
We created a Human Computation ontology (http://swa.cefriel.it/ontologies/hc) extending the W3C PROV-O ontology
Cf. http://swa.cefriel.it/linkeddata/
[Ontology diagram. Classes: Contribution, Contributor, Human Computation Task, Human Computation Algorithm, Consolidated Information. PROV-O classes extended: provo:Entity, provo:Agent, provo:Activity. Properties: contributionFrom, solvedBy, solutionTo, aggregatedFrom, aggregatedBy, enabledBy]
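Annotating a reified statement, as described on the data publication slide, can be sketched in Python with the standard RDF reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate, rdf:object). The `EX` namespace, the `confidence` property, and the function `reify` are hypothetical placeholders; the properties actually published are defined by the Human Computation ontology and are not reproduced here.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"
EX = "http://example.org/urbanopoly/"  # hypothetical namespace


def reify(statement_id, venue, feature, value, confidence):
    """Return N-Triples lines for one reified statement plus a confidence annotation."""
    s = f"<{EX}statement/{statement_id}>"
    return [
        f"{s} <{RDF}type> <{RDF}Statement> .",
        f"{s} <{RDF}subject> <{venue}> .",
        f"{s} <{RDF}predicate> <{feature}> .",
        f'{s} <{RDF}object> "{value}" .',
        # Hypothetical annotation property, used here only for illustration.
        f'{s} <{EX}confidence> "{confidence}"^^<{XSD_DOUBLE}> .',
    ]


# Illustrative values only.
lines = reify(
    "42",
    "http://example.org/venue/1",
    "http://example.org/feature/cuisine",
    "pizza",
    0.35,
)
print("\n".join(lines))
```

Because the annotation hangs off the statement resource rather than the venue, a low-confidence or false statement can be described without asserting it.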
13. Urbanopoly Evaluation (1/2)
"Enjoyability" of the game (engagement potential):
Average life play: ALP = Played Time / Active Players
~ 100 minutes: very good result
"Effectiveness" of the GWAP mechanism:
Throughput = Solved Problems / Played Time
~ 287 collected evidences / hour: very good
~ 5 consolidated statements / hour: can be improved
"Precision" of the results (measured on a subset of the results):
Accuracy = ( (P - FP) + (N - FN) ) / (P + N)
~ 92 %: very good result
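The three measures on this slide can be written as small Python functions. This sketch assumes that P and N are the numbers of statements the game judged true and false, and that FP and FN are the errors among them, so that P - FP and N - FN are the correct judgments. The input numbers are made up for illustration and are not the evaluation data.

```python
def average_life_play(played_minutes, active_players):
    """ALP = Played Time / Active Players (minutes per player)."""
    return played_minutes / active_players


def throughput(solved_problems, played_hours):
    """Throughput = Solved Problems / Played Time (problems per hour)."""
    return solved_problems / played_hours


def accuracy(p, fp, n, fn):
    """Accuracy = ((P - FP) + (N - FN)) / (P + N)."""
    return ((p - fp) + (n - fn)) / (p + n)


# Illustrative inputs only.
print(average_life_play(played_minutes=600, active_players=5))
print(throughput(solved_problems=300, played_hours=2))
print(accuracy(p=50, fp=4, n=50, fn=6))
```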
14. Urbanopoly Evaluation (2/2)
"Playability" of the game
Evaluation survey at http://bit.ly/u-survey, with questions about
usability, social aspects, physical presence, motivation, etc.
Feedback very encouraging
"Sociability" through Facebook channel
With Facebook Insights (http://www.facebook.com/insights/),
tracking of installs, demographics,
log-ins, content sharing, etc.
Example of a published "story" on the Facebook Timeline:
Statistics about "stories" and "impressions":
Interesting results, but channel to be further exploited
15. Conclusions
Urbanopoly is an end-user mobile application with
a multi-language attractive user interface
Urbanopoly manages urban data at a real scale
(ca. 50,000 venues) from heterogeneous sources
The meaning of data is core to the application and
consolidated data are published as linked open data
Urbanopoly is aimed at geo-spatial data collection and
quality assurance, especially for dynamic data
Our rigorous evaluation shows the high accuracy of results
and feasibility of the approach
Urbanopoly shows a clear commercial potential: further
data collection or validation needs can be added as further
mini-games or challenges within the game
16. Thanks for your attention!
Questions?
Keep on playing
Urbanopoly!
Irene Celino – CEFRIEL, ICT Institute Politecnico di Milano
email: Irene.Celino@cefriel.it – web: http://swa.cefriel.it
slides at: http://www.slideshare.net/iricelino