Aaai2012
1. Crowdsourcing tasks in open query answering
Elena Simperl (1), Barry Norton (2), Denny Vrandecic (1)
(1) Institute AIFB, Karlsruhe Institute of Technology, Germany
(2) Ontotext AD, Bulgaria
Institute of Applied Informatics and Formal Description Methods (AIFB)
KIT – University of the State of Baden-Wuerttemberg and
National Research Center of the Helmholtz Association www.kit.edu
2. Background: what is Linked Data?
Linked Data: set of best practices to publish and connect structured data on the Web.
URIs to identify entities and concepts in the world
HTTP to access and retrieve resources and descriptions of these resources
RDF as generic graph-based data model to structure and link data
Taken together Linked Data is said to form a ‘cloud’ of shared references and vocabularies.
http://linkeddata.org/faq
2 07.06.2012 Institut für Angewandte Informatik und Formale
Beschreibungsverfahren (AIFB)
3. Background: why is Linked Data important?
Data.gov & public sector information: more transparency and accountability in governance
BBC & media: added value of content through interlinking
Google, Yahoo, Bing & schema.org: enhanced search
4. Crowdsourcing Linked Data management
Tasks requiring human contributions
Interlinking
Conceptual modeling
Labeling and translation
Classification
Ordering
Crowdsourcing already in use
5. Example: open query answering
Query FOAF data using the vCard vocabulary
hp:Harry foaf:mbox <mailto:scarface@hogwarts.ac.uk> ;
foaf:nick "Harry" ; foaf:familyName "Potter" .
SELECT ?name ?email WHERE
{ ?p vcard:email ?email ; vcard:fn ?name }
To answer the query as intended, two steps are needed:
Vocabulary mapping and entity resolution (FOAF to vCard)
Metadata completion (full name is “Harry Potter”)
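The vocabulary mapping and metadata completion steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are modeled as plain tuples, and the `VOCAB_MAPPING` table, the helper names, and the rule of joining nick and family name are hypothetical stand-ins for answers that crowd workers would supply.

```python
# Triples from the slide, modeled as (subject, predicate, object) tuples.
foaf_data = [
    ("hp:Harry", "foaf:mbox", "mailto:scarface@hogwarts.ac.uk"),
    ("hp:Harry", "foaf:nick", "Harry"),
    ("hp:Harry", "foaf:familyName", "Potter"),
]

# Hypothetical vocabulary mapping (FOAF to vCard), e.g. confirmed by workers.
VOCAB_MAPPING = {"foaf:mbox": "vcard:email"}


def map_vocabulary(triples, mapping):
    """Add a vCard triple for every FOAF triple whose predicate is mapped."""
    mapped = [(s, mapping[p], o) for s, p, o in triples if p in mapping]
    return triples + mapped


def complete_full_name(triples):
    """Metadata completion: derive vcard:fn from nick and family name.

    Assumption: the nick happens to be the given name, as in the example.
    """
    nick = {s: o for s, p, o in triples if p == "foaf:nick"}
    family = {s: o for s, p, o in triples if p == "foaf:familyName"}
    added = [(s, "vcard:fn", f"{nick[s]} {family[s]}") for s in nick if s in family]
    return triples + added


def answer_query(triples):
    """Evaluate { ?p vcard:email ?email ; vcard:fn ?name } over the triples."""
    emails = {s: o for s, p, o in triples if p == "vcard:email"}
    names = {s: o for s, p, o in triples if p == "vcard:fn"}
    return [(names[s], emails[s]) for s in emails if s in names]


enriched = complete_full_name(map_vocabulary(foaf_data, VOCAB_MAPPING))
print(answer_query(foaf_data))  # the raw FOAF data contains no vCard triples
print(answer_query(enriched))
```

The query returns nothing on the raw FOAF data and one (name, email) row once both steps have been applied, which is the gap that human contributions are meant to close.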
6. Crowdsourcing-enabled query answering
• Integral part of a query engine
At design time, the application developer specifies which data portions workers can process and via which types of HITs (Human Intelligence Tasks)
At run time:
The system materializes the data
Workers process it
Data and application are updated to reflect crowdsourcing results
Formal, declarative description of the data and tasks using SPARQL patterns as a basis for the automatic design of HITs
Reducing the number of tasks through automatic reasoning
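The design-time and run-time split described above might look roughly like the following Python sketch. It is not the authors' system: `HitTemplate`, `materialize`, `run_hits`, and `update_dataset` are hypothetical names, the worker is simulated by a plain function, and the sample bindings and the class `ex:AirportStation` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HitTemplate:
    """Design-time declaration: which data workers see and what they produce."""
    name: str
    input_vars: tuple      # variables of the input pattern shown to workers
    output_predicate: str  # predicate of the triple a worker may assert


def materialize(bindings, template):
    """Run time, step 1: turn each query solution into one task."""
    return [{v: row[v] for v in template.input_vars} for row in bindings]


def run_hits(tasks, worker):
    """Run time, step 2: a worker (here a plain function) answers each task."""
    return [(task, worker(task)) for task in tasks]


def update_dataset(dataset, results, template, subject_var):
    """Run time, step 3: write non-empty answers back as triples."""
    new = {(task[subject_var], template.output_predicate, answer)
           for task, answer in results if answer is not None}
    return dataset | new


# Made-up sample data; only the declared input variables reach the worker.
template = HitTemplate("classify-station", ("station", "label"), "rdf:type")
bindings = [
    {"station": "ex:s1", "label": "Example Airport", "internal_rank": 7},
    {"station": "ex:s2", "label": "Unknown", "internal_rank": 3},
]


def simulated_worker(task):
    """Stand-in for a human answer; returns None when unsure."""
    return "ex:AirportStation" if "Airport" in task["label"] else None


tasks = materialize(bindings, template)
dataset = update_dataset(set(), run_hits(tasks, simulated_worker), template, "station")
print(dataset)
```

Keeping the template declarative is what would let a system generate the task form automatically instead of hand-coding one per query.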
7. Example: Identity resolution
Identity resolution involves the creation of links, either by comparison of metadata or by investigation of links on the human Web.
Input: {?station a metar:Station;
rdfs:label ?slabel;
wgs84:lat ?slat;
wgs84:long ?slong .
?airport a dbp-owl:Airport;
rdfs:label ?alabel;
wgs84:lat ?alat;
wgs84:long ?along}
Output: {OPTIONAL
{?airport owl:sameAs ?station}}
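One way to read the input and output patterns above is as a candidate-generation step followed by a human confirmation step. The Python sketch below assumes that candidates are pre-filtered by geographic distance, which the slide does not state; the function names, the 10 km threshold, and the sample coordinates are all hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two WGS84 points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def candidate_pairs(stations, airports, max_km=10.0):
    """Pair each station with nearby airports; each pair becomes one HIT."""
    return [(s["id"], a["id"])
            for s in stations for a in airports
            if haversine_km(s["lat"], s["long"], a["lat"], a["long"]) <= max_km]


def same_as_triples(pairs, worker_says_same):
    """Output pattern: emit ?airport owl:sameAs ?station only when confirmed."""
    return [(a, "owl:sameAs", s) for s, a in pairs if worker_says_same(s, a)]


# Made-up sample entities.
stations = [
    {"id": "ex:station1", "label": "Station A", "lat": 50.0, "long": 8.0},
    {"id": "ex:station2", "label": "Station B", "lat": 40.0, "long": -3.0},
]
airports = [{"id": "ex:airport1", "label": "Airport A", "lat": 50.01, "long": 8.01}]

pairs = candidate_pairs(stations, airports)
links = same_as_triples(pairs, lambda s, a: True)  # simulated "yes" from a worker
print(pairs, links)
```

A pair that the worker rejects simply yields no triple, which mirrors the OPTIONAL in the output pattern.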
8. Example: Classification
The classification of entities into classes cannot always be inferred automatically from the schema.
Input: {?station a metar:Station;
rdfs:label ?label;
wgs84:lat ?lat;
wgs84:long ?long}
Output: {?station a ?type.
?type rdfs:subClassOf
metar:Station}
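The classification pattern above could be turned into a multiple-choice task along the following lines. This Python sketch is illustrative only: the candidate subclasses, the helper names, and the use of a simple majority vote to combine worker answers are assumptions, not something the slides prescribe.

```python
from collections import Counter


def build_classification_hit(station, candidate_types):
    """Render the input bindings plus the allowed answers as one task."""
    return {
        "question": f"Which kind of station is '{station['label']}'?",
        "location": (station["lat"], station["long"]),
        "options": sorted(candidate_types),
    }


def aggregate(answers, options, min_votes=2):
    """Majority vote over valid answers; None if no option reaches min_votes."""
    votes = Counter(a for a in answers if a in options)
    if not votes:
        return None
    winner, count = votes.most_common(1)[0]
    return winner if count >= min_votes else None


def output_triples(station_id, chosen_type):
    """Output pattern: ?station a ?type . ?type rdfs:subClassOf metar:Station"""
    if chosen_type is None:
        return []
    return [(station_id, "rdf:type", chosen_type),
            (chosen_type, "rdfs:subClassOf", "metar:Station")]


# Hypothetical subclasses and made-up worker answers.
options = {"ex:AirportStation", "ex:BuoyStation"}
station = {"id": "ex:s1", "label": "Station A", "lat": 50.0, "long": 8.0}
hit = build_classification_hit(station, options)
answers = ["ex:AirportStation", "ex:BuoyStation", "ex:AirportStation", "spam"]
chosen = aggregate(answers, hit["options"])
print(output_triples(station["id"], chosen))
```

Answers outside the offered options are ignored, which gives a very basic guard against spam; ties are not handled specially in this sketch.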
9. Challenges
Decomposition of queries
Query optimisation obscures which data is actually used, and should account for the cost of human tasks
Query execution and caching
Naively, we can materialise HIT results into datasets
How to deal with partial coverage and dynamic datasets?
Appropriate level of granularity in HIT design for specific SPARQL constructs and typical functionality of Linked Data management components
Optimal user interfaces for graph-like content
(Contextual) rendering of LOD entities and tasks
Pricing and worker assignment
Can we connect the end users of an application, and their wish to consume specific data, with the payment of workers and the prioritization of HITs?
Dealing with spam / gaming
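The caching challenge can be made concrete with a small Python sketch of the naive approach: store each HIT result and, for a new query, separate the inputs that are already covered from those that still need a task. The class and method names are hypothetical, and invalidating an entry when its entity changes is just one possible answer to the dynamic-dataset question.

```python
class HitResultCache:
    """Naive store of crowdsourced answers keyed by task type and input."""

    def __init__(self):
        self._answers = {}

    def put(self, task_type, key, answer):
        """Materialise one HIT result."""
        self._answers[(task_type, key)] = answer

    def split(self, task_type, keys):
        """Return (known answers, keys that still need a HIT)."""
        known = {k: self._answers[(task_type, k)]
                 for k in keys if (task_type, k) in self._answers}
        missing = [k for k in keys if (task_type, k) not in self._answers]
        return known, missing

    def invalidate(self, task_type, key):
        """Drop an answer when the underlying entity has changed."""
        self._answers.pop((task_type, key), None)


cache = HitResultCache()
cache.put("classify", "ex:s1", "ex:AirportStation")  # made-up result
known, missing = cache.split("classify", ["ex:s1", "ex:s2"])
print(known, missing)
```

Only the keys in `missing` would have to be sent to workers, so partial coverage translates directly into fewer paid tasks.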
10. QUESTIONS