CROSER
1. Cross-language Linking of eGov Services to the LOD Cloud
Fedelucio Narducci*, Matteo Palmonari*, Giovanni Semeraro°
*DISCO, University of Milan-Bicocca, Italy
°Department of Computer Science, University of Bari Aldo Moro, Italy
Semantic Web Access & Personalization research group “Antonio Bello”
http://www.di.uniba.it/~swap
AI for Smart Cities Workshop
AI*IA 2013 - 25th Year Anniversary
XIII Conference
Turin (Italy), December 5, 2013
2. EC 6 axes to evaluate ‘city smartness’
http://www.smart-cities.eu/model.html
3. Cross-language Linking of Open Government Data
• A large amount of Open Government Data in many languages*:
o 1,000,000+ datasets published online (February 2013)
o 40 different countries
o 24 different languages
*http://logd.tw.rpi.edu/iogds_data_analytics
4. Cross-language Linking of eGov Service Descriptions
• Government service catalogs are part of the LOD cloud
o Effective Service Delivery (ESD)-toolkit
o European Local Government Service List (LGSL)
• 2000+ interlinked public services in 6 languages
5. Cross-language Linking of eGov Services
Why is it useful?
• Advantages for PAs
o Compare local service offerings with best practices in other countries
o Support interoperability among PAs of different countries and other
service providers
o Enrich service descriptions with additional information via links to LGSL
(e.g., link to life event ontologies)
• Advantages for citizens
o Find eGov services when in a foreign country
o Towards cross-language service access
Costly and error-prone activity
Catalogs of several hundred services
6. Cross-language Linking of eGov Services
Why is it challenging?
≈ sameAs links
• Semantic heterogeneity
o not a mere “translation” problem
o cultural bias
• Ultra-short descriptions
Challenging cross-language matching problem
Most of the approaches:
• use structural information [Spohr et al. 2011, Fu et al. 2011, Wang et al. 2009] or long textual descriptions [Knoth et al. 2011]
• or report problems when automatic translation returns descriptions with heterogeneous vocabulary [Hertling & Paulheim 2012]
7. CroSeR: Cross-language Service Retriever
• CroSeR
o A Web tool to support the linkage of a source eGov service catalog represented in any language to a target catalog represented in English
o Based on Machine Translation and Explicit Semantic Analysis (ESA)
TRY IT @ http://siti-rack.siti.disco.unimib.it:8080/croser/
12. CroSeR: Cross-language Service Retriever
• Load a catalog
• Select a source service
• Look at the retrieved services (link recommendations)
• Link: SKOS broader / exact / narrower match
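The final "Link" step can be pictured as writing one SKOS mapping triple per confirmed recommendation. The sketch below is only an illustration, not CroSeR's actual output format: the `skos_link` helper and the `example.org` URIs are hypothetical, and it assumes the slide's "broader / exact / narrower match" correspond to the standard SKOS properties `skos:broadMatch`, `skos:exactMatch` and `skos:narrowMatch`.

```python
# Namespace of the SKOS vocabulary (W3C).
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Assumed correspondence between the slide's link types and SKOS properties.
MAPPING_PROPERTIES = {
    "broader": "broadMatch",
    "exact": "exactMatch",
    "narrower": "narrowMatch",
}


def skos_link(source_uri: str, relation: str, target_uri: str) -> str:
    """Return one N-Triples line linking a source service to a target service.

    Hypothetical helper: `relation` is one of "broader", "exact", "narrower".
    In SKOS, `A skos:broadMatch B` states that B is the broader concept.
    """
    prop = MAPPING_PROPERTIES[relation]
    return f"<{source_uri}> <{SKOS}{prop}> <{target_uri}> ."


# Example with made-up URIs for a source service and an LGSL service.
triple = skos_link(
    "http://example.org/source-catalog/service/42",
    "exact",
    "http://example.org/lgsl/service/7",
)
print(triple)
```

Keeping the link type as a small closed vocabulary makes it hard to emit a property that SKOS does not define.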
15. Experiments: Discussion
• CroSeR finds matchings that cannot be discovered by machine translation + keyword comparison
• CroSeR’s recommendations can help users refine the links
[Figure label: GT: “Absentee Ballot”]
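To see why keyword comparison alone misses such matchings, consider a naive baseline that measures word overlap between a translated source label and a target label. This is a minimal sketch of such a baseline, assuming Jaccard overlap on lowercase word sets; it is not the baseline used in the experiments. The two label pairs are the ones quoted in the editor's notes.

```python
import re


def tokens(text: str) -> set:
    """Lowercase word set of a label (letters only)."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Jaccard overlap between the word sets of two labels."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


# Translated source label vs. the target label it should be linked to.
pairs = [
    ("Arbitrations and conciliations", "Legal - litigation support"),
    ("Absentee Ballot", "Postal Voting"),
]
for translated, target in pairs:
    print(translated, "|", target, "|", jaccard(translated, target))
```

Both pairs share no word at all, so a purely lexical comparison scores them at zero even though the services are related.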
16. References (for details)
• Model and Experimental Evaluation @ISWC 2013
o F. Narducci, M. Palmonari, G. Semeraro. Cross-Language Semantic Retrieval and Linking of E-Gov Services. The Semantic Web - ISWC 2013 - 12th International Semantic Web Conference, Sydney, NSW, Australia, October 21-25, 2013, LNCS 8219, 130-145, Springer, 2013
• System Demo @ECIR 2014
o F. Narducci, M. Palmonari, G. Semeraro. CroSeR: Cross-language Semantic Retrieval of Open Government. 36th European Conference on Information Retrieval, Amsterdam, the Netherlands, April 13-16, 2014. To Appear
Editor's notes
The matching method adopted by CroSeR to provide meaningful link recommendations to the user is based on three main steps:
• Service descriptions in languages other than English are translated into English using machine translation tools (Bing APIs).
• The translated descriptions are analyzed by CroSeR using Explicit Semantic Analysis and indexed using the Vector Space Model, so each service is represented as a vector of weights in a multi-dimensional space.
• Given an input service, the top-k best matching services in the target catalog are retrieved; the matching score for a pair of source and target services is the cosine similarity between their descriptions (both represented in the Vector Space Model).
Explicit Semantic Analysis describes documents in a Vector Space Model defined using Wikipedia as reference knowledge.
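The retrieval step can be sketched in a few lines. The example below is a minimal illustration, not CroSeR's implementation: it assumes each service has already been translated and turned into a sparse vector of Wikipedia-concept weights, and the concept names, weights, and the `cosine` and `top_k` helpers are all invented for the example.

```python
import math
from typing import Dict, List, Tuple

# Sparse vector: concept name -> weight (stand-in for an ESA representation).
Vector = Dict[str, float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(concept, 0.0) for concept, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k(source: Vector, targets: Dict[str, Vector], k: int = 3) -> List[Tuple[str, float]]:
    """Rank target services by cosine similarity to the source and keep the best k."""
    scored = [(name, cosine(source, vec)) for name, vec in targets.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy data: the weights are made up purely for illustration.
source = {"Election": 0.8, "Postal voting": 0.6}
targets = {
    "Postal Voting": {"Postal voting": 0.9, "Election": 0.4},
    "Proxy Voting": {"Proxy voting": 0.9, "Election": 0.5},
    "Legal - litigation support": {"Lawsuit": 0.9, "Arbitration": 0.5},
}

ranking = top_k(source, targets, k=2)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because the vectors live in a concept space rather than a word space, two services can score well without sharing any surface word, which is the property the notes attribute to Explicit Semantic Analysis.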
“Arbitrati e conciliazioni” (translated as “Arbitrations and conciliations”) -> “Legal - litigation support”
3 - CroSeR is able to find matchings that cannot be discovered by simple Machine Translation + keyword comparison; 2 examples are shown in the figure (see the descriptions in the uppermost figure; literal translations into English are shown in round brackets).
4 - We also notice that several CroSeR recommendations are meaningful even when they are not correct with respect to the gold standard. As an example, “Briefwahl”, literally translated as “Absentee Ballot”, has been mapped to “Postal Voting” by domain experts. However (according to Wikipedia), Postal Voting and Proxy Voting are two established practices of Absentee Ballot. In this case, establishing two broaderMatch links, one between Briefwahl and Proxy Voting and one between Briefwahl and Postal Voting, seems a much more appropriate choice. Although no service described as “Absentee Ballot” occurs in the target service catalog, CroSeR recommends both Proxy Voting and Postal Voting as possible matches for Briefwahl. Using CroSeR, the PA user could modify their links and make more appropriate choices.