ISWC 2017, Vienna, Austria
AMUSE: Multilingual Semantic Parsing for
Question Answering over Linked Data
Sherzod Hakimov, Soufian Jebbara & Philipp Cimiano
Semantic Computing Group
CITEC, Bielefeld University
Virtual Assistants
Siri
Google Now
Alexa
Cortana
Multilingual Factoid Question Answering
Problem Definition: Question Answering
Who created Wikipedia?
Wer hat Wikipedia gegründet?
¿Quién creó Wikipedia?
SELECT ?x WHERE {
  dbr:Wikipedia dbo:author ?x .
}
QALD-7 Multilingual Question Answering Dataset, ESWC 2017
• 8 languages
• 215 training instances
• 44 test instances
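The query on this slide relies on the prefixed names dbr: and dbo:. The following is a minimal sketch of how the same query could be written out with explicit prefix declarations, and of the fact that all three questions share one target query. The namespace URIs are the ones commonly used for DBpedia and are an assumption here, and `build_query` is a hypothetical helper, not part of AMUSE.

```python
# Namespace URIs commonly used for DBpedia (assumed, not stated on the slide).
PREFIXES = {
    "dbr": "http://dbpedia.org/resource/",
    "dbo": "http://dbpedia.org/ontology/",
}


def build_query(subject: str, prop: str) -> str:
    """Return a SELECT query asking for the objects of (subject, prop)."""
    header = "\n".join(f"PREFIX {p}: <{uri}>" for p, uri in PREFIXES.items())
    return f"{header}\nSELECT ?x WHERE {{\n  {subject} {prop} ?x .\n}}"


# The three questions from the slide map to the same query.
QUESTIONS = {
    "en": "Who created Wikipedia?",
    "de": "Wer hat Wikipedia gegründet?",
    "es": "¿Quién creó Wikipedia?",
}
query = build_query("dbr:Wikipedia", "dbo:author")
gold = {lang: query for lang in QUESTIONS}
print(query)
```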
Motivation
Who created Wikipedia? Wer hat Wikipedia gegründet? ¿Quién creó Wikipedia?
Universal Dependencies v2, 50 languages
Knowledge Base: DBpedia
• 2016-04 release
• 125 languages
• 754 classes
• 1,103 object properties
• 1,608 datatype properties
Preliminaries
Logical Form
Semantic Composition using Dependency Parse Tree
Factor Graphs
Logical Form: DUDES
• Dependency-based Underspecified Discourse Representation Structures (Cimiano [1])
• Formalism for specifying meaning representations
• Flexible semantic composition with respect to the order of application
• Builds on semantic dependencies, so it is suitable for working with dependency-based syntactic analysis
[1] Cimiano, P.: Flexible semantic composition with DUDES. In: Proceedings of the Eighth International Conference on Computational Semantics, pp. 272-276. Association for Computational Linguistics (2009)
DUDES
A DUDES is a tuple (v, vs, l, drs, slots), where:
• v: the main variable
• vs: a set of variables (possibly empty), the projection variables
• l: the label of the main DRS
• drs: a DRS (Discourse Representation Structure)
• slots: a set of semantic dependencies (possibly empty)
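The five components listed above map naturally onto a small record type. The following is a minimal sketch, not the AMUSE implementation: the class names `Dudes` and `Slot`, the field types, the dependency labels, and the string encoding of DRS conditions are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Slot:
    """One semantic dependency, i.e. an open argument position (hypothetical layout)."""
    variable: str  # variable to be unified with the argument's main variable
    anchor: str    # dependency relation that triggers the slot, e.g. "nsubj"
    number: int    # argument position of the property (1 or 2)


@dataclass
class Dudes:
    v: Optional[str]                               # main variable
    vs: Set[str] = field(default_factory=set)      # projection variables (may be empty)
    l: str = "l0"                                  # label of the main DRS
    drs: List[str] = field(default_factory=list)   # DRS conditions, simplified to strings
    slots: List[Slot] = field(default_factory=list)  # semantic dependencies (may be empty)


# Illustrative entry for the verb "created", linked to dbo:author.
created = Dudes(
    v=None,
    drs=["dbo:author(x, y)"],
    slots=[Slot("x", "dobj", 1), Slot("y", "nsubj", 2)],
)
print(created)
```

Keeping `slots` as an explicit list makes it straightforward to see which argument positions are still open during composition.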
Semantic Composition with DUDES (Bottom Up)
Example: Who created Wikipedia?
Logical Form into SPARQL
Factor Graphs
Observed Variables: Nodes, Relations
Hidden Variables: KB IDs, Logical Forms, Slots
Approach
- Semantic Parsing using dependency parse tree
- Language independent pipeline
- Model based on factor graphs
- SampleRank to learn the feature weights
- Inference strategy based on MCMC (Markov Chain Monte Carlo)
Pipeline components: Inference, Semantic Composition, Query Construction
Inference: Markov Chain Monte Carlo
Stack Decoding
Starting from an initial state, m sampling steps lead to a sampled state.
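The step from an initial state to a sampled state can be sketched as a short sampling loop. This is an illustration under stated assumptions rather than the AMUSE procedure: the candidate KB IDs are toy values, `score` is a stand-in for the model score, and the Metropolis-style acceptance rule is a generic choice that the slides do not confirm.

```python
import math
import random

# Toy candidate KB IDs per parse-tree node (illustrative values only).
CANDIDATES = {
    "created": ["dbo:author", "dbo:creator", "dbo:founder"],
    "Wikipedia": ["dbr:Wikipedia", "dbr:Wikimedia_Foundation"],
}
GOLD = {"created": "dbo:author", "Wikipedia": "dbr:Wikipedia"}


def score(state):
    """Stand-in for the model score: reward each correct assignment."""
    return sum(5.0 for node, uri in state.items() if GOLD[node] == uri)


def propose(state, rng):
    """Change the KB ID of one randomly chosen node."""
    node = rng.choice(sorted(state))
    new_state = dict(state)
    new_state[node] = rng.choice(CANDIDATES[node])
    return new_state


def sample_chain(initial, m, rng):
    """Run m sampling steps from the initial state and return the last state."""
    state = dict(initial)
    for _ in range(m):
        candidate = propose(state, rng)
        delta = score(candidate) - score(state)
        # Always accept improvements; accept worse states with probability exp(delta).
        if delta >= 0 or rng.random() < math.exp(delta):
            state = candidate
    return state


initial_state = {"created": "dbo:founder", "Wikipedia": "dbr:Wikimedia_Foundation"}
sampled_state = sample_chain(initial_state, m=50, rng=random.Random(0))
print(sampled_state)
```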
Inference
Two strategies to explore the search space:
1) Linking to Knowledge Base (L2KB)
• Objective: compare the set of assigned URIs to the expected set of URIs
2) Query Construction (QC)
• Objective: compare the constructed query to the expected query
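The L2KB objective compares two sets of URIs. The slides do not say which overlap measure is used, so the sketch below assumes an F1 score over the two sets; `uri_overlap_f1` is a hypothetical helper name.

```python
def uri_overlap_f1(predicted: set, expected: set) -> float:
    """F1 overlap between the assigned URIs and the expected URIs (assumed measure)."""
    true_positives = len(predicted & expected)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(expected)
    return 2 * precision * recall / (precision + recall)


expected = {"dbr:Wikipedia", "dbo:author"}
print(uri_overlap_f1({"dbr:Wikipedia", "dbo:author"}, expected))
print(uri_overlap_f1({"dbr:Wikipedia", "dbo:creator"}, expected))
```

A score of this kind gives partial credit when only some nodes are linked correctly, which is useful for ranking intermediate states during sampling.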
Linking to Knowledge Base (L2KB)
Explore the edges and assign Knowledge Base IDs based on lemmas of nodes
Check both triple patterns against the knowledge base:
• dbr:Wikipedia dbo:author ?x (the resource fills slot 1)
• ?x dbo:author dbr:Wikipedia (the resource fills slot 2)
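The slot check can be sketched as follows: for a linked resource and property, build the pattern for each slot and keep the slots whose pattern has a match. The in-memory `TOY_KB` and the helper names are assumptions for illustration; a real system would query the DBpedia endpoint instead.

```python
# Toy knowledge base of (subject, property, object) triples; illustrative only.
TOY_KB = {
    ("dbr:Wikipedia", "dbo:author", "dbr:Jimmy_Wales"),
}


def triple_pattern(resource: str, prop: str, slot: int, var: str = "?x") -> str:
    """Build the triple pattern in which `resource` fills the given slot."""
    if slot == 1:
        return f"{resource} {prop} {var} ."
    if slot == 2:
        return f"{var} {prop} {resource} ."
    raise ValueError("slot must be 1 or 2")


def valid_slots(resource: str, prop: str, kb=TOY_KB) -> list:
    """Return the slots for which the knowledge base contains a matching triple."""
    slots = []
    if any(s == resource and p == prop for s, p, _ in kb):
        slots.append(1)
    if any(o == resource and p == prop for _, p, o in kb):
        slots.append(2)
    return slots


for slot in (1, 2):
    print(slot, triple_pattern("dbr:Wikipedia", "dbo:author", slot))
print(valid_slots("dbr:Wikipedia", "dbo:author"))
```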
Query Construction (QC)
Input: sampled state(s) from L2KB
Assign DUDES with a return variable and a KB ID to nodes, then assign the remaining slots
Features
Edge-based:
• Lemma, POS, dependency relation, slot number, KB ID, rdfs:domain/range restrictions
• Similarity (lemma, KB ID)
Addressing The Lexical Gap
Who is the writer of The Hunger Games?
• writer → dbo:author
Give me all movies with Tom Cruise.
• movies → rdf:type dbo:Film
DBpedia Labels
• rdfs:label translated from English to Spanish and German using dict.cc
MATOLL [1]
• English, German and Spanish lexica
Word Embeddings [2]
• Skip-gram model with 100 dimensions trained on Wikipedia text (for the 3 languages)
• Cosine similarity between rdfs:label and mention text
[1] Walter, S., Unger, C., Cimiano, P.: DBlexipedia: A nucleus for a multilingual lexical Semantic Web. In: Proceedings of the 3rd International Workshop on NLP and DBpedia, co-located with ISWC 2015
[2] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems (2013)
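The cosine-similarity feature between an rdfs:label and the mention text can be sketched as below. The 3-dimensional vectors are toy values standing in for the 100-dimensional skip-gram vectors, and averaging token vectors to represent a phrase is an assumption, since the slides do not say how multi-word labels are handled.

```python
import math

# Toy 3-dimensional vectors; the slides use 100-dimensional skip-gram vectors.
EMBEDDINGS = {
    "writer": [0.9, 0.1, 0.0],
    "author": [0.8, 0.2, 0.0],
    "film": [0.0, 0.1, 0.9],
}


def phrase_vector(text):
    """Average the vectors of the known tokens in `text`; None if none are known."""
    vectors = [EMBEDDINGS[t] for t in text.lower().split() if t in EMBEDDINGS]
    if not vectors:
        return None
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity(mention, label):
    """Similarity between a mention and an rdfs:label; 0.0 for unknown words."""
    u, v = phrase_vector(mention), phrase_vector(label)
    if u is None or v is None:
        return 0.0
    return cosine(u, v)


print(similarity("writer", "author"), similarity("writer", "film"))
```

With vectors like these, "writer" scores far closer to the label of dbo:author than to the label of dbo:Film, which is the behaviour the lexical-gap slide relies on.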
Evaluation
• Lexicon
• QA
Evaluation: Lexicon
- Gold standard: manually annotated lexicon (only DBpedia Ontology properties and classes)
- [Results table for English, German and Spanish]
Evaluation: Question Answering
- Model trained and tested on QALD-6
- L2KB and QC evaluated separately
- DBlex: MATOLL lexica; Dict: manually created lexica
System Errors
• Property (48%)
Who wrote the song Hotel California? dbo:musicalArtist is chosen for the song instead of dbo:writer.

• Resource (30%)
When did the Boston Tea Party take place? The resource was not found.

• Query Type (12%)
Where does Piccadilly start? The system wrongly infers that this is an ASK query.

• Slot (10%)
How many people live in Poland? Poland is inferred to fill the 2nd slot instead of the 1st slot of dbo:populationTotal.
Conclusion
- Multilingual semantic parsing approach based on factor graphs
- The model generalises well even when trained on only 161 instances
- Language independent
Future directions
- Improve results by adding further inference layers, e.g. query type classification
- Apply different ranking functions, e.g. regression or pairwise state comparison
- Add more lexical knowledge from other sources
- Paraphrase the training questions to learn from more data
Thanks!
@sherzodhakimov
shakimov AT techfak.uni-bielefeld.de
Approach
Observed Variables: dependency parse tree nodes, relations
Hidden Variables: assigned KB IDs, DUDES, slots
Factor Graphs
Formal Definition:
• A factor graph G consists of variables V and factors Ψ. Variables can be subdivided into observed variables x_i and hidden variables y_i.
• A factor Ψ_i connects subsets of observed and hidden variables and computes a scalar score based on a feature vector f_i(x_i, y_i) and a set of parameters θ_i:
$$\Psi_i = \exp\big(f_i(x_i, y_i) \cdot \theta_i\big)$$
• The probability of y for a given input x can be calculated as:
$$p(y \mid x; \theta) = \frac{1}{Z(x)} \prod_{\Psi_i \in G} \exp\big(f_i(x_i, y_i) \cdot \theta_i\big)$$
where Z(x) is the normalisation constant.
What is the goal?
Find the y that maximizes the posterior distribution p(y|x; θ).
[1] Hakimov, S., ter Horst, H., Jebbara, S., Hartung, M., Cimiano, P.: Combining Textual and Graph-based Features for Named Entity Disambiguation Using Undirected Probabilistic Graphical Models.
[2] Kschischang, F.R., Frey, B.J., Loeliger, H.A.: Factor Graphs and the Sum-Product Algorithm.
[3] ter Horst, H., Hartung, M., Cimiano, P.: Joint Entity Recognition and Linking in Technical Domains Using Undirected Probabilistic Graphical Models. LDK 2017
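The goal stated above can be illustrated on a tiny example. The sketch assumes the usual log-linear form, in which each factor contributes exp(f_i · θ_i) and the posterior is the normalised product of factor scores. The feature names, weights and candidate assignments are hypothetical. Explicit normalisation is only feasible for a handful of candidates, which is one reason sampling-based inference is used in practice.

```python
import math


def factor_score(features, theta):
    """Score of one factor: exp of the dot product of features and weights."""
    return math.exp(sum(theta.get(name, 0.0) * value for name, value in features.items()))


def posterior(candidates, theta):
    """Return p(y | x; theta) for every candidate y by explicit normalisation."""
    unnormalised = {}
    for y, factors in candidates.items():
        product = 1.0
        for features in factors:
            product *= factor_score(features, theta)
        unnormalised[y] = product
    z = sum(unnormalised.values())  # normalisation constant Z(x)
    return {y: s / z for y, s in unnormalised.items()}


# Hypothetical weights and per-factor features for the node "wrote".
theta = {"lemma_label_similarity": 2.0, "domain_range_ok": 1.0}
candidates = {
    "dbo:writer": [{"lemma_label_similarity": 0.8}, {"domain_range_ok": 1.0}],
    "dbo:musicalArtist": [{"lemma_label_similarity": 0.3}, {"domain_range_ok": 0.0}],
}
p = posterior(candidates, theta)
best = max(p, key=p.get)  # the y that maximises the posterior
print(best, p)
```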
