Towards Constraint-based Explanations for Answers and Non-Answers
Boris Glavic, Illinois Institute of Technology
Sean Riddle, Athenahealth Corporation
Sven Köhler, University of California Davis
Bertram Ludäscher, University of Illinois Urbana-Champaign
Outline
①  Introduction
②  Approach
③  Explanations
④  Generalized Explanations
⑤  Computing Explanations with Datalog
⑥  Conclusions and Future Work
Overview
•  Introduce a unified framework for generalizing explanations for answers and non-answers
•  Why/why-not question Q(t)
•  Why is tuple t (not) in the result of query Q?
•  Explanation
•  Provenance for the answer/non-answer
•  Generalization
•  Use an ontology to summarize and generalize explanations
•  Computing generalized explanations for UCQs
•  Use Datalog
Train-Example
•  2hop(X,Y) :- Train(X,Z), Train(Z,Y).
•  Why can’t I reach Berlin from Chicago?
•  Why-not 2hop(Chicago,Berlin)

From            To
New York        Washington DC
Washington DC   New York
New York        Chicago
Chicago         New York
…               …
Berlin          Munich
Munich          Berlin
…               …

[Figure: map showing Seattle, Chicago, Washington DC, New York, Paris, Berlin, and Munich; the Atlantic Ocean separates the US cities from the European ones]
Train-Example Explanations
•  2hop(X,Y) :- Train(X,Z), Train(Z,Y).
•  Missing train connections explain why Chicago and Berlin are not connected
•  E.g., if only there were a train line between New York and Berlin: Train(New York, Berlin)!
Why-not Approaches
•  Two categories of data-based explanations for missing answers
•  1) Enumerate all failed rule derivations and why they failed (missing tuples)
•  Provenance games
•  2) One set of missing tuples that fulfills an optimality criterion
•  e.g., minimal side-effect on query result
•  e.g., Artemis, …
Why-not Approaches
•  1) Enumerate all failed rule derivations and why they failed (missing tuples)
•  Exhaustive explanation
•  Potentially very large explanations
•  Train(Chicago,Munich), Train(Munich,Berlin)
•  Train(Chicago,Seattle), Train(Seattle,Berlin)
•  …
•  2) One set of missing tuples that fulfills an optimality criterion
•  Concise explanation that is optimal in a sense
•  Optimality criterion is not always a good fit or effective
•  Consider reach (transitive closure)
•  Adding any train connection between USA and Europe has the same effect on the query result
Uniform Treatment of Why/Why-not
•  Provenance and missing answer approaches have been treated mostly independently
•  Observation:
•  For provenance models that support query languages with “full” negation
•  Why and why-not are both provenance computations!
•  Q(X) :- Train(chicago,X).
•  Why-not Q(New York)?
•  Equivalent to why Q’(New York)?
•  Q’(X) :- adom(X), not Q(X)
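The reduction from why-not to why can be sketched in a few lines of Python. This is only an illustration: the Train instance and the helper names `why_not_q` and `why_q_prime` are assumptions made for the example, not part of the talk.

```python
# Hypothetical Train instance (assumed for illustration only).
train = {
    ("chicago", "seattle"),
    ("seattle", "chicago"),
    ("newyork", "washington_dc"),
    ("berlin", "munich"),
    ("munich", "berlin"),
}

# adom: the active domain, i.e. every constant that appears in the instance.
adom = {c for edge in train for c in edge}

# Q(X) :- Train(chicago, X).
q = {y for (x, y) in train if x == "chicago"}

# Q'(X) :- adom(X), not Q(X).
q_prime = {x for x in adom if x not in q}


def why_not_q(t):
    """The question why-not Q(t) applies when t is in adom but missing from Q."""
    return t in adom and t not in q


def why_q_prime(t):
    """The question why Q'(t) applies when t is an answer of Q'."""
    return t in q_prime
```

For every constant of the active domain the two questions apply to exactly the same tuples, which is the sense in which a why-not question over Q becomes a why question over Q’.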
Unary Train-Example
•  Q(X) :- Train(chicago,X).
•  Why-not Q(berlin)
•  Explanation: Train(chicago,berlin)
•  Consider an available ontology!
•  More general: Train(chicago,GermanCity)

[Figure: ontology over cities. Concepts: ACity, NACity, EuropeanCity, USCity, IllinoisCity, WashingtonCity, NYStateCity, DCCity, GermanCity, FrenchCity. Instances: chicago, seattle, newyork, washington_dc, berlin, munich, paris, lyon, dijon]
Unary Train-Example
•  Q(X) :- Train(chicago,X).
•  Why-not Q(berlin)
•  Explanation: Train(chicago,berlin)
•  Consider an available ontology!
•  Generalized explanation:
•  Train(chicago,GermanCity)
•  Most general explanation:
•  Train(chicago,EuropeanCity)
Our Approach
•  Explanations for why/why-not questions
•  over UCQ queries
•  Successful/failed rule derivations
•  Utilize available ontology
•  Expressed as inclusion dependencies
•  “mapped” to instance
•  E.g., city(name,country)
•  GermanCity(X) :- city(X,germany).
•  Generalized explanations
•  Use concepts to describe subsets of an explanation
•  Most general explanation
•  Pareto-optimal
Related Work - Generalization
•  ten Cate et al. High-Level Why-Not Explanations using Ontologies [PODS ‘15]
•  Also uses ontologies for generalization
•  We summarize provenance instead of query results!
•  Only for why-not, but the extension to why is trivial
•  Other summarization techniques using ontologies
•  Data X-ray
•  Datalog-S (datalog with subsumption)
Rule derivations
•  What causes a tuple to be or not be in the result of a query Q?
•  Tuple in result – exists >= 1 successful rule derivation which justifies its existence
•  Existential check
•  Tuple not in result - all rule derivations that would justify its existence have failed
•  Universal check
•  Rule derivation
•  Replace rule variables with constants from instance
•  Successful: body is fulfilled
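The existential and universal checks can be made concrete with a short Python sketch for the rule 2hop(X,Y) :- Train(X,Z), Train(Z,Y). The instance below and all function names are assumptions made for the example.

```python
# Hypothetical Train instance (city names are illustrative).
train = {
    ("newyork", "washington_dc"), ("washington_dc", "newyork"),
    ("newyork", "chicago"), ("chicago", "newyork"),
    ("berlin", "munich"), ("munich", "berlin"),
    ("paris", "berlin"),
}
adom = sorted({c for edge in train for c in edge})


def derivations(x, y):
    """All derivations of 2hop(x,y) :- Train(x,Z), Train(Z,y).

    Each derivation binds Z to a constant of the instance and records
    whether each of the two goals holds.
    """
    return [(z, (x, z) in train, (z, y) in train) for z in adom]


def successful(x, y):
    """Bindings of Z for which both goals hold."""
    return [z for z, g1, g2 in derivations(x, y) if g1 and g2]


def in_result(x, y):
    # Existential check: at least one derivation is successful.
    return any(g1 and g2 for _, g1, g2 in derivations(x, y))


def not_in_result(x, y):
    # Universal check: every derivation has failed.
    return all(not (g1 and g2) for _, g1, g2 in derivations(x, y))
```

With this instance, 2hop(paris,munich) is justified by the single binding Z = berlin, while every derivation of 2hop(chicago,berlin) fails.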
  
Basic Explanations
•  A basic explanation for question Q(t)
•  Why - successful derivations with Q(t) as head
•  Why-not - failed rule derivations
•  Replace successful goals with placeholder T
•  Different ways to fail
2hop(Chicago,Munich) :- Train(Chicago,New York), Train(New York,Munich).
2hop(Chicago,Munich) :- Train(Chicago,Berlin), Train(Berlin,Munich).
2hop(Chicago,Munich) :- Train(Chicago,Paris), Train(Paris,Munich).
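As a rough illustration of the placeholder idea, the following Python sketch builds a why-not explanation for 2hop(chicago,munich) in which every goal that holds is replaced by T and every goal that fails is shown as the missing tuple. The instance and the names `goal` and `why_not_2hop` are assumptions for this example.

```python
# Hypothetical Train instance, chosen so that each derivation fails differently.
train = {
    ("chicago", "newyork"), ("newyork", "chicago"),
    ("berlin", "munich"), ("munich", "berlin"),
    ("paris", "berlin"),
}
adom = sorted({c for edge in train for c in edge})


def goal(x, y):
    """Render a goal: placeholder T if the tuple exists, else the missing tuple."""
    return "T" if (x, y) in train else f"Train({x},{y})"


def why_not_2hop(x, y):
    """Failed derivations of 2hop(x,y) :- Train(x,Z), Train(Z,y), keyed by Z."""
    explanation = {}
    for z in adom:
        if (x, z) in train and (z, y) in train:
            continue  # successful derivations are not part of a why-not explanation
        explanation[z] = (goal(x, z), goal(z, y))
    return explanation


expl = why_not_2hop("chicago", "munich")
```

Under this assumed instance the three bindings from the slide fail in three different ways: via newyork only the second goal is missing, via berlin only the first, and via paris both.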
  
Explanations Example
•  Why 2hop(Paris,Munich)?
2hop(Paris,Munich) :- Train(Paris,Berlin),
                      Train(Berlin,Munich).
Generalized Explanation
•  Generalized Explanations
•  Rule derivations with concepts
•  Generalizes user question
•  generalize a head variable
2hop(Chicago,Berlin) – 2hop(USCity,EuropeanCity)
•  Summarizes provenance of (non-)answer
•  generalize any rule variable
2hop(New York,Seattle) :- Train(New York,Chicago),
                          Train(Chicago,Seattle).
2hop(New York,Seattle) :- Train(New York,USCity),
                          Train(USCity,Seattle).
Generalized Explanation Def.
•  For user question Q(t) and rule r
•  r(C1,…,Cn)
①  (C1,…,Cn) subsumes user question
②  headvars(C1,…,Cn) only cover existing/missing tuples
③  For every tuple t’ covered by headvars(C1,…,Cn) all rule derivations for t’ covered are explanations for t’
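A minimal Python sketch of these conditions for the unary rule r: Q(X) :- Train(chicago,X) and a why-not question follows. The concept extensions, the reading of ACity as the set of all cities, the one-tuple instance, and the function names are assumptions made for illustration. Because r has a single variable, which is also the head variable, conditions ② and ③ collapse into one check here.

```python
# Hypothetical concept extensions, loosely following the ontology figure.
extension = {
    "GermanCity": {"berlin", "munich"},
    "FrenchCity": {"paris", "lyon", "dijon"},
    "EuropeanCity": {"berlin", "munich", "paris", "lyon", "dijon"},
    "USCity": {"chicago", "seattle", "newyork", "washington_dc"},
}
# Assumption: ACity is the top concept covering every city.
extension["ACity"] = extension["EuropeanCity"] | extension["USCity"]

# Hypothetical instance: the only train leaving chicago goes to seattle.
train = {("chicago", "seattle")}


def r_succeeds(x):
    """r: Q(X) :- Train(chicago, X)."""
    return ("chicago", x) in train


def is_generalized_why_not(concept, t):
    """Check whether r(concept) is a generalized explanation for why-not Q(t)."""
    members = extension[concept]
    subsumes_question = t in members                         # condition 1
    only_missing = all(not r_succeeds(x) for x in members)   # conditions 2 and 3
    return subsumes_question and only_missing
```

In this setting r(GermanCity) and r(EuropeanCity) both qualify for why-not Q(berlin), whereas r(ACity) does not because it also covers seattle, for which r succeeds.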
Recap Generalization Example
•  r: Q(X) :- Train(chicago,X).
•  Why-not Q(berlin)
•  Explanation: r(berlin)
•  Generalized explanation:
•  r(GermanCity)
Most General Explanation
•  Domination Relationship
•  r(C1,…,Cn) dominates r(D1,…,Dn)
•  if for all i: Ci subsumes Di
•  and exists i: Ci strictly subsumes Di
•  Most General Explanation
•  Not dominated by any other explanation
•  Example most general explanation:
•  r(EuropeanCity)
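The domination relationship amounts to a Pareto filter over concept tuples, which the Python sketch below illustrates. It assumes that subsumption can be decided by comparing concept extensions as sets; the talk instead expresses the ontology as inclusion dependencies, so this is a simplification, and all names and extensions are hypothetical.

```python
# Hypothetical concept extensions (assumed; the slides show the ontology only as a figure).
extension = {
    "berlin": {"berlin"},
    "GermanCity": {"berlin", "munich"},
    "EuropeanCity": {"berlin", "munich", "paris", "lyon", "dijon"},
    "IllinoisCity": {"chicago"},
    "USCity": {"chicago", "seattle", "newyork", "washington_dc"},
}


def subsumes(c, d):
    """c subsumes d: every instance of d is an instance of c."""
    return extension[d] <= extension[c]


def strictly_subsumes(c, d):
    """c strictly subsumes d: d's extension is a proper subset of c's."""
    return extension[d] < extension[c]


def dominates(cs, ds):
    """r(cs) dominates r(ds): component-wise subsumption, strict in at least one position."""
    return (all(subsumes(c, d) for c, d in zip(cs, ds))
            and any(strictly_subsumes(c, d) for c, d in zip(cs, ds)))


def most_general(explanations):
    """Keep the explanations that no other explanation dominates."""
    return [e for e in explanations
            if not any(dominates(other, e) for other in explanations)]
```

For the unary chain berlin, GermanCity, EuropeanCity only r(EuropeanCity) survives. With two positions, tuples such as (USCity, GermanCity) and (IllinoisCity, EuropeanCity) are incomparable, so both remain, which is why the result is a set of Pareto-optimal explanations rather than a single one.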
Datalog Implementation
① Rules for checking subsumption and domination of concept tuples
② Rules for successful and failed rule derivations
•  Return variable bindings
③ Rules that model explanations, generalization, and most general explanations
①  Modeling Subsumption
•  Basic concepts and concepts
isBasicConcept(X) :- Train(X,Y).
isConcept(X) :- isBasicConcept(X).
isConcept(EuropeanCity).
•  Subsumption (inclusion dependencies); subsumes(X,Y) holds when X is subsumed by Y
subsumes(GermanCity,EuropeanCity).
subsumes(X,GermanCity) :- city(X,germany).
•  Transitive closure
subsumes(X,Y) :- subsumes(X,Z), subsumes(Z,Y).
•  Non-strict version
subsumesEqual(X,X) :- isConcept(X).
subsumesEqual(X,Y) :- subsumes(X,Y).
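To see what the recursive subsumes rule computes, here is a small Python sketch that evaluates it with a naive fixpoint loop. The `city` facts and the explicit `concepts` set are assumptions for the example (the slide derives isConcept from Train instead), and a real Datalog engine would evaluate the rules differently.

```python
# Hypothetical facts: city(name, country).
city = {("berlin", "germany"), ("munich", "germany")}

# Assumed set of concepts, listed explicitly to keep the sketch short.
concepts = {"berlin", "munich", "GermanCity", "EuropeanCity"}

# subsumes(X, Y): X is subsumed by Y (same argument order as on the slide).
subsumes = {("GermanCity", "EuropeanCity")}
# subsumes(X, GermanCity) :- city(X, germany).
subsumes |= {(x, "GermanCity") for (x, country) in city if country == "germany"}

# Naive fixpoint for: subsumes(X,Y) :- subsumes(X,Z), subsumes(Z,Y).
while True:
    derived = {(x, y)
               for (x, z1) in subsumes
               for (z2, y) in subsumes
               if z1 == z2}
    if derived <= subsumes:
        break  # nothing new was derived, so the closure is complete
    subsumes |= derived

# subsumesEqual(X,X) :- isConcept(X).  subsumesEqual(X,Y) :- subsumes(X,Y).
subsumes_equal = subsumes | {(c, c) for c in concepts}
```

After the loop, berlin and munich are subsumed by EuropeanCity through GermanCity, and subsumes_equal additionally relates every concept to itself.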
②  Capture Rule Derivations
•  Rule r1: 2hop(X,Y) :- Train(X,Z), Train(Z,Y).
•  Success and failure rules
r1_success(X,Y,Z) :- Train(X,Z),
                     Train(Z,Y).
r1_fail(X,Y,Z) :- isBasicConcept(X),
                  isBasicConcept(Y),
                  isBasicConcept(Z),
                  not r1_success(X,Y,Z).

More general:
r1(X,Y,Z,true,false) :- isBasicConcept(Y),
                        Train(X,Z), not Train(Z,Y).
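The "more general" form, where each derivation carries one Boolean flag per goal, can be mimicked in Python as follows. The instance is hypothetical, and the sketch materializes all flag combinations at once rather than writing one rule per combination as the slide does.

```python
from itertools import product

# Hypothetical Train instance.
train = {
    ("chicago", "newyork"), ("newyork", "chicago"),
    ("berlin", "munich"), ("munich", "berlin"),
}

# isBasicConcept(X) :- Train(X,Y).
basic = {x for (x, _) in train}

# r1(X,Y,Z,G1,G2): a derivation of 2hop(X,Y) via Z, where G1 and G2 record
# whether Train(X,Z) and Train(Z,Y) hold.
r1 = {(x, y, z, (x, z) in train, (z, y) in train)
      for x, y, z in product(basic, repeat=3)}

# r1_success keeps the derivations whose goals both hold; r1_fail keeps the rest.
r1_success = {(x, y, z) for (x, y, z, g1, g2) in r1 if g1 and g2}
r1_fail = {(x, y, z) for (x, y, z, g1, g2) in r1 if not (g1 and g2)}
```

Keeping the flags preserves which goal failed, which the plain r1_fail relation discards; for example, (chicago, berlin, newyork, True, False) records that only the second hop is missing.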
③  Model Generalization
•  Explanation for Q(X) :- Train(chicago,X).
expl_r1_success(C1,B1) :-
    subsumesEqual(B1,C1),
    r1_success(B1),
    not has_r1_fail(C1).

User question: Q(B1)
Explanation: Q(C1) :- Train(chicago, C1).

Q(B1) exists and justified by r1: r1_success(B1)
r1 succeeds for all B in C1: not has_r1_fail(C1)
③  Model Generalization
•  Domination
dominated_r1_success(C1,B1) :-
    expl_r1_success(C1,B1),
    expl_r1_success(D1,B1),
    subsumes(C1, D1).
•  Most general explanation
most_gen_r1_success(C1,B1) :-
    expl_r1_success(C1,B1),
    not dominated_r1_success(C1,B1).
•  Why question
why(C1) :- most_gen_r1_success(C1,seattle).
Conclusions
•  Unified framework for generalizing provenance-based explanations for why and why-not questions
•  Uses ontology expressed as inclusion dependencies (Datalog rules) for summarizing explanations
•  Uses Datalog to find most general explanations (Pareto-optimal)
Future Work I
•  Extend ideas to other types of constraints
•  E.g., denial constraints
– German cities have no more than 10M inhabitants
:- city(X,germany,Z), Z > 10000000
•  Query returns countries with very large cities
Q(Y) :- city(X,Y,Z), Z > 15000000
•  Why-not Q(germany)?
– Constraint describes set of (missing) data
– Can be answered without looking at data
•  Semantic query optimization?
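The claim that such a why-not question can be answered without looking at the data can be illustrated with a tiny Python sketch. It only compares the bound from the denial constraint with the threshold in the query; the dictionary layout and the function name are assumptions for the example, not a proposed design.

```python
# Upper bounds implied by denial constraints of the form
#   :- city(X, country, Z), Z > bound
DENIAL_BOUND = {"germany": 10_000_000}

# Threshold from the query Q(Y) :- city(X,Y,Z), Z > 15000000.
QUERY_THRESHOLD = 15_000_000


def why_not_by_constraint(country):
    """True when the constraint alone rules out any answer Q(country).

    If every city of the country has Z <= bound and bound <= threshold,
    then no city can satisfy Z > threshold, whatever the instance contains.
    """
    bound = DENIAL_BOUND.get(country)
    return bound is not None and bound <= QUERY_THRESHOLD
```

For germany the check succeeds, so the constraint itself explains the missing answer; for a country without a matching constraint the data would still have to be inspected.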
Future Work II
•  Alternative definitions of explanation or generalization
– Our gen. explanations are sound, but not complete
– Complete version: concept covers at least the explanation
– Sound and complete version: concepts cover the explanation exactly
•  Queries as ontology concepts
– As introduced in ten Cate et al.
Future Work III
•  Extension for FO queries
– Generalization of provenance game graphs
– Need to generalize interactions of rules
•  Implementation
– Integrate with our provenance game engine
•  Powered by GProM!
•  Negation - not yet
•  Generalization rules - not yet
GProM
[Figure: GProM architecture. Labeled components: User, Parser, Datalog Parser, Query Log, Provenance Game Rewriter, Datalog Translator, SQL Code Generator, Backend Database. Sample inputs: a SQL query (SELECT * FROM ...) and a Datalog program with a why question.]
Q(X) :- R(X,Y).
Why(Q(1)).
move((((((('notREL_' || 'R_LOST') || '(') || 1) || ',') || V0) || ')'),(((((('REL_' || 'R_WON') || '(') || 1) || ',') || V0) || ')')) :- RR_WON_+(1,V0).
move((((((('REL_' || 'R_WON') || '(') || 1) || ',') || V0) || ')'),(((((('EDB_' || 'R_LOST') || '(') || 1) || ',') || V0) || ')')) :- RR_WON_+(1,V0).
move((((('REL_' || 'Q_WON') || '(') || 1) || ')'),(((((('RULE_' || '0_LOST') || '(') || 1) || ',') || Y) || ')')) :- r0_WON_+(1,Y).
move((((((('RULE_' || '0_LOST') || '(') || 1) || ',') || Y) || ')'),(((((('GOAL_' || '0_0_WON') || '(') || 1) || ',') || Y) || ')')) :- r0_WON_+(1,Y).
move((((((('GOAL_' || '0_0_WON') || '(') || 1) || ',') || Y) || ')'),(((((('notREL_' || 'R_LOST') || '(') || 1) || ',') || Y) || ')')) :- r0_WON_+(1,Y).
move((((((('notREL_' || 'R_LOST') || '(') || 1) || ',') || Y) || ')'),(((((('REL_' || 'R_WON') || '(') || 1) || ',') || Y) || ')')) :- r0_WON_+(1,Y).
r0_WON_+(1,Y) :- r0_WON_+_nonlinked(1,Y).
RR_WON_+_nonlinked(1,V0) :- R(1,V0).
RQ_WON_+_nonlinked(1) :- r0_WON_+_nonlinked(1,Y).
RR_WON_+(1,V1) :- +r0_WON_+(1,V1),RR_WON_+_nonlinked(1,V1).
r0_WON_+_nonlinked(1,Y) :- +RR_WON_+_nonlinked(1,Y).
Questions?
•  Boris
– http://cs.iit.edu/~dbgroup/index.html
•  Bertram
– https://www.lis.illinois.edu/people/faculty/ludaesch
Relationship to (Constraint) Provenance Games
[Figure: provenance game graph for TwoHop(Chicago, Berlin) over the cities Berlin, Washington DC, New York, Chicago, and Munich, with rule nodes such as r7(Chicago, NewYork, NewYork, Berlin), goal nodes g1 and g2, and Train / ¬Train tuple nodes; shown next to the corresponding constraint provenance game, whose TwoHop, R1, G1, G2, Train, and ¬Train nodes carry constraints such as x = CHI, y ≠ NYC, z = BER]

Porella : features, morphology, anatomy, reproduction etc.Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.
 
module for grade 9 for distance learning
module for grade 9 for distance learningmodule for grade 9 for distance learning
module for grade 9 for distance learning
 
GBSN - Microbiology (Unit 3)Defense Mechanism of the body
GBSN - Microbiology (Unit 3)Defense Mechanism of the body GBSN - Microbiology (Unit 3)Defense Mechanism of the body
GBSN - Microbiology (Unit 3)Defense Mechanism of the body
 
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIACURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
 
development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virus
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical Science
 
Grade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsGrade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its Functions
 
Site Acceptance Test .
Site Acceptance Test                    .Site Acceptance Test                    .
Site Acceptance Test .
 

2015 TaPP - Towards Constraint-based Explanations for Answers and Non-Answers

  • 1. Towards Constraint-based Explanations for Answers and Non-Answers Boris Glavic Illinois Institute of Technology Sean Riddle Athenahealth Corporation Sven Köhler University of California Davis Bertram Ludäscher University of Illinois Urbana-Champaign
  • 2. Outline ①  Introduction ②  Approach ③  Explanations ④  Generalized Explanations ⑤  Computing Explanations with Datalog ⑥  Conclusions and Future Work
  • 3. Overview
      •  Introduce a unified framework for generalizing explanations for answers and non-answers
      •  Why/why-not question Q(t)
      •  Why is tuple t not in result of query Q?
      •  Explanation
      •  Provenance for the answer/non-answer
      •  Generalization
      •  Use an ontology to summarize and generalize explanations
      •  Computing generalized explanations for UCQs
      •  Use Datalog
  • 4. Train-Example
      •  2hop(X,Y) :- Train(X,Z), Train(Z,Y).
      •  Why can’t I reach Berlin from Chicago?
      •  Why-not 2hop(Chicago,Berlin)
      From            To
      New York        Washington DC
      Washington DC   New York
      New York        Chicago
      Chicago         New York
      …               …
      Berlin          Munich
      Munich          Berlin
      …               …
      [Figure: map of Seattle, Chicago, Washington DC, New York, Paris, Berlin, Munich; label: Atlantic Ocean!]
  • 5. Train-Example Explanations
      •  2hop(X,Y) :- Train(X,Z), Train(Z,Y).
      •  Missing train connections explain why Chicago and Berlin are not connected
      •  E.g., if only there were a train line between New York and Berlin: Train(New York, Berlin)!
      [Figure: map of Seattle, Chicago, Washington DC, New York, Paris, Berlin, Munich; label: Atlantic Ocean!]
  • 6. Why-not Approaches
      •  Two categories of data-based explanations for missing answers
      •  1) Enumerate all failed rule derivations and why they failed (missing tuples)
      •  Provenance games
      •  2) One set of missing tuples that fulfills optimality criterion
      •  e.g., minimal side-effect on query result
      •  e.g., Artemis, …
  • 7. Why-not Approaches
      •  1) Enumerate all failed rule derivations and why they failed (missing tuples)
      •  Exhaustive explanation
      •  Potentially very large explanations
      •  Train(Chicago,Munich), Train(Munich,Berlin)
      •  Train(Chicago,Seattle), Train(Seattle,Berlin)
      •  …
      •  2) One set of missing tuples that fulfills optimality criterion
      •  Concise explanation that is optimal in a sense
      •  Optimality criterion not always good fit/effective
      •  Consider reach (transitive closure)
      •  Adding any train connection between USA and Europe - same effect on query result
  • 8. Uniform Treatment of Why/Why-not
      •  Provenance and missing answer approaches have been treated mostly independently
      •  Observation:
      •  For provenance models that support query languages with “full” negation
      •  Why and why-not are both provenance computations!
      •  Q(X) :- Train(chicago,X).
      •  Why-not Q(New York)?
      •  Equivalent to why Q’(New York)?
      •  Q’(X) :- adom(X), not Q(X)
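The rewriting on slide 8 can be tried out on a toy instance. The following is a minimal Python sketch, not the authors' implementation: it evaluates Q and its complement Q’ over the active domain, so that a why-not question about Q becomes a membership (why) question about Q’. The `TRAIN` instance contains only the rows visible on slide 4, with lower-cased names chosen here for illustration.

```python
# Hypothetical instance: only the Train rows visible on slide 4, lower-cased.
TRAIN = {
    ("newyork", "washington_dc"), ("washington_dc", "newyork"),
    ("newyork", "chicago"), ("chicago", "newyork"),
    ("berlin", "munich"), ("munich", "berlin"),
}

def adom(rel):
    """Active domain: every constant that occurs in the relation."""
    return {c for row in rel for c in row}

def q(rel):
    """Q(X) :- Train(chicago, X)."""
    return {y for (x, y) in rel if x == "chicago"}

def q_prime(rel):
    """Q'(X) :- adom(X), not Q(X)."""
    return adom(rel) - q(rel)

# berlin is a non-answer of Q exactly because it is an answer of Q'.
print("berlin" in q_prime(TRAIN))
```

On this instance berlin is a missing answer of Q, so "why-not Q(berlin)" and "why Q’(berlin)" ask for the same provenance.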
  • 9. Outline ①  Introduction ②  Approach ③  Explanations ④  Generalized Explanations ⑤  Computing Explanations with Datalog ⑥  Conclusions and Future Work
  • 10. Unary Train-Example
      •  Q(X) :- Train(chicago,X).
      •  Why-not Q(berlin)
      •  Explanation: Train(chicago,berlin)
      •  Consider an available ontology!
      •  More general: Train(chicago,GermanCity)
      [Figure: map of Seattle, Chicago, Washington DC, New York, Paris, Berlin, Munich; label: Atlantic Ocean!]
      [Figure: city ontology over the concepts ACity, NACity, EuropeanCity, USCity, IllinoisCity, WashingtonCity, NYStateCity, DCCity, GermanCity, FrenchCity and the cities chicago, seattle, newyork, washington_dc, berlin, munich, paris, lyon, dijon]
  • 11. Unary Train-Example
      •  Q(X) :- Train(chicago,X).
      •  Why-not Q(berlin)
      •  Explanation: Train(chicago,berlin)
      •  Consider an available ontology!
      •  Generalized explanation:
      •  Train(chicago,GermanCity)
      •  Most general explanation:
      •  Train(chicago,EuropeanCity)
      [Figure: city ontology, as on slide 10]
  • 12. Our Approach
      •  Explanations for why/why-not questions
      •  over UCQ queries
      •  Successful/failed rule derivations
      •  Utilize available ontology
      •  Expressed as inclusion dependencies
      •  “mapped” to instance
      •  E.g., city(name,country)
      •  GermanCity(X) :- city(X,germany).
      •  Generalized explanations
      •  Use concepts to describe subsets of an explanation
      •  Most general explanation
      •  Pareto-optimal
      [Figure: city ontology, as on slide 10]
  • 13. Related Work - Generalization
      •  ten Cate et al. High-Level Why-Not Explanations using Ontologies [PODS ‘15]
      •  Also uses ontologies for generalization
      •  We summarize provenance instead of query results!
      •  Only for why-not, but extension to why trivial
      •  Other summarization techniques using ontologies
      •  Data X-ray
      •  Datalog-S (datalog with subsumption)
  • 14. Outline ①  Introduction ②  Approach ③  Explanations ④  Generalized Explanations ⑤  Computing Explanations with Datalog ⑥  Conclusions and Future Work
  • 15. Rule derivations
      •  What causes a tuple to be or not be in the result of a query Q?
      •  Tuple in result – exists >= 1 successful rule derivation which justifies its existence
      •  Existential check
      •  Tuple not in result - all rule derivations that would justify its existence have failed
      •  Universal check
      •  Rule derivation
      •  Replace rule variables with constants from instance
      •  Successful: body is fulfilled
  • 16. Basic Explanations
      •  A basic explanation for question Q(t)
      •  Why - successful derivations with Q(t) as head
      •  Why-not - failed rule derivations
      •  Replace successful goals with placeholder T
      •  Different ways to fail
      2hop(Chicago,Munich) :- Train(Chicago,New York), Train(New York,Munich).
      2hop(Chicago,Munich) :- Train(Chicago,Berlin), Train(Berlin,Munich).
      2hop(Chicago,Munich) :- Train(Chicago,Paris), Train(Paris,Munich).
      [Figure: map of Seattle, Chicago, Washington DC, New York, Paris, Berlin, Munich]
  • 17. Explanations Example
      •  Why 2hop(Paris,Munich)?
      2hop(Paris,Munich) :- Train(Paris,Berlin), Train(Berlin,Munich).
      [Figure: map of Seattle, Chicago, Washington DC, New York, Paris, Berlin, Munich]
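The existential and universal checks from slides 15 to 17 can be illustrated with a small Python sketch. It is written under assumptions: the `TRAIN` instance holds only the rows visible on slide 4 (lower-cased), and the helper names `derivations`, `why`, and `why_not` are hypothetical. Each grounding of the rule binds Z to one constant of the active domain; in a failed derivation, goals that succeed are replaced by the placeholder T.

```python
# Hypothetical instance: only the Train rows visible on slide 4, lower-cased.
TRAIN = {
    ("newyork", "washington_dc"), ("washington_dc", "newyork"),
    ("newyork", "chicago"), ("chicago", "newyork"),
    ("berlin", "munich"), ("munich", "berlin"),
}

def derivations(rel, x, y):
    """All groundings of 2hop(x,y) :- Train(x,Z), Train(Z,y), one per constant Z."""
    domain = sorted({c for row in rel for c in row})
    return [(z, (x, z) in rel, (z, y) in rel) for z in domain]

def why(rel, x, y):
    """Existential check: bindings of Z for which both goals succeed."""
    return [z for z, g1, g2 in derivations(rel, x, y) if g1 and g2]

def why_not(rel, x, y):
    """Universal check: every derivation failed; successful goals become T."""
    ds = derivations(rel, x, y)
    if any(g1 and g2 for _, g1, g2 in ds):
        return []  # the tuple is an answer, so there is no why-not explanation
    return [
        ("T" if g1 else f"Train({x},{z})", "T" if g2 else f"Train({z},{y})")
        for z, g1, g2 in ds
    ]

for failed in why_not(TRAIN, "chicago", "berlin"):
    print(failed)
```

The output lists one failed derivation per constant, which shows why exhaustive explanations grow with the size of the active domain.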
  • 18. Outline ①  Introduction ②  Approach ③  Explanations ④  Generalized Explanations ⑤  Computing Explanations with Datalog ⑥  Conclusions and Future Work
  • 19. Generalized Explanation
      •  Generalized Explanations
      •  Rule derivations with concepts
      •  Generalizes user question
      •  generalize a head variable
      2hop(Chicago,Berlin) – 2hop(USCity,EuropeanCity)
      •  Summarizes provenance of (non-) answer
      •  generalize any rule variable
      2hop(New York,Seattle) :- Train(New York,Chicago), Train(Chicago,Seattle).
      2hop(New York,Seattle) :- Train(New York,USCity), Train(USCity,Seattle).
  • 20. Generalized Explanation Def.
      •  For user question Q(t) and rule r
      •  r(C1,…,Cn)
      ①  (C1,…,Cn) subsumes user question
      ②  headvars(C1,…,Cn) only cover existing/missing tuples
      ③  For every tuple t’ covered by headvars(C1,…,Cn) all rule derivations for t’ covered are explanations for t’
  • 21. Recap Generalization Example
      •  r: Q(X) :- Train(chicago,X).
      •  Why-not Q(berlin)
      •  Explanation: r(berlin)
      •  Generalized explanation:
      •  r(GermanCity)
      [Figure: city ontology, as on slide 10]
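For the unary example, the definition on slide 20 reduces to a simple test: a concept is a generalized why-not explanation for Q(berlin) if it subsumes berlin and none of the cities it covers is an answer of Q. The Python sketch below illustrates this under assumptions: the `PARENT` hierarchy is read off the ontology figure and may not match the authors' ontology exactly, and `Q_ANSWERS` (the cities reachable from chicago) is a made-up query result.

```python
# Assumed hierarchy (child -> parent), read off the ontology figure.
PARENT = {
    "NACity": "ACity", "EuropeanCity": "ACity", "USCity": "NACity",
    "IllinoisCity": "USCity", "WashingtonCity": "USCity",
    "NYStateCity": "USCity", "DCCity": "USCity",
    "GermanCity": "EuropeanCity", "FrenchCity": "EuropeanCity",
    "chicago": "IllinoisCity", "seattle": "WashingtonCity",
    "newyork": "NYStateCity", "washington_dc": "DCCity",
    "berlin": "GermanCity", "munich": "GermanCity",
    "paris": "FrenchCity", "lyon": "FrenchCity", "dijon": "FrenchCity",
}
CITIES = {"chicago", "seattle", "newyork", "washington_dc",
          "berlin", "munich", "paris", "lyon", "dijon"}
Q_ANSWERS = {"newyork", "seattle"}  # assumed result of Q(X) :- Train(chicago,X)

def ancestors_or_self(c):
    """The concept itself followed by everything that subsumes it."""
    chain = [c]
    while c in PARENT:
        c = PARENT[c]
        chain.append(c)
    return chain

def members(concept):
    """Cities covered by a concept (a city covers itself)."""
    return {b for b in CITIES if concept in ancestors_or_self(b)}

def why_not_explanations(city, answers):
    """Concepts that subsume `city` and cover only missing answers."""
    return [c for c in ancestors_or_self(city) if not (members(c) & answers)]

print(why_not_explanations("berlin", Q_ANSWERS))
# ['berlin', 'GermanCity', 'EuropeanCity']
```

ACity is rejected because it also covers newyork, which is an answer; this corresponds to condition ② of the definition.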
  • 22. Most General Explanation
      •  Domination Relationship
      •  r(C1,…,Cn) dominates r(D1,…,Dn)
      •  if for all i: Ci subsumes Di
      •  and exists i: Ci strictly subsumes Di
      •  Most General Explanation
      •  Not dominated by any other explanation
      •  Example most general explanation:
      •  r(EuropeanCity)
      [Figure: city ontology, as on slide 10]
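The domination relationship on slide 22 is a component-wise (Pareto) comparison of concept tuples. As a rough sketch in Python, assuming a tree-shaped fragment of the ontology read off the figure and a hypothetical list of candidate tuples (they are not claimed to be valid explanations for any particular question), the most general candidates are those that no other candidate dominates:

```python
# Assumed fragment of the hierarchy (child -> parent), read off the figure.
PARENT = {
    "NACity": "ACity", "EuropeanCity": "ACity", "USCity": "NACity",
    "IllinoisCity": "USCity", "GermanCity": "EuropeanCity",
    "chicago": "IllinoisCity", "berlin": "GermanCity",
}

def ancestors_or_self(c):
    """The concept itself followed by everything that subsumes it."""
    chain = [c]
    while c in PARENT:
        c = PARENT[c]
        chain.append(c)
    return chain

def subsumes_eq(general, specific):
    """True if `general` equals or subsumes `specific`."""
    return general in ancestors_or_self(specific)

def dominates(cs, ds):
    """cs dominates ds: every Ci subsumes-or-equals Di and some Ci differs."""
    return (
        len(cs) == len(ds)
        and all(subsumes_eq(c, d) for c, d in zip(cs, ds))
        and any(c != d for c, d in zip(cs, ds))
    )

def most_general(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [e for e in candidates if not any(dominates(o, e) for o in candidates)]

# Hypothetical candidate tuples for a binary head such as 2hop(X,Y).
candidates = [
    ("chicago", "berlin"),
    ("USCity", "GermanCity"),
    ("IllinoisCity", "EuropeanCity"),
]
print(most_general(candidates))
# [('USCity', 'GermanCity'), ('IllinoisCity', 'EuropeanCity')]
```

Two incomparable tuples survive, which is why the most general explanation is Pareto-optimal rather than unique.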
  • 23. Outline ①  Introduction ②  Approach ③  Explanations ④  Generalized Explanations ⑤  Computing Explanations with Datalog ⑥  Conclusions and Future Work
  • 24. Datalog Implementation
      ①  Rules for checking subsumption and domination of concept tuples
      ②  Rules for successful and failed rule derivations
      •  Return variable bindings
      ③  Rules that model explanations, generalization, and most general explanations
  • 25. ①  Modeling Subsumption
      •  Basic concepts and concepts
      isBasicConcept(X) :- Train(X,Y).
      isConcept(X) :- isBasicConcept(X).
      isConcept(EuropeanCity).
      •  Subsumption (inclusion dependencies)
      subsumes(GermanCity,EuropeanCity).
      subsumes(X,GermanCity) :- city(X,germany).
      •  Transitive closure
      subsumes(X,Y) :- subsumes(X,Z), subsumes(Z,Y).
      •  Non-strict version
      subsumesEqual(X,X) :- isConcept(X).
      subsumesEqual(X,Y) :- subsumes(X,Y).
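The recursive subsumes rule on slide 25 computes a transitive closure. A naive fixpoint evaluation of that rule can be sketched in Python as follows; the three base facts are illustrative, and, as on the slide, a pair (X, Y) means that X is subsumed by Y.

```python
def transitive_closure(edges):
    """Naive fixpoint of subsumes(X,Y) :- subsumes(X,Z), subsumes(Z,Y)."""
    closure = set(edges)
    while True:
        derived = {
            (x, w)
            for (x, y) in closure
            for (z, w) in closure
            if y == z
        } - closure
        if not derived:
            return closure
        closure |= derived

def subsumes_equal(closure, concepts):
    """Non-strict version: add subsumesEqual(X,X) for every concept X."""
    return closure | {(c, c) for c in concepts}

# Illustrative base facts: (X, Y) reads "X is subsumed by Y".
base = {
    ("berlin", "GermanCity"),
    ("GermanCity", "EuropeanCity"),
    ("EuropeanCity", "ACity"),
}
subsumes = transitive_closure(base)
print(("berlin", "ACity") in subsumes)  # True
```

A Datalog engine would typically evaluate the same rule semi-naively; the naive loop is used here only because it mirrors the rule most directly.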
  • 26. ②  Capture Rule Derivations
      •  Rule r1: 2hop(X,Y) :- Train(X,Z), Train(Z,Y).
      •  Success and failure rules
      r1_success(X,Y,Z) :- Train(X,Z), Train(Z,Y).
      r1_fail(X,Y,Z) :- isBasicConcept(X), isBasicConcept(Y), isBasicConcept(Z), not r1_success(X,Y,Z).
      More general:
      r1(X,Y,Z,true,false) :- isBasicConcept(Y), Train(X,Z), not Train(Z,Y).
  • 27. ③  Model Generalization
      •  Explanation for Q(X) :- Train(chicago,X).
      expl_r1_success(C1,B1) :- subsumesEqual(B1,C1), r1_success(B1), not has_r1_fail(C1).
      User question: Q(B1)
      Explanation: Q(C1) :- Train(chicago, C1).
      Q(B1) exists and justified by r1: r1_success(B1)
      r1 succeeds for all B in C1: not has_r1_fail(C1)
  • 28. ③  Model Generalization
      •  Explanation for Q(X) :- Train(chicago,X).
      expl_r1_success(C1,B1) :- subsumesEqual(B1,C1), r1_success(B1), not has_r1_fail(C1).
      [Figure: city ontology, as on slide 10]
  • 29. ③  Model Generalization
      •  Domination
      dominated_r1_success(C1,B1) :- expl_r1_success(C1,B1), expl_r1_success(D1,B1), subsumes(C1, D1).
      •  Most general explanation
      most_gen_r1_success(C1,B1) :- expl_r1_success(C1,B1), not dominated_r1_success(C1,B1).
      •  Why question
      why(C1) :- most_gen_r1_success(C1,seattle).
      [Figure: city ontology, as on slide 10]
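The rules on slides 27 to 29 can be emulated with set comprehensions to see what `why` returns. This is a sketch under several assumptions rather than the authors' program: the `IS_A` facts are read off the ontology figure, `TRAIN` is a made-up instance in which chicago reaches newyork and seattle, basic concepts are taken to be the ontology's cities, and `has_r1_fail` (not defined on the slides) is assumed to hold for a concept that covers at least one city for which r1 fails.

```python
# Assumed direct subsumption facts; (X, Y) reads "X is subsumed by Y".
IS_A = {
    ("NACity", "ACity"), ("EuropeanCity", "ACity"), ("USCity", "NACity"),
    ("IllinoisCity", "USCity"), ("WashingtonCity", "USCity"),
    ("NYStateCity", "USCity"), ("DCCity", "USCity"),
    ("GermanCity", "EuropeanCity"), ("FrenchCity", "EuropeanCity"),
    ("chicago", "IllinoisCity"), ("seattle", "WashingtonCity"),
    ("newyork", "NYStateCity"), ("washington_dc", "DCCity"),
    ("berlin", "GermanCity"), ("munich", "GermanCity"),
    ("paris", "FrenchCity"), ("lyon", "FrenchCity"), ("dijon", "FrenchCity"),
}
BASIC = {"chicago", "seattle", "newyork", "washington_dc",
         "berlin", "munich", "paris", "lyon", "dijon"}
TRAIN = {("chicago", "newyork"), ("chicago", "seattle")}  # hypothetical rows

def transitive_closure(edges):
    """Naive fixpoint of subsumes(X,Y) :- subsumes(X,Z), subsumes(Z,Y)."""
    closure = set(edges)
    while True:
        new = {(x, w) for (x, y) in closure for (z, w) in closure if y == z} - closure
        if not new:
            return closure
        closure |= new

subsumes = transitive_closure(IS_A)
concepts = {c for pair in IS_A for c in pair}
subsumes_equal = subsumes | {(c, c) for c in concepts}

# r1: Q(X) :- Train(chicago,X), with success/failure per basic concept.
r1_success = {y for (x, y) in TRAIN if x == "chicago"}
r1_fail = BASIC - r1_success
# Assumed helper: concept C1 covers some basic concept for which r1 fails.
has_r1_fail = {c for (b, c) in subsumes_equal if b in r1_fail}

expl = {(c, b) for (b, c) in subsumes_equal
        if b in r1_success and c not in has_r1_fail}
dominated = {(c, b) for (c, b) in expl for (d, b2) in expl
             if b2 == b and (c, d) in subsumes}
most_gen = expl - dominated
why = {c for (c, b) in most_gen if b == "seattle"}
print(why)  # {'WashingtonCity'}
```

USCity is not returned because it also covers chicago and washington_dc, which are not answers of Q on this instance.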
  • 30. Outline ①  Introduction ②  Approach ③  Explanations ④  Generalized Explanations ⑤  Computing Explanations with Datalog ⑥  Conclusions and Future Work
  • 31. Conclusions
      •  Unified framework for generalizing provenance-based explanations for why and why-not questions
      •  Uses ontology expressed as inclusion dependencies (Datalog rules) for summarizing explanations
      •  Uses Datalog to find most general explanations (pareto optimal)
  • 32. Future Work I
      •  Extend ideas to other types of constraints
      •  E.g., denial constraints
          – German cities have less than 10M inhabitants
            :- city(X,germany,Z), Z > 10,000,000
      •  Query returns countries with very large cities
            Q(Y) :- city(X,Y,Z), Z > 15,000,000
      •  Why-not Q(germany)?
          – Constraint describes set of (missing) data
          – Can be answered without looking at data
      •  Semantic query optimization?
  • 33. Future Work II
      •  Alternative definitions of explanation or generalization
          – Our gen. explanations are sound, but not complete
          – Complete version: Concept covers at least explanation
          – Sound and complete version: Concepts cover explanation exactly
      •  Queries as ontology concepts
          – As introduced in ten Cate
      [Figure: three copies of the city ontology, as on slide 10]
  • 34. Future Work III
      •  Extension for FO queries
          – Generalization of provenance game graphs
          – Need to generalize interactions of rules
      •  Implementation
          – Integrate with our provenance game engine
      •  Powered by GProM!
      •  Negation - not yet
      •  Generalization rules - not yet
      [Figure: GProM architecture with the components Parser, Query Log, Datalog Parser, Provenance Game Rewriter, SQL Code Generator, Datalog Translator, User, and Backend Database; example inputs SELECT * FROM ... and Q(X) :- R(X,Y). Why(Q(1)).]
      Rules shown in the figure:
      move((((((('notREL_' || 'R_LOST') || '(') || 1) || ',') || V0) || ')'),(((((('REL_' || 'R_WON') || '(') || 1) || ',') || V0) || ')')) :- RR_WON_+(1,V0).
      move((((((('REL_' || 'R_WON') || '(') || 1) || ',') || V0) || ')'),(((((('EDB_' || 'R_LOST') || '(') || 1) || ',') || V0) || ')')) :- RR_WON_+(1,V0).
      move((((('REL_' || 'Q_WON') || '(') || 1) || ')'),(((((('RULE_' || '0_LOST') || '(') || 1) || ',') || Y) || ')')) :- r0_WON_+(1,Y).
      move((((((('RULE_' || '0_LOST') || '(') || 1) || ',') || Y) || ')'),(((((('GOAL_' || '0_0_WON') || '(') || 1) || ',') || Y) || ')')) :- r0_WON_+(1,Y).
      move((((((('GOAL_' || '0_0_WON') || '(') || 1) || ',') || Y) || ')'),(((((('notREL_' || 'R_LOST') || '(') || 1) || ',') || Y) || ')')) :- r0_WON_+(1,Y).
      move((((((('notREL_' || 'R_LOST') || '(') || 1) || ',') || Y) || ')'),(((((('REL_' || 'R_WON') || '(') || 1) || ',') || Y) || ')')) :- r0_WON_+(1,Y).
      r0_WON_+(1,Y) :- r0_WON_+_nonlinked(1,Y).
      RR_WON_+_nonlinked(1,V0) :- R(1,V0).
      RQ_WON_+_nonlinked(1) :- r0_WON_+_nonlinked(1,Y).
      RR_WON_+(1,V1) :- +r0_WON_+(1,V1),RR_WON_+_nonlinked(1,V1).
      r0_WON_+_nonlinked(1,Y) :- +RR_WON_+_nonlinked(1,Y).
  • 36. Relationship to (Constraint) Provenance Games
      [Figure: provenance game graph for TwoHop(Chicago, Berlin) over the cities Berlin, Washington DC, New York, Chicago, and Munich, next to its constraint-based version, whose nodes carry equality and inequality constraints such as x = CHI, y ≠ NYC; the graph structure is not recoverable from the extracted text]