1. Expressive Query Answering For Semantic Wikis Jie Bao, Rensselaer Polytechnic Institute baojie@cs.rpi.edu, http://www.cs.rpi.edu/~baojie
2. Outline: Background: Semantic MediaWiki; General Design Issues: Semantics and Expressivity; Formalizing SMW with Datalog; Extending SMW Modeling and Query Languages; Implementations and Experimental Results. Jan 18, 2011
4. Semantic MediaWiki (SMW): the most popular semantic wiki system extending MediaWiki. MediaWiki: what you edit vs. what you see
5. Semantic MediaWiki: authoring knowledge with typed links (properties). SMW: what you edit (Modeling Language) vs. what you see
6. Semantic MediaWiki: retrieving knowledge. SMW: what you edit (Querying Language) vs. what you see
7. Why SMW? Low-cost solution for lightweight semantic applications; integrated environment for modeling and querying; simple to set up, easy to use; can work with hundreds of other MW/SMW extensions (Templating, Visualization, Editing, I/O, Workflow… Access Control, Forms, Maps, SPARQL…)
8. Expressivity (SMW 1.5.4). SMW-ML (Modeling Language): category instantiation, e.g., [[Category:C]]; property instantiation, e.g., [[P::v]]; subclass, e.g., [[Category:C]] (on a category page); subproperty, e.g., [[Subproperty of::Property:P]] (on a property page). SMW-QL (Query Language): conjunction, e.g., [[Category:C]][[P::v]]; disjunction, e.g., [[Category:C]] or [[P::v]] (between conditions), [[A||B]] (between pages), [[P::v||w]] (between values); property chain, e.g., [[P.Q::v]]; property wildcard, e.g., [[P::+]]; subquery, e.g., [[P::<q>[[Category:C]]</q>]]; inverse property, e.g., [[-P::v]]; value comparison, e.g., [[P::>3]][[P::<7]][[P::!5]]
9. However, we often need more expressivity. Modeling: domain and range (“has author” is from “document” to “person”); inverse property (“has author” <-> “author of”); transitive property (“part of”); … Query: negation (find cities that are not capitals); counting (find professors who advise more than 5 students)
10. Extending SMW. Goal: offer additional expressivity without losing “wikiness” (i.e., staying collaborative, simple, easy to learn, tolerant of informality, and able to evolve)
12. Design Issue 1: Open or Closed World? OWL/DL-like or DB/rule-like
13. Design Issue 2: Expressivity Supported. A subset of OWL that can be implemented using rules and is syntactically simple for common wiki users. Why not full OWL 2 RL or OWL 2 QL? Too complicated for most wiki users
14. Design Issue 3: Implementation. Reuse existing tools if we can. Low learning curve: hide details from users; incremental changes from SMW. Portability: allow users to choose different backend stores (MySQL, SQL Server, etc.). Fast enough for a typical semantic wiki (fewer than O(10^4) pages [1]). [1] http://semantic-mediawiki.org/wiki/Sites_using_Semantic_MediaWiki
15. Solution. Formalizing SMW modeling and query languages using datalog: descriptive, closed-world semantics; well-understood complexity and many known optimizations. Implementation: leverage highly optimized LP solvers for reasoning, e.g., DLV, Clasp, and Smodels; reuse the SMW UI for rendering query results
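To make the datalog reading concrete, here is a minimal sketch in Python of how subclass and subproperty statements could be evaluated bottom-up and then queried. It is not the deck's actual encoding: the predicate names (instance_of, subclass_of, sub_property_of, prop), the helper functions, and the sample pages are all assumptions made for illustration, and a real deployment would hand such rules to an LP solver rather than loop in Python.

```python
# Hypothetical predicates and sample data; none of these names come from the slides.

def saturate(instance_of, subclass_of, sub_property_of, prop):
    """Naive bottom-up (fixpoint) evaluation of two datalog rules:
         instance_of(X, D) :- instance_of(X, C), subclass_of(C, D).
         prop(X, Q, V)     :- prop(X, P, V), sub_property_of(P, Q).
    """
    instance_of = set(instance_of)
    prop = set(prop)
    changed = True
    while changed:
        changed = False
        new_inst = {(x, d) for (x, c) in instance_of
                    for (c2, d) in subclass_of if c == c2}
        new_prop = {(x, q, v) for (x, p, v) in prop
                    for (p2, q) in sub_property_of if p == p2}
        if not new_inst <= instance_of or not new_prop <= prop:
            instance_of |= new_inst
            prop |= new_prop
            changed = True
    return instance_of, prop


def ask(instance_of, prop, category, prop_name, value):
    """Conjunctive query in the spirit of [[Category:category]][[prop_name::value]]."""
    return sorted(x for (x, c) in instance_of
                  if c == category and (x, prop_name, value) in prop)


instance_of = {("Troy", "City"), ("Albany", "Capital")}
subclass_of = {("Capital", "City")}
sub_property_of = {("capital of", "located in")}
prop = {("Albany", "capital of", "New York"),
        ("Troy", "located in", "New York")}

inst, props = saturate(instance_of, subclass_of, sub_property_of, prop)
result = ask(inst, props, "City", "located in", "New York")
print(result)  # ['Albany', 'Troy']
```

Albany is returned only because both rules fired: it is a City through the subclass rule and is "located in" New York through the subproperty rule.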
16. Expressivity. Modeling Language: a subset of OWL Prime (called RDFS++ by others): rdfs:subClassOf, subPropertyOf, domain, range; owl:TransitiveProperty, SymmetricProperty, FunctionalProperty, InverseFunctionalProperty, inverseOf; owl:sameAs, equivalentClass, equivalentProperty. Query Language: SMW-QL, plus negation as failure and cardinality
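The added constructs can be sketched under closed-world semantics as follows. This is an illustrative Python sketch, not the extended SMW-QL syntax or the rules used in the actual system; the function names and sample data are assumptions. It shows a transitive property evaluated to a fixpoint, negation as failure as a set difference over derivable facts, and cardinality as counting distinct values.

```python
from collections import Counter

# Hypothetical helpers and data, for illustration only.

def transitive_closure(pairs):
    """part_of(X, Z) :- part_of(X, Y), part_of(Y, Z), evaluated to a fixpoint."""
    closure = set(pairs)
    while True:
        derived = {(x, z) for (x, y) in closure
                   for (y2, z) in closure if y == y2}
        if derived <= closure:
            return closure
        closure |= derived


def not_in(candidates, excluded):
    """Negation as failure: keep x when membership in 'excluded' cannot be derived."""
    return sorted(set(candidates) - set(excluded))


def with_more_than(pairs, n):
    """Cardinality: subjects related to more than n distinct objects."""
    counts = Counter(s for (s, _) in set(pairs))
    return sorted(s for s, k in counts.items() if k > n)


part_of = {("Troy", "Rensselaer County"), ("Rensselaer County", "New York")}
closed = transitive_closure(part_of)

cities = {"Troy", "Albany", "Boston"}
capitals = {"Albany", "Boston"}
non_capitals = not_in(cities, capitals)  # cities that are not capitals

advises = {("Prof A", f"student {i}") for i in range(6)} | {("Prof B", "student 0")}
busy = with_more_than(advises, 5)  # professors advising more than 5 students
```

Note the closed-world consequence: a city with no capital statement on the wiki is treated as a non-capital, which an open-world (OWL/DL-like) reading would not conclude.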
27. Implementation. Uses DLV as the reasoner; other LP solvers may be used as well. Two working modes: file-based (reasoning over a static dump, i.e., a snapshot, of the wiki's semantic data) and database-based (reasoning over a shadow database via ODBC; changes to instance data are reflected in real time). Optimization: caching
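For the file-based mode, a snapshot has to be serialized into facts that a solver can load. The sketch below shows one way this could look in Python; the predicate name triple, the quoting convention, and the decision to reject values containing quotes or backslashes are assumptions of this sketch, not the format used by the actual implementation.

```python
# Hypothetical serializer: the 'triple' predicate and quoting rules are assumptions.

def to_fact(predicate, *args):
    """Render one ground fact, e.g. triple("s", "p", "o")."""
    for a in args:
        if '"' in a or "\\" in a:
            # Escaping rules differ between solvers, so this sketch refuses such values.
            raise ValueError(f"unsupported character in {a!r}")
    quoted = ", ".join(f'"{a}"' for a in args)
    return f"{predicate}({quoted})."


def dump_snapshot(triples):
    """Serialize a set of (subject, property, value) triples, one fact per line."""
    return "\n".join(to_fact("triple", s, p, o) for (s, p, o) in sorted(triples))


snapshot = dump_snapshot({
    ("Troy", "located in", "New York"),
    ("Albany", "capital of", "New York"),
})
print(snapshot)
```

Sorting the triples keeps the dump deterministic, which also makes it easier to key a result cache on the snapshot's content.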
29. Scalability: Data Complexity. Test machine: 2 × Xeon 5365 quad-core 3.0 GHz, 1333 MHz / 16 GB / 2 × 1 TB. Dataset: part of DBLP, 10,396 pages, 100,736 triples. Query: {{#askplus: [[Category:Person]] }}. Scaling: near linear
31. Conclusions and Future Work. Formalizing SMW using datalog allows us to: analyze the reasoning complexity of SMW; extend the SMW modeling and query languages to an expressive subset of OWL; implement an SMW query engine based on DLV that is scalable for typical uses. Future Work: incremental reasoning; customized reasoning rules; SPARQL <-> SMW-QL+ translations