SlideShare a Scribd company logo
1 of 7
Download to read offline
Enabling Data Mining Systems to
                         Semantic Web Applications
                                                        Francesca A. Lisi
                                   Dipartimento di Informatica, Universit` degli Studi di Bari,
                                                                         a
                                             Via E. Orabona 4, I-70125 Bari, Italy
                                                     Email: lisi@di.uniba.it


   Abstract— Semantic Web Mining can be considered as Data          of frequent pattern discovery [3]. It implements a framework
Mining (DM) for/from the Semantic Web. Current DM systems           for learning Semantic Web rules [4] which adopts AL-log
could serve the purpose of Semantic Web Mining if they were         [5] as the Knowledge Representation and Reasoning (KR&R)
more compliant with, e.g., the standards of representation for
ontologies and rules in the Semantic Web and/or interoperable       setting and Inductive Logic Programming (ILP) [6] as the
with well-established tools for Ontological Engineering (OE) that   methodological apparatus.
support these standards. In this paper we present a middleware,        Semantic Web Mining [7] is a new application area which
SW ING, that integrates the DM system AL-Q U I N and the OE         aims at combining the two areas of Semantic Web [8] and Web
tool Prot´ g´ -2000 in order to enable AL-Q U I N to Semantic Web
         e e                                                        Mining [9] from a twofold perspective. On one hand, the new
applications. This showcase suggests a methodology for building
Semantic Web Mining systems.                                        semantic structures in the Web can be exploited to improve
                                                                    the results of Web Mining. On the other hand, the results of
                       I. I NTRODUCTION                             Web Mining can be used for building the Semantic Web. Most
   Data Mining (DM) is an application area arisen in the 1990s      work in Semantic Web Mining simply extends previous work
at the intersection of several different research fields, notably    to the new application context. E.g., Maedche and Staab [10]
Statistics, Machine Learning and Databases, as soon as devel-       apply a well-known algorithm for association rule mining to
opments in sensing, communications and storage technologies         discover conceptual relations from text. Indeed, we argue that
made it possible to collect and store large collections of scien-   Semantic Web Mining can be considered as DM for/from the
tific and commercial data [1]. The abilities to analyze such data    Semantic Web. Current DM systems could serve the purpose
sets had not developed as fast. Research in DM can be loosely       of Semantic Web Mining if they were more compliant with,
defined as the study of methods, techniques and algorithms for       e.g., the standards of representation for ontologies and rules in
finding models or patterns that are interesting or valuable in       the Semantic Web and/or interoperable with well-established
large data sets. The space of patterns if often infinite, and the    tools for Ontological Engineering (OE) [11], e.g. Prot´ g´ -2000
                                                                                                                           e e
enumeration of patterns involves some form of search in one         [12], that support these standards.
such space. Practical computational constraints place severe           In this paper we present a middleware, SW ING, that inte-
limits on the subspace that can be explored by a data mining        grates AL-Q U I N and Prot´ g´ -2000 in order to enable Seman-
                                                                                                e e
algorithm. The goal of DM is either prediction or description.      tic Web applications of AL-Q U I N. This solution suggests a
Prediction involves using some variables or fields in the            methodology for building Semantic Web Mining systems, i.e.
database to predict unknown or future values of other variables     the upgrade of existing DM systems with facilities provided
of interest. Description focuses on finding human-interpretable      by interoperable OE tools.
patterns describing data. Among descriptive tasks, data sum-           The paper is structured as follows. Section II and III
marization aims at the extraction of compact patterns that          briefly introduce AL-Q U I N and Prot´ g´ -2000 respectively.
                                                                                                              e e
describe subsets of data. There are two classes of methods          Section IV presents the middleware SW ING. Section V draws
which represent taking horizontal (cases) and vertical (fields)      conclusions and outlines directions of future work.
slices of the data. In the former, one would like to produce
summaries of subsets, e.g. producing sufficient statistics or                       II. T HE DM SYSTEM AL-Q U I N
logical conditions that hold for subsets. In the latter case, one      The system AL-Q U I N [2] (a previous version is described
would like to describe relations between fields. This class of       in [13]) supports a variant of the DM task of frequent pattern
methods is distinguished from the above in that rather than         discovery. In DM a pattern is considered as an intensional
predicting the value of a specified field (e.g., classification)       description (expressed in a given language L) of a subset
or grouping cases together (e.g. clustering) the goal is to find     of r. The support of a pattern is the relative frequency of
relations between fields. One common output of this vertical         the pattern within r and is computed with the evaluation
data summarization is called frequent (association) patterns.       function supp. The task of frequent pattern discovery aims at
These patterns state that certain combinations of values occur      the extraction of all frequent patterns, i.e. all patterns whose
in a given database with a support greater than a user-defined       support exceeds a user-defined threshold of minimum support.
threshold. The system AL-Q U I N [2] supports the DM task           The blueprint of most algorithms for frequent pattern discovery
form of DATALOG [16] that is obtained by using ALC concept
                                                                            assertions essentially as type constraints on variables. The
                                                                            portion K of B which encompasses the whole Σ and the
                                                                            intensional part (IDB) of Π is considered as background
                                                                            knowledge. The extensional part of Π is partitioned into
                                                                            portions Ai each of which refers to an individual ai of Cref .
                                                                            The link between Ai and ai is represented with the DATALOG
                                                                            literal q(ai ). The pair (q(ai ), Ai ) is called observation.
                                                                               The language L = {Ll }1≤l≤maxG of patterns allows for
                                                                            the generation of AL-log unary conjunctive queries, called
                                                                            O-queries. Given a reference concept Cref , an O-query Q to
                                                                            an AL-log knowledge base B is a (linked and connected)1
                                                                            constrained DATALOG clause of the form
 Fig. 1.   Organization of the hybrid knowledge bases used in AL-Q U I N.
                                                                                      Q = q(X) ← α1 , . . . , αm &X : Cref , γ1 , . . . , γn
                                                                            where X is the distinguished variable and the remaining
                                                                            variables occurring in the body of Q are the existential
is the levelwise search [3]. It is based on the following                   variables. Note that αj , 1 ≤ j ≤ m, is a DATALOG literal
assumption: If a generality order        for the language L of              whereas γk , 1 ≤ k ≤ n, is an assertion that constrains a
patterns can be found such that       is monotonic w.r.t. supp,             variable already appearing in any of the αj ’s to vary in the
then the resulting space (L, ) can be searched breadth-first                 range of individuals of a concept defined in B. The O-query
starting from the most general pattern in L and by alternating                                      Qt = q(X) ← &X : Cref
candidate generation and candidate evaluation phases. In
particular, candidate generation consists of a refinement step               is called trivial for L because it only contains the constraint
followed by a pruning step. The former derives candidates                   for the distinguished variable X. Furthermore the language
for the current search level from patterns found frequent in                L is multi-grained, i.e. it contains expressions at multiple
the previous search level. The latter allows some infrequent                levels of description granularity. Indeed it is implicitly defined
patterns to be detected and discarded prior to evaluation thanks            by a declarative bias specification which consists of a finite
to the monotonicity of .                                                    alphabet A of DATALOG predicate names and finite alphabets
   The variant of the frequent pattern discovery problem which              Γl (one for each level l of description granularity) of ALC
is solved by AL-Q U I N takes concept hierarchies into account              concept names. Note that the αi ’s are taken from A and γj ’s
during the discovery process [14], thus yielding descriptions               are taken from Γl . We impose L to be finite by specifying
of a data set r at multiple granularity levels up to a maximum              some bounds, mainly maxD for the maximum depth of search
level maxG. More formally, given                                            and maxG for the maximum level of granularity.
                                                                               The support of an O-query Q ∈ Ll w.r.t an AL-log
  •   a data set r including a taxonomy T where a reference
                                                                            knowledge base B is defined as
      concept Cref and task-relevant concepts are designated,
  •   a multi-grained language {Ll }1≤l≤maxG of patterns                      supp(Q, B) =| answerset(Q, B) | / | answerset(Qt , B) |
  •   a set {minsupl }1≤l≤maxG of minimum support thresh-
                                                                            where Qt is the trivial O-query for L. The computation of
      olds
                                                                            support relies on query answering in AL-log. Indeed, an
the problem of frequent pattern discovery at l levels of                    answer to an O-query Q is a ground substitution θ for the
description granularity, 1 ≤ l ≤ maxG, is to find the set                    distinguished variable of Q. An answer θ to an O-query Q is
F of all the patterns P ∈ Ll frequent in r, namely P ’s with                a correct (resp. computed) answer w.r.t. an AL-log knowledge
support s such that (i) s ≥ minsupl and (ii) all ancestors of               base B if there exists at least one correct (resp. computed)
P w.r.t. T are frequent. Note that a pattern Q is considered                answer to body(Q)θ w.r.t. B. Therefore proving that an O-
to be an ancestor of P if it is a coarser-grained version of P .            query Q covers an observation (q(ai ), Ai ) w.r.t. K equals to
   In AL-Q U I N (AL-log Q Uery I Nduction) the data set r is               proving that θi = {X/ai } is a correct answer to Q w.r.t.
represented as an AL-log knowledge base B and structured                    Bi = K ∪ Ai .
as illustrated in Figure 1. The structural subsystem Σ is based                The system AL-Q U I N implements the aforementioned lev-
on ALC [15] and allows for the specification of knowledge in                 elwise search method for frequent pattern discovery. In par-
terms of classes (concepts), binary relations between classes               ticular, candidate patterns of a certain level k (called k-
(roles), and instances (individuals). In particular, the TBox T             patterns) are obtained by refinement of the frequent patterns
contains is-a relations between concepts (axioms) whereas the               discovered at level k − 1. In AL-Q U I N patterns are ordered
ABox M contains instance-of relations between individuals                   according to B-subsumption (which has been proved to fulfill
(resp. couples of individuals) and concepts (resp. roles) (as-
sertions). The relational subsystem Π is based on an extended                 1 For   the definition of linkedness and connectedness see [6].
the abovementioned condition of monotonicity [13]). The
search starts from the most general pattern in L and iterates
through the generation-evaluation cycle for a number of times
that is bounded with respect to both the granularity level l
(maxG) and the depth level k (maxD).
   Since AL-Q U I N is implemented with Prolog, the internal
representation language in AL-Q U I N is a kind of DATALOGOI
[17], i.e. the subset of DATALOG= equipped with an equational
theory that consists of the axioms of Clark’s Equality Theory
augmented with one rewriting rule that adds inequality atoms
s = t to any P ∈ L for each pair (s, t) of distinct terms
occurring in P . Note that concept assertions are rendered as
membership atoms, e.g. a : C becomes c C(a).
                                       ´ ´
               III. T HE OE TOOL P ROT E G E -2000
                                                                                      Fig. 2.   Architecture of the OWL Plugin for Prot´ g´ -2000.
                                                                                                                                       e e
   Prot´ g´ -20002 [18] is the latest version of the Prot´ g´ line
       e e                                                e e
of tools, created by the Stanford Medical Informatics (SMI)
group at Stanford University, USA. It has a community of
thousands of users. Although the development of Prot´ g´ has
                                                          e e
historically been mainly driven by biomedical applications, the                Prot´ g´ -2000’s form editor, where users can select alternative
                                                                                    e e
system is domain-independent and has been successfully used                    user interface widgets for their project. The user interface
for many other application areas as well. Prot´ g´ -2000 is a
                                                    e e                        consists of panels (tabs) for editing classes, properties, forms
Java-based standalone application to be installed and run in a                 and instances.
local computer. The core of this application is the ontology                      Prot´ g´ -2000 has an extensible architecture, i.e. an architec-
                                                                                      e e
editor. Like most other modeling tools, the architecture of                    ture that allows special-purpose extensions (aka plug-ins) to be
Prot´ g´ -2000 is cleanly separated into a model part and a
     e e                                                                       easily integrated. These extensions usually perform functions
view part. Prot´ g´ -2000’s model is the internal representation
                 e e                                                           not provided by the Prot´ g´ -2000 standard distribution (other
                                                                                                         e e
mechanism for ontologies and knowledge bases. Prot´ g´ -      e e              types of visualization, new import and export formats, etc.),
2000’s view components provide a Graphical User Interface                      implement applications that use Prot´ g´ -2000 ontologies, or
                                                                                                                       e e
(GUI) to display and manipulate the underlying model.                          allow configuring the ontology editor. Most of these plug-
   Prot´ g´ -2000’s model is based on a simple yet flexible
       e e                                                                     ins are available in the Prot´ g´ -2000 Plug-in Library, where
                                                                                                             e e
metamodel [12], which is comparable to object-oriented and                     contributions from many different research groups can be
frame-based systems. It basically can represent ontologies                     found. One of the most popular in this library is the OWL
consisting of classes, properties (slots), property characteristics            Plugin [19].
(facets and constraints), and instances. Prot´ g´ -2000 provides
                                               e e                                As illustrated in Figure 2, the OWL Plugin extends the
an open Java API to query and manipulate models. An                            Prot´ g´ -2000 model and its API with classes to represent the
                                                                                    e e
important strength of Prot´ g´ -2000 is that the Prot´ g´ -2000
                             e e                         e e                   OWL3 specification. In particular it supports RDF(S), OWL
metamodel itself is a Prot´ g´ -2000 ontology, with classes that
                            e e                                                Lite, OWL DL (except for anonymous global class axioms,
represent classes, properties, and so on. For example, the de-                 which need to be given a name by the user) and significant
fault class in the Protege base system is called :STANDARD-                    parts of OWL Full (including metaclasses). The OWL API
CLASS, and has properties such as :NAME and :DIRECT-                           basically encapsulates the internal mapping and thus shields
SUPERCLASSES. This structure of the metamodel enables                          the user from error-prone low-level access. Furthermore the
easy extension and adaption to other representations.                          OWL Plugin provides a comprehensive mapping between its
   Using the views of Prot´ g´ -2000’s GUI, ontology designers
                            e e                                                extended API and the standard OWL parsing library Jena4 . The
basically create classes, assign properties to the classes, and                presence of a secondary representation of an OWL ontology
then restrict the properties facets at certain classes. Using                  in terms of Jena objects means that the user is able to invoke
the resulting ontologies, Prot´ g´ -2000 is able to automatically
                               e e                                             arbitrary Jena-based services such as interfaces to classifiers,
generate user interfaces that support the creation of individuals              query languages, or visualization tools permanently. Based on
(instances). For each class in the ontology, the system creates                the above mentioned metamodel and API extensions, the OWL
one form with editing components (widgets) for each property                   Plugin provides several custom-tailored GUI components for
of the class. For example, for properties that can take single                 OWL. Also it can directly access DL reasoners such as
string values, the system would by default provide a text field                 RACER [20]. Finally it can be further extended, e.g. to support
widget. The generated forms can be further customized with                     OWL-based languages like SWRL5 .

  2 The distribution of interest to this work is 3.0 (February 2005), freely     3 http://www.w3.org/2004/OWL/
                                                                                 4 http://jena.sourceforge.net
available at http://protege.stanford.edu/ under the Mozilla open-
source license.                                                                  5 http://www.w3.org/Submission/SWRL/
the parameter settings. One of these findings is the pattern
                                                                   Q2 which turns out to be frequent because it has support
                                                                   supp(Q2 , BCIA ) = 13% (≥ minsup2 ). This has to be read
                                                                   as ’13 % of Middle East countries speak an Indoeuropean
                                                                   language’.                                                ♦




              Fig. 3.   Architecture and I/O of SW ING.




       IV. E NABLING AL-Q U I N TO S EMANTIC W EB
                                     ´ ´
           APPLICATIONS WITH P ROT E G E -2000

   To enable AL-Q U I N to Semantic Web applications we have
developed a software component, SW ING, that assists users
of AL-Q U I N in the design of Semantic Web Mining sessions.
As illustrated in Figure 3, SW ING is a middleware because it
interoperates via API with the OWL Plugin for Prot´ g´ -2000
                                                    e e
to benefit from its facilities for browsing and reasoning on
OWL ontologies.                                                                 Fig. 4.   SW ING: step of concept selection.

Example IV.1. The screenshots reported in Figure 4, 5, 6
and 7 refer to a Semantic Web Mining session with SW ING
for the task of finding frequent patterns in the on-line CIA
World Fact Book6 (data set) that describe Middle East coun-
tries (reference concept) w.r.t. the religions believed and the
languages spoken (task-relevant concepts) at three levels of
granularity (maxG = 3). To this aim we define LCIA as the set
of O-queries with Cref = MiddleEastCountry that can be
generated from the alphabet A= {believes/2, speaks/2}
of DATALOG binary predicate names, and the alphabets
Γ1 = {Language, Religion}
Γ2 = {IndoEuropeanLanguage, . . . , MonotheisticReligion, . . .}
Γ3 = {IndoIranianLanguage, . . . , MuslimReligion, . . .}
of ALC concept names for 1 ≤ l ≤ 3, up to maxD = 5.
Examples of O-queries in LCIA are:
                                                                                Fig. 5.   SW ING: step of relation selection.
Qt = q(X) ← & X:MiddleEastCountry
Q1 = q(X) ← speaks(X,Y) &
       X:MiddleEastCountry, Y:Language                                A wizard provides guidance for the selection of the (hybrid)
Q2 = q(X) ← speaks(X,Y) &                                          data set to be mined, the selection of the reference concept
       X:MiddleEastCountry, Y:IndoEuropeanLanguage                 and the task-relevant concepts (see Figure 4), the selection
Q3 = q(X) ← believes(X,Y)&                                         of the relations - among the ones appearing in the relational
       X:MiddleEastCountry, Y:MuslimReligion                       component of the data set chosen or derived from them -
where Qt is the trivial O-query for LCIA , Q1 ∈ L1 , Q2 ∈
                                                   CIA
                                                                   with which the task-relevant concepts can be linked to the
L2 , and Q3 ∈ L3 . Note that Q1 is an ancestor of Q2 .
 CIA             CIA
                                                                   reference concept in the patterns to be discovered (see Figure
  Minimum support thresholds are set to the following values:      5 and 6), the setting of minimum support thresholds for each
minsup1 = 20%, minsup2 = 13%, and minsup3 = 10%.                   level of description granularity and of several other parameters
After maxD = 5 search stages, AL-Q U I N returns 53 fre-           required by AL-Q U I N. These user preferences are collected
quent patterns out of 99 candidate patterns compliant with         in a file (see ouput file *.lb in Figure 3) that is shown in
                                                                   preview to the user at the end of the assisted procedure for
  6 http://www.odci.gov/cia/publications/factbook/                 confirmation (see Figure 7).
IndoEuropeanLanguage Language.
                                                                       IndoIranianLanguage IndoEuropeanLanguage.
                                                                       MonotheisticReligion Religion.
                                                                       MuslimReligion MonotheisticReligion.

                                                                       and membership assertions such as
                                                                       ’IR’:AsianCountry.
                                                                       ’Arab’:MiddleEastEthnicGroup.
                                                                       <’IR’,’Arab’>:Hosts.
                                                                       ’Persian’:IndoIranianLanguage.
                                                                       ’ShiaMuslim’:MuslimReligion.
                                                                       ’SunniMuslim’:MuslimReligion.

                                                                       that define taxonomies for the concepts Country,
                                                                       EthnicGroup, Language and Religion. Note that Middle
                                                                       East countries (concept MiddleEastCountry) have been
                 Fig. 6.   SW ING: editing of derived relations.       defined as Asian countries that host at least one Middle
                                                                       Eastern ethnic group. In particular, Iran (’IR’) is classified
                                                                       as Middle East country.
                                                                         Since    Cref =MiddleEastCountry,         the    DATALOG
                                                                       database is partitioned according to the individuals
                                                                       of MiddleEastCountry. In particular, the observation
                                                                       (q(’IR’), AIR ) contains DATALOG facts such as
                                                                       language(’IR’,’Persian’,58).
                                                                       religion(’IR’,’ShiaMuslim’,89).
                                                                       religion(’IR’,’SunniMuslim’,10).

                                                                       concerning the individual ’IR’.                              ♦
                                                                          The output file *.db contains the input DATALOG database
                                                                       eventually enriched with an intensional part. The editing of
                                                                       derived relations (see Figure 6) is accessible from the step of
                                                                       relation selection (see Figure 5).
       Fig. 7.    SW ING: preview of the language bias specification.
                                                                       Example IV.3. The output DATALOG database cia exp1.db
                                                                       for Example IV.1 enriches the input DATALOG database
                                                                       cia exp1.edb with the following two clauses:
A. A closer look to the I/O                                            speaks(Code, Lang)← language(Code,Lang,Perc),
   The input to SW ING is a hybrid knowledge base that                                  c Country(Code), c Language(Lang).
consists of an ontological data source - expressed as a OWL            believes(Code, Rel)←religion(Code,Rel,Perc),
file - and a relational data source - also available on the Web                          c Country(Code), c Religion(Rel).
- integrated with each other.
                                                                       that define views on the relations language and religion
Example IV.2. The knowledge base BCIA for the Semantic                 respectively. Note that they correspond to the constrained
Web Mining session of Example IV.1 integrates an OWL                   DATALOG clauses
ontology (file cia exp1.owl) with a DATALOG database (file
                                                                       speaks(Code, Lang)← language(Code,Lang,Perc) &
cia exp1.edb) containing facts7 extracted from the on-line
                                                                                           Code:Country, Lang:Language.
1996 CIA World Fact Book. The OWL ontology8 contains
                                                                       believes(Code, Rel)←religion(Code,Rel,Perc) &
axioms such as
                                                                                           Code:Country, Rel:Religion.
AsianCountry Country.
                                                                       and represent the intensional part of ΠCIA .                 ♦
MiddleEastEthnicGroup EthnicGroup.
MiddleEastCountry ≡                                                       The output file *.lb contains the declarative bias specifica-
    AsianCountry ∃Hosts.MiddleEastEthnicGroup.                         tion for the language of patterns and other directives.

  7 http://www.dbis.informatik.uni-goettingen.de/Mondial/              Example IV.4. With reference to Example IV.1, the content
mondial-rel-facts.flp                                                  of cia exp1.lb (see Figure 7) defines - among the other
  8 In the following we shall use the corresponding DL notation        things - the language LCIA of patterns. In particular the first
5 directives define the reference concept, the task-relevant          ...
concepts and and the relations between concepts.         ♦           hierarchy(c Language,3,c IndoEuropeanLanguage,
                                                                        [c IndoIranianLanguage, c SlavicLanguage]).
  The output files *.abox n and *.tbox are the side effect of
                                                                     hierarchy(c Language,3,c UralAltaicLanguage,
the step of concept selection as illustrated in the next section.
                                                                        [c TurkicLanguage]).
Note that these files together with the intensional part of the
                                                                     hierarchy(c Religion,3,c MonotheisticReligion,
*.db file form the background knowledge K for AL-Q U I N.
                                                                        [c ChristianReligion, c JewishReligion, c MuslimReligion]).
B. A look inside the step of concept selection                       for the layer T 3 .                                             ♦
   The step of concept selection deserves further remarks                                                                       OI
because it actually exploits the services offered by Prot´ g´ -
                                                           e e         Note that the translation from OWL to DATALOG          is
2000. Indeed it also triggers some supplementary computation         possible because we assume that all the concepts are named.
aimed at making a OWL background knowledge Σ usable                  This means that an equivalence axiom is required for each
by AL-Q U I N. To achieve this goal, it supplies the following       complex concept in the knowledge base. Equivalence axioms
functionalities:                                                     help keeping concept names (used within constrained DATA -
                                                                     LOG clauses) independent from concept definitions.
   • levelwise retrieval w.r.t. Σ
   • translation of both (asserted and derived) concept asser-
                                                                                            V. C ONCLUSION
     tions and subsumption axioms of Σ to DATALOGOI facts
The latter relies on the former, meaning that the results of the        The middleware SW ING supplies several facilities to AL-
levelwise retrieval are exported to DATALOGOI (see output            Q U I N, primarily facilities for compiling OWL down to
files *.abox n and *.tbox in Figure 3). The retrieval problem         DATALOG. Note that DATALOG is the usual KR&R setting
is known in DLs literature as the problem of retrieving all the      for ILP. In this respect, the pre-processing method proposed by
individuals of a concept C [21]. Here, the retrieval is called       Kietz [22] to enable ILP systems to work within the framework
levelwise because it follows the layering of T : individuals of      of the hybrid KR&R system CARIN [23] is related to ours
concepts belonging to the l-th layer T l of T are retrieved all      but it lacks an application. Analogously, the method proposed
together.                                                            in [24] for translating OWL to disjunctive DATALOG is far
                                                                     too general with respect to the specific needs of our applica-
Example IV.5. The DATALOGOI rewriting of the concept                 tion. Rather, the proposal of interfacing existing reasoners to
assertions derived for T 2 produces facts like:                      combine ontologies and rules [25] is more similar to ours in
c AfroAsiaticLanguage(’Arabic’).                                     the spirit. Furthermore, SW ING follows engineering principles
...                                                                  because it promotes the reuse of existing systems (AL-Q U I N
c IndoEuropeanLanguage(’Persian’).                                   and Prot´ g´ -2000) and the adherence to standards (either
                                                                                e e
...                                                                  normative - see OWL for the Semantic Web - or de facto - see
c UralAltaicLanguage(’Kazak’).                                       DATALOG for ILP). Finally the resulting artifact overcomes the
...                                                                  capabilities of the two systems when considered stand-alone.
c MonotheisticReligion(’ShiaMuslim’).                                In particular, AL-Q U I N was originally conceived to deal with
c MonotheisticReligion(’SunniMuslim’).                               ALC ontologies. Since OWL is equivalent to SHIQ [26] and
...                                                                  ALC is a fragment of SHIQ [21], the middleware SW ING
c PolytheisticReligion(’Druze’).                                     allows AL-Q U I N to deal with more expressive ontologies and
...                                                                  to face Semantic Web applications.
                                                                        For the future we plan to extend SW ING with facilities for
that are stored in the file cia exp1.abox 2.                          extracting information from semantic portals and for present-
   The file cia exp1.tbox contains a DATALOGOI rewriting              ing patterns generated by AL-Q U I N.
of the taxonomic relations of T such as:
hierarchy(c Language,1,null,[c Language]).                                                    R EFERENCES
hierarchy(c Religion,1,null,[c Religion]).                  [1] U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, Eds.,
                1                                               Advances in Knowledge Discovery and Data Mining. AAAI Press/The
for the layer T and                                             MIT Press, 1996.
                                                            [2] F. Lisi and F. Esposito, “ILP Meets Knowledge Engineering: A Case
hierarchy(c Language,2,c Language,                              Study.” in Inductive Logic Programming, ser. Lecture Notes in Artificial
   [c AfroAsiaticLanguage, c IndoEuropeanLanguage, . . .]).     Intelligence, S. Kramer and B. Pfahringer, Eds. Springer, 2005, vol.
hierarchy(c Religion,2,c Religion,                              3625, pp. 209–226.
                                                            [3] H. Mannila and H. Toivonen, “Levelwise search and borders of theories
   [c MonotheisticReligion, c PolytheisticReligion]).           in knowledge discovery,” Data Mining and Knowledge Discovery, vol. 1,
               2                                                no. 3, pp. 241–258, 1997.
for the layer T and                                         [4] F. Lisi and F. Esposito, “An ILP Perspective on the Semantic Web,”
                                                                in Semantic Web Applications and Perspectives 2005, P. Bouquet and
hierarchy(c Language,3,c AfroAsiaticLanguage,                   G. Tumarello, Eds. CEUR Workshop Proceedings, 2005, http://ceur-
   [c AfroAsiaticLanguage]).                                    ws.org/Vol-166/.
[5] F. Donini, M. Lenzerini, D. Nardi, and A. Schaerf, “AL-log: Integrating
     Datalog and Description Logics,” Journal of Intelligent Information
     Systems, vol. 10, no. 3, pp. 227–252, 1998.
 [6] S. Nienhuys-Cheng and R. de Wolf, Foundations of Inductive Logic
     Programming, ser. Lecture Notes in Artificial Intelligence. Springer,
     1997, vol. 1228.
 [7] G. Stumme, A. Hotho, and B. Berendt, “Semantic Web Mining: State of
     the art and future directions,” Journal of Web Semantics, vol. 4, no. 2,
     pp. 124–143, 2006.
 [8] T. Berners-Lee, J. Hendler, and O. Lassila, “The Semantic Web,”
     Scientific American, vol. May, 2001.
 [9] R. Kosala and H. Blockeel, “Web Mining Research: A Survey,”
     SIGKDD: SIGKDD Explorations: Newsletter of the Special Interest
     Group (SIG) on Knowledge Discovery & Data Mining, ACM, vol. 2,
     2000. [Online]. Available: citeseer.ist.psu.edu/kosala00web.html
[10] A. Maedche and S. Staab, “Discovering Conceptual Relations from
     Text,” in Proceedings of the 14th European Conference on Artificial
     Intelligence, W. Horn, Ed. IOS Press, 2000, pp. 321–325.
[11] A. G´ mez-P´ rez, M. Fern´ ndez-L´ pez, and O. Corcho, Ontological
           o       e              a        o
     Engineering. Springer, 2004.
[12] N. F. Noy, R. Fergerson, and M. Musen, “The Knowledge Model of
     Prot´ g´ -2000: Combining Interoperability and Flexibility.” in Knowledge
          e e
     Acquisition, Modeling and Management, ser. Lecture Notes in Computer
     Science, R. Dieng and O. Corby, Eds. Springer, 2000, vol. 1937, pp.
     17–32.
[13] F. Lisi and D. Malerba, “Inducing Multi-Level Association Rules from
     Multiple Relations,” Machine Learning, vol. 55, pp. 175–210, 2004.
[14] J. Han and Y. Fu, “Mining multiple-level association rules in large
     databases,” IEEE Transactions on Knowledge and Data Engineering,
     vol. 11, no. 5, 1999.
[15] M. Schmidt-Schauss and G. Smolka, “Attributive concept descriptions
     with complements,” Artificial Intelligence, vol. 48, no. 1, pp. 1–26, 1991.
[16] S. Ceri, G. Gottlob, and L. Tanca, Logic Programming and Databases.
     Springer, 1990.
[17] G. Semeraro, F. Esposito, D. Malerba, N. Fanizzi, and S. Ferilli, “A
     logic framework for the incremental inductive synthesis of Datalog the-
     ories,” in Proceedings of 7th International Workshop on Logic Program
     Synthesis and Transformation, ser. Lecture Notes in Computer Science,
     N. Fuchs, Ed. Springer, 1998, vol. 1463, pp. 300–321.
[18] J. Gennari, M. Musen, R. Fergerson, W. Grosso, M. Crub´ zy, H. Eriks-
                                                                  e
     son, N. F. Noy, and S. W. Tu, “The evolution of Prot´ g´ : An environment
                                                          e e
     for knowledge-based systems development.” International Journal of
     Human-Computer Studies, vol. 58, no. 1, pp. 89–123, 2003.
[19] H. Knublauch, M. Musen, and A. Rector, “Editing Description Logic
     Ontologies with the Prot´ g´ OWL Plugin.” in Proceedings of the 2004
                                e e
     International Workshop on Description Logics (DL2004), ser. CEUR
     Workshop Proceedings, V. Haarslev and R. M¨ ller, Eds., vol. 104, 2004.
                                                     o
[20] V. Haarslev and R. M¨ ller, “Description of the RACER System and
                              o
     its Applications.” in Working Notes of the 2001 International Descrip-
     tion Logics Workshop (DL-2001), ser. CEUR Workshop Proceedings,
     C. Goble, D. McGuinness, R. M¨ ller, and P. Patel-Schneider, Eds.,
                                          o
     vol. 49, 2001.
[21] F. Baader, D. Calvanese, D. McGuinness, D. Nardi, and P. Patel-
     Schneider, Eds., The Description Logic Handbook: Theory, Implemen-
     tation and Applications. Cambridge University Press, 2003.
[22] J. Kietz, “Learnability of description logic programs,” in Inductive Logic
     Programming, ser. Lecture Notes in Artificial Intelligence, S. Matwin
     and C. Sammut, Eds., vol. 2583. Springer, 2003, pp. 117–132.
[23] A. Levy and M.-C. Rousset, “Combining Horn rules and description
     logics in CARIN,” Artificial Intelligence, vol. 104, pp. 165–209, 1998.
[24] U. Hustadt, B. Motik, and U. Sattler, “Reducing SHIQ-description
     logic to disjunctive datalog programs.” in Principles of Knowledge
     Representation and Reasoning: Proceedings of the Ninth International
     Conference (KR2004), D. Dubois, C. Welty, and M.-A. Williams, Eds.
     AAAI Press, 2004, pp. 152–162.
[25] U. Assmann, J. Henriksson, and J.Maluszynski, “Combining safe rules
     and ontologies by interfacing of reasoners.” in Principles and Practice
     of Semantic Web Reasoning, ser. Lecture Notes in Computer Science,
     J. Alferes, J. Bailey, W. May, and U. Schwertel, Eds. Springer, 2006,
     vol. 4187, pp. 33–47.
[26] I. Horrocks, P. Patel-Schneider, and F. van Harmelen, “From SHIQ
     and RDF to OWL: The making of a web ontology language,” Journal
     of Web Semantics, vol. 1, no. 1, pp. 7–26, 2003.

More Related Content

What's hot

PERFORMANCE OF ITERATIVE LDPC-BASED SPACE-TIME TRELLIS CODED MIMO-OFDM SYSTEM...
PERFORMANCE OF ITERATIVE LDPC-BASED SPACE-TIME TRELLIS CODED MIMO-OFDM SYSTEM...PERFORMANCE OF ITERATIVE LDPC-BASED SPACE-TIME TRELLIS CODED MIMO-OFDM SYSTEM...
PERFORMANCE OF ITERATIVE LDPC-BASED SPACE-TIME TRELLIS CODED MIMO-OFDM SYSTEM...ijcseit
 
Implementação do Hash Coalha/Coalesced
Implementação do Hash Coalha/CoalescedImplementação do Hash Coalha/Coalesced
Implementação do Hash Coalha/CoalescedCriatividadeZeroDocs
 
Advanced Data Structures 2006
Advanced Data Structures 2006Advanced Data Structures 2006
Advanced Data Structures 2006Sanjay Goel
 
Pauls klein 2011-lm_paper(3)
Pauls klein 2011-lm_paper(3)Pauls klein 2011-lm_paper(3)
Pauls klein 2011-lm_paper(3)Red Over
 
Analysis of LDPC Codes under Wi-Max IEEE 802.16e
Analysis of LDPC Codes under Wi-Max IEEE 802.16eAnalysis of LDPC Codes under Wi-Max IEEE 802.16e
Analysis of LDPC Codes under Wi-Max IEEE 802.16eIJERD Editor
 
Arabic named entity recognition using deep learning approach
Arabic named entity recognition using deep learning approachArabic named entity recognition using deep learning approach
Arabic named entity recognition using deep learning approachIJECEIAES
 
Latent dirichletallocation presentation
Latent dirichletallocation presentationLatent dirichletallocation presentation
Latent dirichletallocation presentationSoojung Hong
 
High Speed Decoding of Non-Binary Irregular LDPC Codes Using GPUs (Paper)
High Speed Decoding of Non-Binary Irregular LDPC Codes Using GPUs (Paper)High Speed Decoding of Non-Binary Irregular LDPC Codes Using GPUs (Paper)
High Speed Decoding of Non-Binary Irregular LDPC Codes Using GPUs (Paper)Enrique Monzo Solves
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
Compact binary-tree-representation-of-logic-function-with-enhanced-throughput-
Compact binary-tree-representation-of-logic-function-with-enhanced-throughput-Compact binary-tree-representation-of-logic-function-with-enhanced-throughput-
Compact binary-tree-representation-of-logic-function-with-enhanced-throughput-Cemal Ardil
 
Integrative Parallel Programming in HPC
Integrative Parallel Programming in HPCIntegrative Parallel Programming in HPC
Integrative Parallel Programming in HPCVictor Eijkhout
 
Software tookits for machine learning and graphical models
Software tookits for machine learning and graphical modelsSoftware tookits for machine learning and graphical models
Software tookits for machine learning and graphical modelsbutest
 
Performance analysis and implementation for nonbinary quasi cyclic ldpc decod...
Performance analysis and implementation for nonbinary quasi cyclic ldpc decod...Performance analysis and implementation for nonbinary quasi cyclic ldpc decod...
Performance analysis and implementation for nonbinary quasi cyclic ldpc decod...ijwmn
 
Reduced Energy Min-Max Decoding Algorithm for Ldpc Code with Adder Correction...
Reduced Energy Min-Max Decoding Algorithm for Ldpc Code with Adder Correction...Reduced Energy Min-Max Decoding Algorithm for Ldpc Code with Adder Correction...
Reduced Energy Min-Max Decoding Algorithm for Ldpc Code with Adder Correction...ijceronline
 
An Approach for Constructing a Domain Definition Metamodel with ATL
An Approach for Constructing a Domain Definition Metamodel with ATLAn Approach for Constructing a Domain Definition Metamodel with ATL
An Approach for Constructing a Domain Definition Metamodel with ATLVanea Chiprianov
 
Aes cryptography algorithm based on intelligent blum blum-shub prn gs publica...
Aes cryptography algorithm based on intelligent blum blum-shub prn gs publica...Aes cryptography algorithm based on intelligent blum blum-shub prn gs publica...
Aes cryptography algorithm based on intelligent blum blum-shub prn gs publica...zaidinvisible
 

What's hot (20)

Ak04605259264
Ak04605259264Ak04605259264
Ak04605259264
 
PERFORMANCE OF ITERATIVE LDPC-BASED SPACE-TIME TRELLIS CODED MIMO-OFDM SYSTEM...
PERFORMANCE OF ITERATIVE LDPC-BASED SPACE-TIME TRELLIS CODED MIMO-OFDM SYSTEM...PERFORMANCE OF ITERATIVE LDPC-BASED SPACE-TIME TRELLIS CODED MIMO-OFDM SYSTEM...
PERFORMANCE OF ITERATIVE LDPC-BASED SPACE-TIME TRELLIS CODED MIMO-OFDM SYSTEM...
 
Cbls thuan
Cbls thuanCbls thuan
Cbls thuan
 
Implementação do Hash Coalha/Coalesced
Implementação do Hash Coalha/CoalescedImplementação do Hash Coalha/Coalesced
Implementação do Hash Coalha/Coalesced
 
Advanced Data Structures 2006
Advanced Data Structures 2006Advanced Data Structures 2006
Advanced Data Structures 2006
 
Pauls klein 2011-lm_paper(3)
Pauls klein 2011-lm_paper(3)Pauls klein 2011-lm_paper(3)
Pauls klein 2011-lm_paper(3)
 
Analysis of LDPC Codes under Wi-Max IEEE 802.16e
Analysis of LDPC Codes under Wi-Max IEEE 802.16eAnalysis of LDPC Codes under Wi-Max IEEE 802.16e
Analysis of LDPC Codes under Wi-Max IEEE 802.16e
 
Arabic named entity recognition using deep learning approach
Arabic named entity recognition using deep learning approachArabic named entity recognition using deep learning approach
Arabic named entity recognition using deep learning approach
 
Latent dirichletallocation presentation
Latent dirichletallocation presentationLatent dirichletallocation presentation
Latent dirichletallocation presentation
 
Final
FinalFinal
Final
 
High Speed Decoding of Non-Binary Irregular LDPC Codes Using GPUs (Paper)
High Speed Decoding of Non-Binary Irregular LDPC Codes Using GPUs (Paper)High Speed Decoding of Non-Binary Irregular LDPC Codes Using GPUs (Paper)
High Speed Decoding of Non-Binary Irregular LDPC Codes Using GPUs (Paper)
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)
 
Compact binary-tree-representation-of-logic-function-with-enhanced-throughput-
Compact binary-tree-representation-of-logic-function-with-enhanced-throughput-Compact binary-tree-representation-of-logic-function-with-enhanced-throughput-
Compact binary-tree-representation-of-logic-function-with-enhanced-throughput-
 
Integrative Parallel Programming in HPC
Integrative Parallel Programming in HPCIntegrative Parallel Programming in HPC
Integrative Parallel Programming in HPC
 
Software tookits for machine learning and graphical models
Software tookits for machine learning and graphical modelsSoftware tookits for machine learning and graphical models
Software tookits for machine learning and graphical models
 
Performance analysis and implementation for nonbinary quasi cyclic ldpc decod...
Performance analysis and implementation for nonbinary quasi cyclic ldpc decod...Performance analysis and implementation for nonbinary quasi cyclic ldpc decod...
Performance analysis and implementation for nonbinary quasi cyclic ldpc decod...
 
V25112115
V25112115V25112115
V25112115
 
Reduced Energy Min-Max Decoding Algorithm for Ldpc Code with Adder Correction...
Reduced Energy Min-Max Decoding Algorithm for Ldpc Code with Adder Correction...Reduced Energy Min-Max Decoding Algorithm for Ldpc Code with Adder Correction...
Reduced Energy Min-Max Decoding Algorithm for Ldpc Code with Adder Correction...
 
An Approach for Constructing a Domain Definition Metamodel with ATL
An Approach for Constructing a Domain Definition Metamodel with ATLAn Approach for Constructing a Domain Definition Metamodel with ATL
An Approach for Constructing a Domain Definition Metamodel with ATL
 
Aes cryptography algorithm based on intelligent blum blum-shub prn gs publica...
Aes cryptography algorithm based on intelligent blum blum-shub prn gs publica...Aes cryptography algorithm based on intelligent blum blum-shub prn gs publica...
Aes cryptography algorithm based on intelligent blum blum-shub prn gs publica...
 

Viewers also liked

De drijvende kracht, je vrijwilligers
De drijvende kracht, je vrijwilligersDe drijvende kracht, je vrijwilligers
De drijvende kracht, je vrijwilligersPower2Improve
 
Mohamed Helaly-2016
Mohamed Helaly-2016Mohamed Helaly-2016
Mohamed Helaly-2016helaly2010
 
Ed Hardy Blackberry accessories
Ed Hardy Blackberry accessoriesEd Hardy Blackberry accessories
Ed Hardy Blackberry accessoriesMaverick Inc.
 
INTERNATIONAL DAY FOR THE ELIMINATION OF VIOLENCE AGAINST WOMEN - 25th NOVEMBER
INTERNATIONAL DAY FOR THE ELIMINATION OF VIOLENCE AGAINST WOMEN - 25th NOVEMBERINTERNATIONAL DAY FOR THE ELIMINATION OF VIOLENCE AGAINST WOMEN - 25th NOVEMBER
INTERNATIONAL DAY FOR THE ELIMINATION OF VIOLENCE AGAINST WOMEN - 25th NOVEMBERmceide
 
Mceider Pac1
Mceider Pac1Mceider Pac1
Mceider Pac1mceide
 
Maqueta EVA - Hipatia
Maqueta EVA - HipatiaMaqueta EVA - Hipatia
Maqueta EVA - Hipatiamceide
 
Espacios en-tu-vida
Espacios en-tu-vidaEspacios en-tu-vida
Espacios en-tu-vidaepacheco1
 
Iba ba skills 160621 handout
Iba ba skills 160621 handoutIba ba skills 160621 handout
Iba ba skills 160621 handoutPower2Improve
 
Nov25th violence againstwomen
Nov25th violence againstwomenNov25th violence againstwomen
Nov25th violence againstwomenmceide
 
Milan kukrika pecb certifikati
Milan kukrika pecb certifikatiMilan kukrika pecb certifikati
Milan kukrika pecb certifikatiMilan Kukrika
 
Sosiologi Pendidikan Madin
Sosiologi Pendidikan MadinSosiologi Pendidikan Madin
Sosiologi Pendidikan Madingueste58fab07
 
Griffin presentation email
Griffin presentation emailGriffin presentation email
Griffin presentation emailMaverick Inc.
 
Ed Hardy iPhone accessories
Ed Hardy iPhone accessoriesEd Hardy iPhone accessories
Ed Hardy iPhone accessoriesMaverick Inc.
 
Aula Oberta Ítaca 08-09
Aula Oberta Ítaca 08-09Aula Oberta Ítaca 08-09
Aula Oberta Ítaca 08-09mceide
 
TEANO - PREVISIÓ DE PROBLEMES
TEANO - PREVISIÓ DE PROBLEMESTEANO - PREVISIÓ DE PROBLEMES
TEANO - PREVISIÓ DE PROBLEMESmceide
 
Sosiologi Pendidikan Madin
Sosiologi Pendidikan MadinSosiologi Pendidikan Madin
Sosiologi Pendidikan Madingueste58fab07
 
Mceider Pac1
Mceider Pac1Mceider Pac1
Mceider Pac1mceide
 

Viewers also liked (18)

De drijvende kracht, je vrijwilligers
De drijvende kracht, je vrijwilligersDe drijvende kracht, je vrijwilligers
De drijvende kracht, je vrijwilligers
 
Mohamed Helaly-2016
Mohamed Helaly-2016Mohamed Helaly-2016
Mohamed Helaly-2016
 
Ed Hardy Blackberry accessories
Ed Hardy Blackberry accessoriesEd Hardy Blackberry accessories
Ed Hardy Blackberry accessories
 
INTERNATIONAL DAY FOR THE ELIMINATION OF VIOLENCE AGAINST WOMEN - 25th NOVEMBER
INTERNATIONAL DAY FOR THE ELIMINATION OF VIOLENCE AGAINST WOMEN - 25th NOVEMBERINTERNATIONAL DAY FOR THE ELIMINATION OF VIOLENCE AGAINST WOMEN - 25th NOVEMBER
INTERNATIONAL DAY FOR THE ELIMINATION OF VIOLENCE AGAINST WOMEN - 25th NOVEMBER
 
file
filefile
file
 
Mceider Pac1
Mceider Pac1Mceider Pac1
Mceider Pac1
 
Maqueta EVA - Hipatia
Maqueta EVA - HipatiaMaqueta EVA - Hipatia
Maqueta EVA - Hipatia
 
Espacios en-tu-vida
Espacios en-tu-vidaEspacios en-tu-vida
Espacios en-tu-vida
 
Iba ba skills 160621 handout
Iba ba skills 160621 handoutIba ba skills 160621 handout
Iba ba skills 160621 handout
 
Nov25th violence againstwomen
Nov25th violence againstwomenNov25th violence againstwomen
Nov25th violence againstwomen
 
Milan kukrika pecb certifikati
Milan kukrika pecb certifikatiMilan kukrika pecb certifikati
Milan kukrika pecb certifikati
 
Sosiologi Pendidikan Madin
Sosiologi Pendidikan MadinSosiologi Pendidikan Madin
Sosiologi Pendidikan Madin
 
Griffin presentation email
Griffin presentation emailGriffin presentation email
Griffin presentation email
 
Ed Hardy iPhone accessories
Ed Hardy iPhone accessoriesEd Hardy iPhone accessories
Ed Hardy iPhone accessories
 
Aula Oberta Ítaca 08-09
Aula Oberta Ítaca 08-09Aula Oberta Ítaca 08-09
Aula Oberta Ítaca 08-09
 
TEANO - PREVISIÓ DE PROBLEMES
TEANO - PREVISIÓ DE PROBLEMESTEANO - PREVISIÓ DE PROBLEMES
TEANO - PREVISIÓ DE PROBLEMES
 
Sosiologi Pendidikan Madin
Sosiologi Pendidikan MadinSosiologi Pendidikan Madin
Sosiologi Pendidikan Madin
 
Mceider Pac1
Mceider Pac1Mceider Pac1
Mceider Pac1
 

Similar to 10.1.1.70.8789

Abstracting Strings For Model Checking Of C Programs
Abstracting Strings For Model Checking Of C ProgramsAbstracting Strings For Model Checking Of C Programs
Abstracting Strings For Model Checking Of C ProgramsMartha Brown
 
Parallel Materialisation of Datalog Programs in Centralised, Main-Memory RDF ...
Parallel Materialisation of Datalog Programs in Centralised, Main-Memory RDF ...Parallel Materialisation of Datalog Programs in Centralised, Main-Memory RDF ...
Parallel Materialisation of Datalog Programs in Centralised, Main-Memory RDF ...DBOnto
 
Phenoflow: An Architecture for Computable Phenotypes
Phenoflow: An Architecture for Computable PhenotypesPhenoflow: An Architecture for Computable Phenotypes
Phenoflow: An Architecture for Computable PhenotypesMartin Chapman
 
SELFLESS INHERITANCE
SELFLESS INHERITANCESELFLESS INHERITANCE
SELFLESS INHERITANCEijpla
 
Sem facet paper
Sem facet paperSem facet paper
Sem facet paperDBOnto
 
SemFacet paper
SemFacet paperSemFacet paper
SemFacet paperDBOnto
 
Using Met-modeling Graph Grammars and R-Maude to Process and Simulate LRN Models
Using Met-modeling Graph Grammars and R-Maude to Process and Simulate LRN ModelsUsing Met-modeling Graph Grammars and R-Maude to Process and Simulate LRN Models
Using Met-modeling Graph Grammars and R-Maude to Process and Simulate LRN ModelsWaqas Tariq
 
Fuzzy formal concept analysis: Approaches, applications and issues
Fuzzy formal concept analysis: Approaches, applications and issuesFuzzy formal concept analysis: Approaches, applications and issues
Fuzzy formal concept analysis: Approaches, applications and issuesCSITiaesprime
 
AN AI PLANNING APPROACH FOR GENERATING BIG DATA WORKFLOWS
AN AI PLANNING APPROACH FOR GENERATING BIG DATA WORKFLOWSAN AI PLANNING APPROACH FOR GENERATING BIG DATA WORKFLOWS
AN AI PLANNING APPROACH FOR GENERATING BIG DATA WORKFLOWSgerogepatton
 
An ai planning approach for generating
An ai planning approach for generatingAn ai planning approach for generating
An ai planning approach for generatingijaia
 
IEEE Datamining 2016 Title and Abstract
IEEE  Datamining 2016 Title and AbstractIEEE  Datamining 2016 Title and Abstract
IEEE Datamining 2016 Title and Abstracttsysglobalsolutions
 
5 parallel implementation 06299286
5 parallel implementation 062992865 parallel implementation 06299286
5 parallel implementation 06299286Ninad Samel
 
PEC - AN ALTERNATE AND MORE EFFICIENT PUBLIC KEY CRYPTOSYSTEM
PEC - AN ALTERNATE AND MORE EFFICIENT PUBLIC KEY CRYPTOSYSTEMPEC - AN ALTERNATE AND MORE EFFICIENT PUBLIC KEY CRYPTOSYSTEM
PEC - AN ALTERNATE AND MORE EFFICIENT PUBLIC KEY CRYPTOSYSTEMijcisjournal
 
NLP Project: Paragraph Topic Classification
NLP Project: Paragraph Topic ClassificationNLP Project: Paragraph Topic Classification
NLP Project: Paragraph Topic ClassificationEugene Nho
 
Ijarcet vol-2-issue-2-676-678
Ijarcet vol-2-issue-2-676-678Ijarcet vol-2-issue-2-676-678
Ijarcet vol-2-issue-2-676-678Editor IJARCET
 
Adcom2006 Full 6
Adcom2006 Full 6Adcom2006 Full 6
Adcom2006 Full 6umavanth
 

Similar to 10.1.1.70.8789 (20)

Abstracting Strings For Model Checking Of C Programs
Abstracting Strings For Model Checking Of C ProgramsAbstracting Strings For Model Checking Of C Programs
Abstracting Strings For Model Checking Of C Programs
 
Parallel Materialisation of Datalog Programs in Centralised, Main-Memory RDF ...
Parallel Materialisation of Datalog Programs in Centralised, Main-Memory RDF ...Parallel Materialisation of Datalog Programs in Centralised, Main-Memory RDF ...
Parallel Materialisation of Datalog Programs in Centralised, Main-Memory RDF ...
 
Phenoflow: An Architecture for Computable Phenotypes
Phenoflow: An Architecture for Computable PhenotypesPhenoflow: An Architecture for Computable Phenotypes
Phenoflow: An Architecture for Computable Phenotypes
 
SELFLESS INHERITANCE
SELFLESS INHERITANCESELFLESS INHERITANCE
SELFLESS INHERITANCE
 
Sem facet paper
Sem facet paperSem facet paper
Sem facet paper
 
SemFacet paper
SemFacet paperSemFacet paper
SemFacet paper
 
Marvin_Capstone
Marvin_CapstoneMarvin_Capstone
Marvin_Capstone
 
Using Met-modeling Graph Grammars and R-Maude to Process and Simulate LRN Models
Using Met-modeling Graph Grammars and R-Maude to Process and Simulate LRN ModelsUsing Met-modeling Graph Grammars and R-Maude to Process and Simulate LRN Models
Using Met-modeling Graph Grammars and R-Maude to Process and Simulate LRN Models
 
Fuzzy formal concept analysis: Approaches, applications and issues
Fuzzy formal concept analysis: Approaches, applications and issuesFuzzy formal concept analysis: Approaches, applications and issues
Fuzzy formal concept analysis: Approaches, applications and issues
 
TEXT CLUSTERING.doc
TEXT CLUSTERING.docTEXT CLUSTERING.doc
TEXT CLUSTERING.doc
 
Dawak f v.6camera-1
Dawak f v.6camera-1Dawak f v.6camera-1
Dawak f v.6camera-1
 
AN AI PLANNING APPROACH FOR GENERATING BIG DATA WORKFLOWS
AN AI PLANNING APPROACH FOR GENERATING BIG DATA WORKFLOWSAN AI PLANNING APPROACH FOR GENERATING BIG DATA WORKFLOWS
AN AI PLANNING APPROACH FOR GENERATING BIG DATA WORKFLOWS
 
An ai planning approach for generating
An ai planning approach for generatingAn ai planning approach for generating
An ai planning approach for generating
 
IEEE Datamining 2016 Title and Abstract
IEEE  Datamining 2016 Title and AbstractIEEE  Datamining 2016 Title and Abstract
IEEE Datamining 2016 Title and Abstract
 
5 parallel implementation 06299286
5 parallel implementation 062992865 parallel implementation 06299286
5 parallel implementation 06299286
 
PEC - AN ALTERNATE AND MORE EFFICIENT PUBLIC KEY CRYPTOSYSTEM
PEC - AN ALTERNATE AND MORE EFFICIENT PUBLIC KEY CRYPTOSYSTEMPEC - AN ALTERNATE AND MORE EFFICIENT PUBLIC KEY CRYPTOSYSTEM
PEC - AN ALTERNATE AND MORE EFFICIENT PUBLIC KEY CRYPTOSYSTEM
 
ME Synopsis
ME SynopsisME Synopsis
ME Synopsis
 
NLP Project: Paragraph Topic Classification
NLP Project: Paragraph Topic ClassificationNLP Project: Paragraph Topic Classification
NLP Project: Paragraph Topic Classification
 
Ijarcet vol-2-issue-2-676-678
Ijarcet vol-2-issue-2-676-678Ijarcet vol-2-issue-2-676-678
Ijarcet vol-2-issue-2-676-678
 
Adcom2006 Full 6
Adcom2006 Full 6Adcom2006 Full 6
Adcom2006 Full 6
 

10.1.1.70.8789

  • 1. Enabling Data Mining Systems to Semantic Web Applications Francesca A. Lisi Dipartimento di Informatica, Universit` degli Studi di Bari, a Via E. Orabona 4, I-70125 Bari, Italy Email: lisi@di.uniba.it Abstract— Semantic Web Mining can be considered as Data of frequent pattern discovery [3]. It implements a framework Mining (DM) for/from the Semantic Web. Current DM systems for learning Semantic Web rules [4] which adopts AL-log could serve the purpose of Semantic Web Mining if they were [5] as the Knowledge Representation and Reasoning (KR&R) more compliant with, e.g., the standards of representation for ontologies and rules in the Semantic Web and/or interoperable setting and Inductive Logic Programming (ILP) [6] as the with well-established tools for Ontological Engineering (OE) that methodological apparatus. support these standards. In this paper we present a middleware, Semantic Web Mining [7] is a new application area which SW ING, that integrates the DM system AL-Q U I N and the OE aims at combining the two areas of Semantic Web [8] and Web tool Prot´ g´ -2000 in order to enable AL-Q U I N to Semantic Web e e Mining [9] from a twofold perspective. On one hand, the new applications. This showcase suggests a methodology for building Semantic Web Mining systems. semantic structures in the Web can be exploited to improve the results of Web Mining. On the other hand, the results of I. I NTRODUCTION Web Mining can be used for building the Semantic Web. Most Data Mining (DM) is an application area arisen in the 1990s work in Semantic Web Mining simply extends previous work at the intersection of several different research fields, notably to the new application context. E.g., Maedche and Staab [10] Statistics, Machine Learning and Databases, as soon as devel- apply a well-known algorithm for association rule mining to opments in sensing, communications and storage technologies discover conceptual relations from text. Indeed, we argue that made it possible to collect and store large collections of scien- Semantic Web Mining can be considered as DM for/from the tific and commercial data [1]. The abilities to analyze such data Semantic Web. Current DM systems could serve the purpose sets had not developed as fast. Research in DM can be loosely of Semantic Web Mining if they were more compliant with, defined as the study of methods, techniques and algorithms for e.g., the standards of representation for ontologies and rules in finding models or patterns that are interesting or valuable in the Semantic Web and/or interoperable with well-established large data sets. The space of patterns if often infinite, and the tools for Ontological Engineering (OE) [11], e.g. Prot´ g´ -2000 e e enumeration of patterns involves some form of search in one [12], that support these standards. such space. Practical computational constraints place severe In this paper we present a middleware, SW ING, that inte- limits on the subspace that can be explored by a data mining grates AL-Q U I N and Prot´ g´ -2000 in order to enable Seman- e e algorithm. The goal of DM is either prediction or description. tic Web applications of AL-Q U I N. This solution suggests a Prediction involves using some variables or fields in the methodology for building Semantic Web Mining systems, i.e. database to predict unknown or future values of other variables the upgrade of existing DM systems with facilities provided of interest. Description focuses on finding human-interpretable by interoperable OE tools. patterns describing data. Among descriptive tasks, data sum- The paper is structured as follows. Section II and III marization aims at the extraction of compact patterns that briefly introduce AL-Q U I N and Prot´ g´ -2000 respectively. e e describe subsets of data. There are two classes of methods Section IV presents the middleware SW ING. Section V draws which represent taking horizontal (cases) and vertical (fields) conclusions and outlines directions of future work. slices of the data. In the former, one would like to produce summaries of subsets, e.g. producing sufficient statistics or II. T HE DM SYSTEM AL-Q U I N logical conditions that hold for subsets. In the latter case, one The system AL-Q U I N [2] (a previous version is described would like to describe relations between fields. This class of in [13]) supports a variant of the DM task of frequent pattern methods is distinguished from the above in that rather than discovery. In DM a pattern is considered as an intensional predicting the value of a specified field (e.g., classification) description (expressed in a given language L) of a subset or grouping cases together (e.g. clustering) the goal is to find of r. The support of a pattern is the relative frequency of relations between fields. One common output of this vertical the pattern within r and is computed with the evaluation data summarization is called frequent (association) patterns. function supp. The task of frequent pattern discovery aims at These patterns state that certain combinations of values occur the extraction of all frequent patterns, i.e. all patterns whose in a given database with a support greater than a user-defined support exceeds a user-defined threshold of minimum support. threshold. The system AL-Q U I N [2] supports the DM task The blueprint of most algorithms for frequent pattern discovery
  • 2. form of DATALOG [16] that is obtained by using ALC concept assertions essentially as type constraints on variables. The portion K of B which encompasses the whole Σ and the intensional part (IDB) of Π is considered as background knowledge. The extensional part of Π is partitioned into portions Ai each of which refers to an individual ai of Cref . The link between Ai and ai is represented with the DATALOG literal q(ai ). The pair (q(ai ), Ai ) is called observation. The language L = {Ll }1≤l≤maxG of patterns allows for the generation of AL-log unary conjunctive queries, called O-queries. Given a reference concept Cref , an O-query Q to an AL-log knowledge base B is a (linked and connected)1 constrained DATALOG clause of the form Fig. 1. Organization of the hybrid knowledge bases used in AL-Q U I N. Q = q(X) ← α1 , . . . , αm &X : Cref , γ1 , . . . , γn where X is the distinguished variable and the remaining variables occurring in the body of Q are the existential is the levelwise search [3]. It is based on the following variables. Note that αj , 1 ≤ j ≤ m, is a DATALOG literal assumption: If a generality order for the language L of whereas γk , 1 ≤ k ≤ n, is an assertion that constrains a patterns can be found such that is monotonic w.r.t. supp, variable already appearing in any of the αj ’s to vary in the then the resulting space (L, ) can be searched breadth-first range of individuals of a concept defined in B. The O-query starting from the most general pattern in L and by alternating Qt = q(X) ← &X : Cref candidate generation and candidate evaluation phases. In particular, candidate generation consists of a refinement step is called trivial for L because it only contains the constraint followed by a pruning step. The former derives candidates for the distinguished variable X. Furthermore the language for the current search level from patterns found frequent in L is multi-grained, i.e. it contains expressions at multiple the previous search level. The latter allows some infrequent levels of description granularity. Indeed it is implicitly defined patterns to be detected and discarded prior to evaluation thanks by a declarative bias specification which consists of a finite to the monotonicity of . alphabet A of DATALOG predicate names and finite alphabets The variant of the frequent pattern discovery problem which Γl (one for each level l of description granularity) of ALC is solved by AL-Q U I N takes concept hierarchies into account concept names. Note that the αi ’s are taken from A and γj ’s during the discovery process [14], thus yielding descriptions are taken from Γl . We impose L to be finite by specifying of a data set r at multiple granularity levels up to a maximum some bounds, mainly maxD for the maximum depth of search level maxG. More formally, given and maxG for the maximum level of granularity. The support of an O-query Q ∈ Ll w.r.t an AL-log • a data set r including a taxonomy T where a reference knowledge base B is defined as concept Cref and task-relevant concepts are designated, • a multi-grained language {Ll }1≤l≤maxG of patterns supp(Q, B) =| answerset(Q, B) | / | answerset(Qt , B) | • a set {minsupl }1≤l≤maxG of minimum support thresh- where Qt is the trivial O-query for L. The computation of olds support relies on query answering in AL-log. Indeed, an the problem of frequent pattern discovery at l levels of answer to an O-query Q is a ground substitution θ for the description granularity, 1 ≤ l ≤ maxG, is to find the set distinguished variable of Q. An answer θ to an O-query Q is F of all the patterns P ∈ Ll frequent in r, namely P ’s with a correct (resp. computed) answer w.r.t. an AL-log knowledge support s such that (i) s ≥ minsupl and (ii) all ancestors of base B if there exists at least one correct (resp. computed) P w.r.t. T are frequent. Note that a pattern Q is considered answer to body(Q)θ w.r.t. B. Therefore proving that an O- to be an ancestor of P if it is a coarser-grained version of P . query Q covers an observation (q(ai ), Ai ) w.r.t. K equals to In AL-Q U I N (AL-log Q Uery I Nduction) the data set r is proving that θi = {X/ai } is a correct answer to Q w.r.t. represented as an AL-log knowledge base B and structured Bi = K ∪ Ai . as illustrated in Figure 1. The structural subsystem Σ is based The system AL-Q U I N implements the aforementioned lev- on ALC [15] and allows for the specification of knowledge in elwise search method for frequent pattern discovery. In par- terms of classes (concepts), binary relations between classes ticular, candidate patterns of a certain level k (called k- (roles), and instances (individuals). In particular, the TBox T patterns) are obtained by refinement of the frequent patterns contains is-a relations between concepts (axioms) whereas the discovered at level k − 1. In AL-Q U I N patterns are ordered ABox M contains instance-of relations between individuals according to B-subsumption (which has been proved to fulfill (resp. couples of individuals) and concepts (resp. roles) (as- sertions). The relational subsystem Π is based on an extended 1 For the definition of linkedness and connectedness see [6].
  • 3. the abovementioned condition of monotonicity [13]). The search starts from the most general pattern in L and iterates through the generation-evaluation cycle for a number of times that is bounded with respect to both the granularity level l (maxG) and the depth level k (maxD). Since AL-Q U I N is implemented with Prolog, the internal representation language in AL-Q U I N is a kind of DATALOGOI [17], i.e. the subset of DATALOG= equipped with an equational theory that consists of the axioms of Clark’s Equality Theory augmented with one rewriting rule that adds inequality atoms s = t to any P ∈ L for each pair (s, t) of distinct terms occurring in P . Note that concept assertions are rendered as membership atoms, e.g. a : C becomes c C(a). ´ ´ III. T HE OE TOOL P ROT E G E -2000 Fig. 2. Architecture of the OWL Plugin for Prot´ g´ -2000. e e Prot´ g´ -20002 [18] is the latest version of the Prot´ g´ line e e e e of tools, created by the Stanford Medical Informatics (SMI) group at Stanford University, USA. It has a community of thousands of users. Although the development of Prot´ g´ has e e historically been mainly driven by biomedical applications, the Prot´ g´ -2000’s form editor, where users can select alternative e e system is domain-independent and has been successfully used user interface widgets for their project. The user interface for many other application areas as well. Prot´ g´ -2000 is a e e consists of panels (tabs) for editing classes, properties, forms Java-based standalone application to be installed and run in a and instances. local computer. The core of this application is the ontology Prot´ g´ -2000 has an extensible architecture, i.e. an architec- e e editor. Like most other modeling tools, the architecture of ture that allows special-purpose extensions (aka plug-ins) to be Prot´ g´ -2000 is cleanly separated into a model part and a e e easily integrated. These extensions usually perform functions view part. Prot´ g´ -2000’s model is the internal representation e e not provided by the Prot´ g´ -2000 standard distribution (other e e mechanism for ontologies and knowledge bases. Prot´ g´ - e e types of visualization, new import and export formats, etc.), 2000’s view components provide a Graphical User Interface implement applications that use Prot´ g´ -2000 ontologies, or e e (GUI) to display and manipulate the underlying model. allow configuring the ontology editor. Most of these plug- Prot´ g´ -2000’s model is based on a simple yet flexible e e ins are available in the Prot´ g´ -2000 Plug-in Library, where e e metamodel [12], which is comparable to object-oriented and contributions from many different research groups can be frame-based systems. It basically can represent ontologies found. One of the most popular in this library is the OWL consisting of classes, properties (slots), property characteristics Plugin [19]. (facets and constraints), and instances. Prot´ g´ -2000 provides e e As illustrated in Figure 2, the OWL Plugin extends the an open Java API to query and manipulate models. An Prot´ g´ -2000 model and its API with classes to represent the e e important strength of Prot´ g´ -2000 is that the Prot´ g´ -2000 e e e e OWL3 specification. In particular it supports RDF(S), OWL metamodel itself is a Prot´ g´ -2000 ontology, with classes that e e Lite, OWL DL (except for anonymous global class axioms, represent classes, properties, and so on. For example, the de- which need to be given a name by the user) and significant fault class in the Protege base system is called :STANDARD- parts of OWL Full (including metaclasses). The OWL API CLASS, and has properties such as :NAME and :DIRECT- basically encapsulates the internal mapping and thus shields SUPERCLASSES. This structure of the metamodel enables the user from error-prone low-level access. Furthermore the easy extension and adaption to other representations. OWL Plugin provides a comprehensive mapping between its Using the views of Prot´ g´ -2000’s GUI, ontology designers e e extended API and the standard OWL parsing library Jena4 . The basically create classes, assign properties to the classes, and presence of a secondary representation of an OWL ontology then restrict the properties facets at certain classes. Using in terms of Jena objects means that the user is able to invoke the resulting ontologies, Prot´ g´ -2000 is able to automatically e e arbitrary Jena-based services such as interfaces to classifiers, generate user interfaces that support the creation of individuals query languages, or visualization tools permanently. Based on (instances). For each class in the ontology, the system creates the above mentioned metamodel and API extensions, the OWL one form with editing components (widgets) for each property Plugin provides several custom-tailored GUI components for of the class. For example, for properties that can take single OWL. Also it can directly access DL reasoners such as string values, the system would by default provide a text field RACER [20]. Finally it can be further extended, e.g. to support widget. The generated forms can be further customized with OWL-based languages like SWRL5 . 2 The distribution of interest to this work is 3.0 (February 2005), freely 3 http://www.w3.org/2004/OWL/ 4 http://jena.sourceforge.net available at http://protege.stanford.edu/ under the Mozilla open- source license. 5 http://www.w3.org/Submission/SWRL/
  • 4. the parameter settings. One of these findings is the pattern Q2 which turns out to be frequent because it has support supp(Q2 , BCIA ) = 13% (≥ minsup2 ). This has to be read as ’13 % of Middle East countries speak an Indoeuropean language’. ♦ Fig. 3. Architecture and I/O of SW ING. IV. E NABLING AL-Q U I N TO S EMANTIC W EB ´ ´ APPLICATIONS WITH P ROT E G E -2000 To enable AL-Q U I N to Semantic Web applications we have developed a software component, SW ING, that assists users of AL-Q U I N in the design of Semantic Web Mining sessions. As illustrated in Figure 3, SW ING is a middleware because it interoperates via API with the OWL Plugin for Prot´ g´ -2000 e e to benefit from its facilities for browsing and reasoning on OWL ontologies. Fig. 4. SW ING: step of concept selection. Example IV.1. The screenshots reported in Figure 4, 5, 6 and 7 refer to a Semantic Web Mining session with SW ING for the task of finding frequent patterns in the on-line CIA World Fact Book6 (data set) that describe Middle East coun- tries (reference concept) w.r.t. the religions believed and the languages spoken (task-relevant concepts) at three levels of granularity (maxG = 3). To this aim we define LCIA as the set of O-queries with Cref = MiddleEastCountry that can be generated from the alphabet A= {believes/2, speaks/2} of DATALOG binary predicate names, and the alphabets Γ1 = {Language, Religion} Γ2 = {IndoEuropeanLanguage, . . . , MonotheisticReligion, . . .} Γ3 = {IndoIranianLanguage, . . . , MuslimReligion, . . .} of ALC concept names for 1 ≤ l ≤ 3, up to maxD = 5. Examples of O-queries in LCIA are: Fig. 5. SW ING: step of relation selection. Qt = q(X) ← & X:MiddleEastCountry Q1 = q(X) ← speaks(X,Y) & X:MiddleEastCountry, Y:Language A wizard provides guidance for the selection of the (hybrid) Q2 = q(X) ← speaks(X,Y) & data set to be mined, the selection of the reference concept X:MiddleEastCountry, Y:IndoEuropeanLanguage and the task-relevant concepts (see Figure 4), the selection Q3 = q(X) ← believes(X,Y)& of the relations - among the ones appearing in the relational X:MiddleEastCountry, Y:MuslimReligion component of the data set chosen or derived from them - where Qt is the trivial O-query for LCIA , Q1 ∈ L1 , Q2 ∈ CIA with which the task-relevant concepts can be linked to the L2 , and Q3 ∈ L3 . Note that Q1 is an ancestor of Q2 . CIA CIA reference concept in the patterns to be discovered (see Figure Minimum support thresholds are set to the following values: 5 and 6), the setting of minimum support thresholds for each minsup1 = 20%, minsup2 = 13%, and minsup3 = 10%. level of description granularity and of several other parameters After maxD = 5 search stages, AL-Q U I N returns 53 fre- required by AL-Q U I N. These user preferences are collected quent patterns out of 99 candidate patterns compliant with in a file (see ouput file *.lb in Figure 3) that is shown in preview to the user at the end of the assisted procedure for 6 http://www.odci.gov/cia/publications/factbook/ confirmation (see Figure 7).
  • 5. IndoEuropeanLanguage Language. IndoIranianLanguage IndoEuropeanLanguage. MonotheisticReligion Religion. MuslimReligion MonotheisticReligion. and membership assertions such as ’IR’:AsianCountry. ’Arab’:MiddleEastEthnicGroup. <’IR’,’Arab’>:Hosts. ’Persian’:IndoIranianLanguage. ’ShiaMuslim’:MuslimReligion. ’SunniMuslim’:MuslimReligion. that define taxonomies for the concepts Country, EthnicGroup, Language and Religion. Note that Middle East countries (concept MiddleEastCountry) have been Fig. 6. SW ING: editing of derived relations. defined as Asian countries that host at least one Middle Eastern ethnic group. In particular, Iran (’IR’) is classified as Middle East country. Since Cref =MiddleEastCountry, the DATALOG database is partitioned according to the individuals of MiddleEastCountry. In particular, the observation (q(’IR’), AIR ) contains DATALOG facts such as language(’IR’,’Persian’,58). religion(’IR’,’ShiaMuslim’,89). religion(’IR’,’SunniMuslim’,10). concerning the individual ’IR’. ♦ The output file *.db contains the input DATALOG database eventually enriched with an intensional part. The editing of derived relations (see Figure 6) is accessible from the step of relation selection (see Figure 5). Fig. 7. SW ING: preview of the language bias specification. Example IV.3. The output DATALOG database cia exp1.db for Example IV.1 enriches the input DATALOG database cia exp1.edb with the following two clauses: A. A closer look to the I/O speaks(Code, Lang)← language(Code,Lang,Perc), The input to SW ING is a hybrid knowledge base that c Country(Code), c Language(Lang). consists of an ontological data source - expressed as a OWL believes(Code, Rel)←religion(Code,Rel,Perc), file - and a relational data source - also available on the Web c Country(Code), c Religion(Rel). - integrated with each other. that define views on the relations language and religion Example IV.2. The knowledge base BCIA for the Semantic respectively. Note that they correspond to the constrained Web Mining session of Example IV.1 integrates an OWL DATALOG clauses ontology (file cia exp1.owl) with a DATALOG database (file speaks(Code, Lang)← language(Code,Lang,Perc) & cia exp1.edb) containing facts7 extracted from the on-line Code:Country, Lang:Language. 1996 CIA World Fact Book. The OWL ontology8 contains believes(Code, Rel)←religion(Code,Rel,Perc) & axioms such as Code:Country, Rel:Religion. AsianCountry Country. and represent the intensional part of ΠCIA . ♦ MiddleEastEthnicGroup EthnicGroup. MiddleEastCountry ≡ The output file *.lb contains the declarative bias specifica- AsianCountry ∃Hosts.MiddleEastEthnicGroup. tion for the language of patterns and other directives. 7 http://www.dbis.informatik.uni-goettingen.de/Mondial/ Example IV.4. With reference to Example IV.1, the content mondial-rel-facts.flp of cia exp1.lb (see Figure 7) defines - among the other 8 In the following we shall use the corresponding DL notation things - the language LCIA of patterns. In particular the first
  • 6. 5 directives define the reference concept, the task-relevant ... concepts and and the relations between concepts. ♦ hierarchy(c Language,3,c IndoEuropeanLanguage, [c IndoIranianLanguage, c SlavicLanguage]). The output files *.abox n and *.tbox are the side effect of hierarchy(c Language,3,c UralAltaicLanguage, the step of concept selection as illustrated in the next section. [c TurkicLanguage]). Note that these files together with the intensional part of the hierarchy(c Religion,3,c MonotheisticReligion, *.db file form the background knowledge K for AL-Q U I N. [c ChristianReligion, c JewishReligion, c MuslimReligion]). B. A look inside the step of concept selection for the layer T 3 . ♦ The step of concept selection deserves further remarks OI because it actually exploits the services offered by Prot´ g´ - e e Note that the translation from OWL to DATALOG is 2000. Indeed it also triggers some supplementary computation possible because we assume that all the concepts are named. aimed at making a OWL background knowledge Σ usable This means that an equivalence axiom is required for each by AL-Q U I N. To achieve this goal, it supplies the following complex concept in the knowledge base. Equivalence axioms functionalities: help keeping concept names (used within constrained DATA - LOG clauses) independent from concept definitions. • levelwise retrieval w.r.t. Σ • translation of both (asserted and derived) concept asser- V. C ONCLUSION tions and subsumption axioms of Σ to DATALOGOI facts The latter relies on the former, meaning that the results of the The middleware SW ING supplies several facilities to AL- levelwise retrieval are exported to DATALOGOI (see output Q U I N, primarily facilities for compiling OWL down to files *.abox n and *.tbox in Figure 3). The retrieval problem DATALOG. Note that DATALOG is the usual KR&R setting is known in DLs literature as the problem of retrieving all the for ILP. In this respect, the pre-processing method proposed by individuals of a concept C [21]. Here, the retrieval is called Kietz [22] to enable ILP systems to work within the framework levelwise because it follows the layering of T : individuals of of the hybrid KR&R system CARIN [23] is related to ours concepts belonging to the l-th layer T l of T are retrieved all but it lacks an application. Analogously, the method proposed together. in [24] for translating OWL to disjunctive DATALOG is far too general with respect to the specific needs of our applica- Example IV.5. The DATALOGOI rewriting of the concept tion. Rather, the proposal of interfacing existing reasoners to assertions derived for T 2 produces facts like: combine ontologies and rules [25] is more similar to ours in c AfroAsiaticLanguage(’Arabic’). the spirit. Furthermore, SW ING follows engineering principles ... because it promotes the reuse of existing systems (AL-Q U I N c IndoEuropeanLanguage(’Persian’). and Prot´ g´ -2000) and the adherence to standards (either e e ... normative - see OWL for the Semantic Web - or de facto - see c UralAltaicLanguage(’Kazak’). DATALOG for ILP). Finally the resulting artifact overcomes the ... capabilities of the two systems when considered stand-alone. c MonotheisticReligion(’ShiaMuslim’). In particular, AL-Q U I N was originally conceived to deal with c MonotheisticReligion(’SunniMuslim’). ALC ontologies. Since OWL is equivalent to SHIQ [26] and ... ALC is a fragment of SHIQ [21], the middleware SW ING c PolytheisticReligion(’Druze’). allows AL-Q U I N to deal with more expressive ontologies and ... to face Semantic Web applications. For the future we plan to extend SW ING with facilities for that are stored in the file cia exp1.abox 2. extracting information from semantic portals and for present- The file cia exp1.tbox contains a DATALOGOI rewriting ing patterns generated by AL-Q U I N. of the taxonomic relations of T such as: hierarchy(c Language,1,null,[c Language]). R EFERENCES hierarchy(c Religion,1,null,[c Religion]). [1] U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, Eds., 1 Advances in Knowledge Discovery and Data Mining. AAAI Press/The for the layer T and MIT Press, 1996. [2] F. Lisi and F. Esposito, “ILP Meets Knowledge Engineering: A Case hierarchy(c Language,2,c Language, Study.” in Inductive Logic Programming, ser. Lecture Notes in Artificial [c AfroAsiaticLanguage, c IndoEuropeanLanguage, . . .]). Intelligence, S. Kramer and B. Pfahringer, Eds. Springer, 2005, vol. hierarchy(c Religion,2,c Religion, 3625, pp. 209–226. [3] H. Mannila and H. Toivonen, “Levelwise search and borders of theories [c MonotheisticReligion, c PolytheisticReligion]). in knowledge discovery,” Data Mining and Knowledge Discovery, vol. 1, 2 no. 3, pp. 241–258, 1997. for the layer T and [4] F. Lisi and F. Esposito, “An ILP Perspective on the Semantic Web,” in Semantic Web Applications and Perspectives 2005, P. Bouquet and hierarchy(c Language,3,c AfroAsiaticLanguage, G. Tumarello, Eds. CEUR Workshop Proceedings, 2005, http://ceur- [c AfroAsiaticLanguage]). ws.org/Vol-166/.
  • 7. [5] F. Donini, M. Lenzerini, D. Nardi, and A. Schaerf, “AL-log: Integrating Datalog and Description Logics,” Journal of Intelligent Information Systems, vol. 10, no. 3, pp. 227–252, 1998. [6] S. Nienhuys-Cheng and R. de Wolf, Foundations of Inductive Logic Programming, ser. Lecture Notes in Artificial Intelligence. Springer, 1997, vol. 1228. [7] G. Stumme, A. Hotho, and B. Berendt, “Semantic Web Mining: State of the art and future directions,” Journal of Web Semantics, vol. 4, no. 2, pp. 124–143, 2006. [8] T. Berners-Lee, J. Hendler, and O. Lassila, “The Semantic Web,” Scientific American, vol. May, 2001. [9] R. Kosala and H. Blockeel, “Web Mining Research: A Survey,” SIGKDD: SIGKDD Explorations: Newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining, ACM, vol. 2, 2000. [Online]. Available: citeseer.ist.psu.edu/kosala00web.html [10] A. Maedche and S. Staab, “Discovering Conceptual Relations from Text,” in Proceedings of the 14th European Conference on Artificial Intelligence, W. Horn, Ed. IOS Press, 2000, pp. 321–325. [11] A. G´ mez-P´ rez, M. Fern´ ndez-L´ pez, and O. Corcho, Ontological o e a o Engineering. Springer, 2004. [12] N. F. Noy, R. Fergerson, and M. Musen, “The Knowledge Model of Prot´ g´ -2000: Combining Interoperability and Flexibility.” in Knowledge e e Acquisition, Modeling and Management, ser. Lecture Notes in Computer Science, R. Dieng and O. Corby, Eds. Springer, 2000, vol. 1937, pp. 17–32. [13] F. Lisi and D. Malerba, “Inducing Multi-Level Association Rules from Multiple Relations,” Machine Learning, vol. 55, pp. 175–210, 2004. [14] J. Han and Y. Fu, “Mining multiple-level association rules in large databases,” IEEE Transactions on Knowledge and Data Engineering, vol. 11, no. 5, 1999. [15] M. Schmidt-Schauss and G. Smolka, “Attributive concept descriptions with complements,” Artificial Intelligence, vol. 48, no. 1, pp. 1–26, 1991. [16] S. Ceri, G. Gottlob, and L. Tanca, Logic Programming and Databases. Springer, 1990. [17] G. Semeraro, F. Esposito, D. Malerba, N. Fanizzi, and S. Ferilli, “A logic framework for the incremental inductive synthesis of Datalog the- ories,” in Proceedings of 7th International Workshop on Logic Program Synthesis and Transformation, ser. Lecture Notes in Computer Science, N. Fuchs, Ed. Springer, 1998, vol. 1463, pp. 300–321. [18] J. Gennari, M. Musen, R. Fergerson, W. Grosso, M. Crub´ zy, H. Eriks- e son, N. F. Noy, and S. W. Tu, “The evolution of Prot´ g´ : An environment e e for knowledge-based systems development.” International Journal of Human-Computer Studies, vol. 58, no. 1, pp. 89–123, 2003. [19] H. Knublauch, M. Musen, and A. Rector, “Editing Description Logic Ontologies with the Prot´ g´ OWL Plugin.” in Proceedings of the 2004 e e International Workshop on Description Logics (DL2004), ser. CEUR Workshop Proceedings, V. Haarslev and R. M¨ ller, Eds., vol. 104, 2004. o [20] V. Haarslev and R. M¨ ller, “Description of the RACER System and o its Applications.” in Working Notes of the 2001 International Descrip- tion Logics Workshop (DL-2001), ser. CEUR Workshop Proceedings, C. Goble, D. McGuinness, R. M¨ ller, and P. Patel-Schneider, Eds., o vol. 49, 2001. [21] F. Baader, D. Calvanese, D. McGuinness, D. Nardi, and P. Patel- Schneider, Eds., The Description Logic Handbook: Theory, Implemen- tation and Applications. Cambridge University Press, 2003. [22] J. Kietz, “Learnability of description logic programs,” in Inductive Logic Programming, ser. Lecture Notes in Artificial Intelligence, S. Matwin and C. Sammut, Eds., vol. 2583. Springer, 2003, pp. 117–132. [23] A. Levy and M.-C. Rousset, “Combining Horn rules and description logics in CARIN,” Artificial Intelligence, vol. 104, pp. 165–209, 1998. [24] U. Hustadt, B. Motik, and U. Sattler, “Reducing SHIQ-description logic to disjunctive datalog programs.” in Principles of Knowledge Representation and Reasoning: Proceedings of the Ninth International Conference (KR2004), D. Dubois, C. Welty, and M.-A. Williams, Eds. AAAI Press, 2004, pp. 152–162. [25] U. Assmann, J. Henriksson, and J.Maluszynski, “Combining safe rules and ontologies by interfacing of reasoners.” in Principles and Practice of Semantic Web Reasoning, ser. Lecture Notes in Computer Science, J. Alferes, J. Bailey, W. May, and U. Schwertel, Eds. Springer, 2006, vol. 4187, pp. 33–47. [26] I. Horrocks, P. Patel-Schneider, and F. van Harmelen, “From SHIQ and RDF to OWL: The making of a web ontology language,” Journal of Web Semantics, vol. 1, no. 1, pp. 7–26, 2003.