Inter-University Research Institute Corporation, Research Organization of Information and Systems
   National Institute of Informatics




Mid-Ontology Learning from Linked Data

Lihua Zhao and Ryutaro Ichise
JIST2011, 12.05.2011, Hangzhou
Introduction        Mid-Ontology Learning Approach         Experimental Evaluation    Related Work     Conclusion and Future Work


  Outline


   Introduction

   Mid-Ontology Learning Approach

   Experimental Evaluation

   Related Work

   Conclusion and Future Work





  Introduction
   Linked Open Data
      295 data sets, 31 billion RDF triples (as of Sep. 2011)
      7 domains (cross-domain, geographic, media, life sciences,
      government, user-generated content, and publications)
      Interlinked Instances (owl:sameAs)






  Introduction


   Challenging Problem
      Each data set has its own ontology schema
                  DBpedia: http://dbpedia.org/property/population
                  Geonames: http://www.geonames.org/ontology#population
           Learning every ontology schema is time-consuming
                  DBpedia: 320 classes and thousands of properties.
           Ontology schemas are heterogeneous
                  http://dbpedia.org/property/populationTotal
                  http://dbpedia.org/property/population






  Introduction

   Objective

   Collected data based on “http://dbpedia.org/resource/Berlin”.
    Predicate                                                Object
    http://dbpedia.org/property/name                         Berlin
    http://dbpedia.org/property/population                   3439100
    http://dbpedia.org/property/plz                          10001-14199
    http://dbpedia.org/ontology/postalCode                   10001-14199
    http://dbpedia.org/ontology/populationTotal              3439100
    ......                                                   ......
    http://www.geonames.org/ontology#alternateName           Berlin
    http://www.geonames.org/ontology#alternateName           Berlyn@af
    http://www.geonames.org/ontology#population              3426354
    ......                                                   ......
    http://www.w3.org/2004/02/skos/core#prefLabel            Berlin (Germany)
    http://data.nytimes.com/elements/first_use               2004-09-12
    http://data.nytimes.com/elements/latest_use              2010-06-13




  Introduction



   Simple ontology for various data sets: Mid-Ontology
      Investigation on linked instances
                  owl:sameAs links identical or related instances
                  Scale down the data set
           Automatic ontology learning
                  Integrate ontologies from diverse domain data sets
                  Automate the ontology construction process
                  Adapt to linked open data sets






  Mid-Ontology Learning Approach






  Data Collection

   We scale down the data sets by collecting only linked instances,
   from which we can extract related information.
           Extract data linked with owl:sameAs
                  Select a core data set (inward & outward links)
                  Collect all instances that have owl:sameAs
           Remove noisy instances of the core data set
                  Noisy instances: instances without any meaningful triple
           Collect predicates and objects
                  Collect <predicate, object> (PO) pairs from the collected instances
                  Collect PO pairs from the linked instances (other data sets)
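
   The collection steps above can be sketched roughly as follows. This is a
   minimal illustration over an in-memory list of triples, not the authors'
   implementation. The function name collect_po_pairs, the shortened prefixes,
   the "dbpedia:Empty" instance, and the reading of "meaningful triple" as
   "any triple other than owl:sameAs" are all assumptions.

```python
OWL_SAMEAS = "owl:sameAs"

def collect_po_pairs(triples, core_prefix):
    """Return {core instance: [(predicate, object), ...]} for linked instances."""
    # Step 1: find core instances with owl:sameAs links (outward or inward).
    links = {}
    for s, p, o in triples:
        if p != OWL_SAMEAS:
            continue
        if s.startswith(core_prefix):
            links.setdefault(s, set()).add(o)
        elif o.startswith(core_prefix):
            links.setdefault(o, set()).add(s)

    # Index all non-sameAs triples by subject.
    by_subject = {}
    for s, p, o in triples:
        if p != OWL_SAMEAS:
            by_subject.setdefault(s, []).append((p, o))

    collected = {}
    for core, linked in links.items():
        own = by_subject.get(core, [])
        if not own:
            # Step 2 (assumed definition): a core instance with nothing but
            # owl:sameAs links is treated as noisy and dropped.
            continue
        # Step 3: PO pairs from the core instance and from its linked instances.
        pairs = list(own)
        for other in sorted(linked):
            pairs.extend(by_subject.get(other, []))
        collected[core] = pairs
    return collected

# Toy data; "dbpedia:Empty" and "geo:0" are hypothetical.
triples = [
    ("dbpedia:Berlin", "owl:sameAs", "geo:2950159"),
    ("nyt:N50987186835223032381", "owl:sameAs", "dbpedia:Berlin"),
    ("dbpedia:Berlin", "db-prop:population", "3439100"),
    ("geo:2950159", "geo-onto:population", "3426354"),
    ("nyt:N50987186835223032381", "skos:prefLabel", "Berlin (Germany)"),
    ("dbpedia:Empty", "owl:sameAs", "geo:0"),
]
collected = collect_po_pairs(triples, "dbpedia:")
```

   In a real setting the triples would come from a SPARQL endpoint rather than
   a Python list.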






  An Example of Collected Data
           dbpedia:Berlin owl:sameAs http://sws.geonames.org/2950159/
           http://data.nytimes.com/N50987186835223032381 owl:sameAs dbpedia:Berlin


   Collected data based on “http://dbpedia.org/resource/Berlin”.
    Predicate                                                Object
    http://dbpedia.org/property/name                         Berlin
    http://dbpedia.org/property/population                   3439100
    http://dbpedia.org/property/plz                          10001-14199
    http://dbpedia.org/ontology/postalCode                   10001-14199
    http://dbpedia.org/ontology/populationTotal              3439100
    ......                                                   ......
    http://www.geonames.org/ontology#alternateName           Berlin
    http://www.geonames.org/ontology#alternateName           Berlyn@af
    http://www.geonames.org/ontology#population              3426354
    ......                                                   ......
    http://www.w3.org/2004/02/skos/core#prefLabel            Berlin (Germany)
    http://data.nytimes.com/elements/first_use               2004-09-12
    http://data.nytimes.com/elements/latest_use              2010-06-13



  Predicate Grouping


   Group related predicates from different ontology schemas,
   because many similar or related predicates actually refer to the
   same thing.
           Group predicates by exact matching
                  One predicate may have various objects
                  Different predicates may have the same object value
           Prune groups by similarity matching
           Refine groups using extracted relations






  Group Predicates by Exact Matching
   Create initial groups (Gi) of PO pairs
   e.g. Gi.predicates = { db-prop:name, geo-onto:alternateName }
        Gi.objects    = { Berlin, Berlyn@af }
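
   One plausible reading of the example above is that PO pairs end up in the
   same initial group whenever they share an identical predicate or an
   identical object, applied transitively. The sketch below implements that
   reading with a small union-find; the function name and the transitive
   merging are assumptions, not a confirmed description of the authors' code.

```python
def group_by_exact_matching(po_pairs):
    """Group PO pairs that share an identical predicate or object (transitively)."""
    parent = list(range(len(po_pairs)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # ("p", predicate) or ("o", object) -> index of first pair
    for idx, (pred, obj) in enumerate(po_pairs):
        for key in (("p", pred), ("o", obj)):
            if key in first_seen:
                parent[find(idx)] = find(first_seen[key])
            else:
                first_seen[key] = idx

    groups = {}
    for idx, (pred, obj) in enumerate(po_pairs):
        g = groups.setdefault(find(idx), {"predicates": set(), "objects": set()})
        g["predicates"].add(pred)
        g["objects"].add(obj)
    return list(groups.values())

po_pairs = [
    ("db-prop:name", "Berlin"),
    ("geo-onto:alternateName", "Berlin"),
    ("geo-onto:alternateName", "Berlyn@af"),
    ("db-prop:population", "3439100"),
    ("db-onto:populationTotal", "3439100"),
    ("geo-onto:population", "3426354"),
]
initial_groups = group_by_exact_matching(po_pairs)
```

   With these six pairs the name predicates form one group, the two DBpedia
   population predicates form another, and geo-onto:population stays alone
   because its value differs.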



  Predicate Grouping


   Group related predicates from different ontology schemas,
   because many similar or related predicates actually refer to the
   same thing.
           Group predicates by exact matching
           Prune groups by similarity matching
                  Exact matching may miss:
                  Terms of predicates or objects written in different languages
                  Semantically identical or related predicates
           Refine groups using extracted relations





  Prune Groups by Similarity Matching


   Ontology similarity matching at the concept level
           String-based similarity measure: StrSim(O(Gi), O(Gj))
                  O(Gi): objects in Gi
                  Prefix, Suffix, Levenshtein distance, and n-gram.
           Knowledge-based similarity measure: WNSim(T(Gi), T(Gj))
                  T(Gi): pre-processed terms of predicates in Gi
                  Natural Language Processing: tokenizing terms, removing stop words,
                  and stemming.
                  WordNet-based similarity measures: LCH, RES, HSO, JCN, LESK,
                  PATH, WUP, LIN, and VECTOR
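
   To make the string-based measures more concrete, here is a small sketch of
   three of them. The slides do not say how each measure is normalized, so
   dividing the common prefix or suffix length by the longer string's length
   is an assumption, and the Levenshtein function returns the raw edit
   distance rather than a similarity score.

```python
def prefix_similarity(a, b):
    """Length of the common prefix divided by the longer length (assumed normalization)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return common / longest

def suffix_similarity(a, b):
    """Same as prefix_similarity, computed on the reversed strings."""
    return prefix_similarity(a[::-1], b[::-1])

def levenshtein_distance(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

p = prefix_similarity("3439100", "3426354")
s = suffix_similarity("3439100", "3426354")
```

   Under this normalization the two Berlin population values share the prefix
   "34" (2 of 7 characters) and no suffix.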






  Prune Groups by Similarity Matching


   Similarity between initial groups {G1, G2, ..., Gk}

   $$ Sim(G_i, G_j) = \frac{StrSim(O(G_i), O(G_j)) + WNSim(T(G_i), T(G_j))}{2} $$

   Prune initial groups Gi
           If Sim(Gi, Gj) is higher than the predefined similarity threshold, we
           merge Gi and Gj.
           If an initial group Gi has not been merged and has only one PO
           pair, we remove Gi.
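
   The merge-and-remove rules above can be sketched as follows. Groups are
   represented as lists of PO pairs, and the similarity function is passed in
   as a parameter. The threshold value 0.3, the toy similarity function, and
   the choice to merge transitively are assumptions made only for this
   illustration; the slides do not state the threshold used.

```python
def combined_similarity(str_sim, wn_sim):
    """Sim(Gi, Gj): the mean of the string-based and WordNet-based scores."""
    return (str_sim + wn_sim) / 2

def prune_groups(groups, sim, threshold):
    """Merge groups whose similarity exceeds the threshold; drop lone single-pair groups."""
    n = len(groups)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if sim(groups[i], groups[j]) > threshold:
                parent[find(i)] = find(j)

    components = {}
    for i in range(n):
        components.setdefault(find(i), []).append(i)

    pruned = []
    for members in components.values():
        pairs = [po for i in members for po in groups[i]]
        if len(members) == 1 and len(pairs) == 1:
            continue  # never merged and only one PO pair: remove
        pruned.append(pairs)
    return pruned

def toy_sim(g1, g2):
    # Hypothetical stand-in: pretend only population-like groups are similar.
    both = all(any("population" in p.lower() for p, _ in g) for g in (g1, g2))
    return combined_similarity(0.145, 0.5825) if both else 0.0

groups = [
    [("db-prop:population", "3439100"), ("db-onto:populationTotal", "3439100")],
    [("geo-onto:population", "3426354")],
    [("db-prop:plz", "10001-14199")],
]
pruned = prune_groups(groups, toy_sim, threshold=0.3)
```

   Here the two population groups are merged into one group of three PO
   pairs, and the unmerged single-pair group is removed.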






  An Example of Similarity Calculation

     Group     Predicate                                       Object
     Gi        http://dbpedia.org/property/population          3439100
               http://dbpedia.org/ontology/populationTotal     3439100
     Gj        http://www.geonames.org/ontology#population     3426354


   Example of String-based similarity measures on pairwise objects.
     Pairwise Objects            prefix    suffix    Levenshtein distance    n-gram
     “3439100”, “3426354”        0.29      0         0                       0.29

   Example of WordNet-based similarity measures on pairwise terms.
     Pairwise Terms              LCH   RES   HSO   JCN    LESK   PATH   WUP    LIN   VECTOR
     population, population      1     1     1     1      1      1      1      1     1
     population, total           0.4   0     0     0.06   0.03   0.11   0.33   0     0.06

   $$ Sim(G_i, G_j) = \frac{0.145 + 0.5825}{2} = 0.36375 $$



  Predicate Grouping


   Group related predicates from different ontology schemas,
   because many similar or related predicates actually refer to the
   same thing.
           Group predicates by exact matching
           Prune groups by similarity matching
           Refine groups using extracted relations
                  Divide pruned groups according to rdfs:domain and rdfs:range.
                  Keep groups with high frequency
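
   The "divide" step can be pictured as partitioning the predicates of a
   pruned group by their (rdfs:domain, rdfs:range) pair. The sketch below
   shows only that partitioning. The schema entries are hypothetical values
   chosen for illustration, and the frequency filter is left out because the
   slides do not define how frequency is counted.

```python
from collections import defaultdict

def divide_by_domain_range(predicates, schema):
    """Split predicates into subgroups keyed by (rdfs:domain, rdfs:range).

    schema maps predicate -> (domain, range); predicates without an entry
    fall into the (None, None) subgroup.
    """
    subgroups = defaultdict(set)
    for pred in predicates:
        subgroups[schema.get(pred, (None, None))].add(pred)
    return dict(subgroups)

# Hypothetical domain/range information for three predicates.
schema = {
    "db-onto:populationTotal": ("db-onto:PopulatedPlace", "xsd:integer"),
    "db-onto:populationDensity": ("db-onto:PopulatedPlace", "xsd:double"),
    "geo-onto:population": ("geo-onto:Feature", "xsd:integer"),
}
subgroups = divide_by_domain_range(
    ["db-onto:populationTotal", "db-onto:populationDensity",
     "geo-onto:population", "db-prop:population"],
    schema,
)
```

   Each resulting subgroup holds predicates that agree on both domain and
   range; db-prop:population has no schema entry here and lands in the
   (None, None) subgroup.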






  Mid-Ontology Construction
   Select terms for Mid-Ontology
           Collect all the terms of predicates in each refined group Gi.
           Collect all the pre-processed terms of P(Gi) (predicates in Gi).
           Choose the one term that has the highest frequency and is the
           longest.
           e.g. “area” and “areaCode” are totally different
   Construct Relations
           Use mo-prop:hasMembers to link Mid-Ontology classes and integrated
           predicates
   Construct Mid-Ontology
           Automatically construct the Mid-Ontology using the selected terms and
           mo-prop:hasMembers.
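
   A rough sketch of the term selection step is given below. It splits each
   predicate's local name on camel case, counts term frequencies over the
   group, and picks the most frequent term, preferring longer terms on ties.
   Stop-word removal and stemming are omitted, and treating "longest" as a
   tie-breaker is an assumption about how the two criteria combine.

```python
import re
from collections import Counter

def split_terms(predicate_uri):
    """Lower-cased camel-case tokens of the predicate's local name."""
    local_name = re.split(r"[/#:]", predicate_uri)[-1]
    return [t.lower() for t in re.findall(r"[A-Za-z][a-z]*", local_name)]

def select_term(predicates):
    """Pick the most frequent term; ties go to the longer term (assumed rule)."""
    counts = Counter(t for p in predicates for t in split_terms(p))
    return max(counts, key=lambda t: (counts[t], len(t), t))

population_group = [
    "http://dbpedia.org/property/population",
    "http://dbpedia.org/property/popLatest",
    "http://dbpedia.org/property/populationTotal",
    "http://dbpedia.org/ontology/populationTotal",
    "http://dbpedia.org/property/einwohner",
    "http://www.geonames.org/ontology#population",
]
chosen = select_term(population_group)
```

   For the six population predicates used later in the deck, "population"
   occurs four times and is selected.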



  Experimental Evaluation



   Evaluate the Mid-Ontology approach from four different aspects:
           Evaluation of Data Reduction
           Evaluation of Ontology Quality
           Evaluation with A SPARQL Example
           Analysis of Mid-Ontology Approach






  Implementation

   Environment
           Linux Ubuntu 10.10, 16GB Memory, 1 TB Disk
           Core i7 CPU 880 3.07GHz
           Java, Netbeans 6.9
   Virtuoso
           High-performance server for RDF storage
           SPARQL query endpoint
   WordNet::Similarity
           Implemented in Perl
           Knowledge-based similarity measures




  Experimental Data
           DBpedia: cross-domain, 3.5 million things, 8.9 million URIs
           Geonames: geographical domain, 7 million URIs
           NYTimes: media domain, 10,467 subject news




   Choose DBpedia as the core data set, because of its wealth of inward
   and outward links to other data sets.


  Evaluation of Data Reduction
   Evaluate the effectiveness of data reduction during the data
   collection phase by comparing the number of instances.
   Number of distinct instances during data collection phase.
    Data set     Before reduction       owl:sameAs retrieval                                Noisy data removal
    DBpedia              8,955,728          135,749 (1.52%)                                    88,506 (0.99%)
    Geonames             7,479,714          128,961 (1.72%)                                    82,054 (1.10%)
    NYTimes                 10,467           9,226 (88.14%)                                    8,535 (81.54%)

   Evaluation Analysis
      The data sets are dramatically scaled down by keeping only
      linked instances that share related information.
      Successfully removed noisy instances, which may affect the
      quality of the Mid-Ontology.
      e.g. Removed instances with only db-prop:hasPhotosCollection
      (broken link) and owl:sameAs link.


  Evaluation of Ontology Quality
   Evaluate the quality of Mid-Ontology by validating whether
   predicates in each class share related information.

   Accuracy of Mid-Ontology

   $$ ACC(MO) = \frac{\sum_{i=1}^{n} \frac{|\text{Correct Predicates in } C_i|}{|C_i|}}{n} $$

   n: the number of classes
   |Ci|: the number of predicates in class Ci.

   Cardinality

   $$ Cardinality = \frac{|\text{Number of Predicates}|}{|\text{Number of Classes}|} $$


  Evaluation of Ontology Quality
   Improvement achieved by our approach
           MO_no_p_r: with exact matching only (without the pruning and
           refining processes)
           MO: with both pruning and refining processes

        MO                         Number of Classes          Number of Predicates           Cardinality      Accuracy
        MO_no_p_r                  11                         300                            27.27            68.78%
        MO                         29                         180                            6.21             90.10%

   Evaluation Analysis
      Significantly improved the accuracy
      Decreased the cardinality (fewer predicates and more
      classes)
      Successfully removed unrelated predicates
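
   The accuracy and cardinality measures can be written as two short
   functions, and the cardinality column of the table can be rechecked from
   the reported class and predicate counts. The function names and the
   per-class counts passed to accuracy are made up for illustration; the
   slides report only the aggregate accuracy.

```python
def accuracy(classes):
    """Mean over classes of (correct predicates in Ci) / (predicates in Ci).

    classes: list of (correct_count, total_count) tuples, one per class.
    """
    return sum(correct / total for correct, total in classes) / len(classes)

def cardinality(num_predicates, num_classes):
    """Average number of predicates per class."""
    return num_predicates / num_classes

# Counts taken from the table: exact matching only vs. full approach.
card_no_p_r = round(cardinality(300, 11), 2)
card_mo = round(cardinality(180, 29), 2)

# Hypothetical two-class example: 3 of 4 correct, then 2 of 2 correct.
example_acc = accuracy([(3, 4), (2, 2)])
```

   The recomputed cardinalities come out as 27.27 and 6.21, matching the
   table.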


  Evaluation with A SPARQL Example


   Evaluate the effectiveness of information retrieval with the
   Mid-Ontology constructed with our approach.


   Predicates grouped in mo-onto:population.
    <rdf:Description rdf:about="mid-onto:population">
    <mo-prop:hasMembers rdf:resource="http://dbpedia.org/property/population"/>
    <mo-prop:hasMembers rdf:resource="http://dbpedia.org/property/popLatest"/>
    <mo-prop:hasMembers rdf:resource="http://dbpedia.org/property/populationTotal"/>
    <mo-prop:hasMembers rdf:resource="http://dbpedia.org/ontology/populationTotal"/>
    <mo-prop:hasMembers rdf:resource="http://dbpedia.org/property/einwohner"/>
    <mo-prop:hasMembers rdf:resource="http://www.geonames.org/ontology#population"/>
    </rdf:Description>






  Evaluation with A SPARQL Example
   SPARQL: Find places with a population of more than 10 million.
    SELECT DISTINCT ?places
    WHERE{ mid-onto:population mo-prop:hasMembers ?prop.
              ?places ?prop ?population.
              FILTER (xsd:integer(?population) > 10000000). }

     Single property for population                                             Number of Results
     http://dbpedia.org/property/population                                          177
     http://dbpedia.org/property/popLatest                                             1
     http://dbpedia.org/property/populationTotal                                     107
     http://dbpedia.org/ontology/populationTotal                                     129
     http://dbpedia.org/property/einwohner                                             1
     http://www.geonames.org/ontology#population                                     244


   Evaluation Analysis
      The query with mid-onto:population finds 517 places.
      Each single predicate returns fewer results under the same
      condition.
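
   The effect of querying through the Mid-Ontology class can be mimicked in a
   few lines: the result is the set of distinct subjects matched by any member
   predicate. The six single-predicate counts in the table sum to 659, more
   than the 517 distinct places, which would be consistent with some places
   being reachable through more than one predicate. The toy triples and the
   "ex:" names below are hypothetical.

```python
def places_over_threshold(triples, member_predicates, threshold):
    """Distinct subjects whose value for any member predicate exceeds threshold."""
    members = set(member_predicates)
    return {s for s, p, o in triples if p in members and int(o) > threshold}

# Hypothetical data: CityA is described by both predicates.
toy_triples = [
    ("ex:CityA", "db-prop:population", "12000000"),
    ("ex:CityA", "geo-onto:population", "11900000"),
    ("ex:CityB", "geo-onto:population", "15000000"),
    ("ex:CityC", "db-prop:population", "500000"),
    ("ex:CityD", "db-prop:population", "20000000"),
]
members = ["db-prop:population", "geo-onto:population"]

combined = places_over_threshold(toy_triples, members, 10_000_000)
only_db = places_over_threshold(toy_triples, ["db-prop:population"], 10_000_000)
only_geo = places_over_threshold(toy_triples, ["geo-onto:population"], 10_000_000)
```

   In this toy case each single predicate finds two places while the combined
   query finds three, because one place is matched by both predicates.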


  Analysis of Mid-Ontology Approach
   Analyze whether we can successfully identify how data sets are
   connected.
   Sample classes in the Mid-Ontology
     DBpedia                               DBpedia & Geonames                DBpedia & Geonames & NYTimes
     mo-onto:birthdate                     mo-onto:population                mo-onto:name
     mo-onto:deathdate                     mo-onto:prominence                mo-onto:long
     mo-onto:motto                         mo-onto:postal


   Evaluation Analysis
      Predicates in DBpedia are heterogeneous.
      Linked instances between DBpedia and Geonames are about
      places.
      Linked instances among DBpedia, Geonames, and NYTimes
      are about events, persons, or places.


  Possible Application



   Find missing owl:sameAs links
      e.g. Find missing owl:sameAs link with mo-onto:population
           http://dbpedia.org/resource/Cyclades                     db-prop:population     “119549”
           http://dbpedia.org/resource/Cyclades                     db-prop:name           “Cyclades”
           http://sws.geonames.org/259819/                          geo-onto:population    “119549”
           http://sws.geonames.org/259819/                          geo-onto:alternateName “Cyclades”
           Add owl:sameAs link
           http://dbpedia.org/resource/Cyclades owl:sameAs http://sws.geonames.org/259819/
           http://sws.geonames.org/259819/ owl:sameAs http://dbpedia.org/resource/Cyclades
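
   A minimal sketch of this application: two instances become owl:sameAs
   candidates when they share a value within enough Mid-Ontology classes.
   Requiring agreement in at least two classes (min_shared=2), the class
   membership shown, and the "ex:Other" instance are assumptions for the
   example rather than details given in the slides.

```python
def class_values(po_pairs, members):
    """Object values an instance has for any predicate in a Mid-Ontology class."""
    return {o for p, o in po_pairs if p in members}

def candidate_sameas(instances_a, instances_b, mid_classes, min_shared=2):
    """Pairs of instances that share a value in at least min_shared classes."""
    candidates = []
    for uri_a, po_a in instances_a.items():
        for uri_b, po_b in instances_b.items():
            shared = sum(
                1 for members in mid_classes.values()
                if class_values(po_a, members) & class_values(po_b, members)
            )
            if shared >= min_shared:
                candidates.append((uri_a, uri_b))
    return candidates

mid_classes = {
    "mo-onto:population": {"db-prop:population", "geo-onto:population"},
    "mo-onto:name": {"db-prop:name", "geo-onto:alternateName"},
}
dbpedia = {
    "http://dbpedia.org/resource/Cyclades": [
        ("db-prop:population", "119549"), ("db-prop:name", "Cyclades")],
    "ex:Other": [("db-prop:population", "42"), ("db-prop:name", "Other")],
}
geonames = {
    "http://sws.geonames.org/259819/": [
        ("geo-onto:population", "119549"), ("geo-onto:alternateName", "Cyclades")],
}
candidates = candidate_sameas(dbpedia, geonames, mid_classes)
```

   Only the Cyclades pair agrees on both population and name, so it is the
   single candidate returned.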






  Related Work


           Construct an intermediate-layer ontology from geospatial, zoology,
           and genetics data resources. [Parundekar et al., 2010]
                  Limited to a specific domain
           Construct an intermediate-level ontology by enriching an upper
           ontology (by adding new classes and properties). [Damova et al., 2010]
                  Still too large
           Analysis of basic properties of the SameAs network, the
           Pay-Level-Domain network, and the Class-Level Similarity network.
           [Ding et al., 2010]
                  Only frequent types are considered when analyzing how data are connected





  Conclusion and Future Work
   Conclusion
           Learning all the heterogeneous ontology schemas in the linked
           open data sets is not feasible.
           An automatic Mid-Ontology learning approach can solve the
           heterogeneity problem by integrating related predicates.
           The Mid-Ontology has high accuracy and is effective for searching
           across various data sets.
           A simple Mid-Ontology can be constructed without learning
           the entire ontology schema.
   Future Work
      Billion Triple Challenge (BTC) data set
      Crawl links at two or three depths without a core data set
Questions?
                                     Lihua Zhao, lihua@nii.ac.jp
                                    Ryutaro Ichise, ichise@nii.ac.jp





 
Ontologies neo4j-graph-workshop-berlin
Ontologies neo4j-graph-workshop-berlinOntologies neo4j-graph-workshop-berlin
Ontologies neo4j-graph-workshop-berlin
 
Facilitating semantic alignment.-biohackathon-jupp
Facilitating semantic alignment.-biohackathon-juppFacilitating semantic alignment.-biohackathon-jupp
Facilitating semantic alignment.-biohackathon-jupp
 
Using Substitutive Itemset Mining Framework for Finding Synonymous Properties...
Using Substitutive Itemset Mining Framework for Finding Synonymous Properties...Using Substitutive Itemset Mining Framework for Finding Synonymous Properties...
Using Substitutive Itemset Mining Framework for Finding Synonymous Properties...
 
Ontology based clustering algorithms
Ontology based clustering algorithmsOntology based clustering algorithms
Ontology based clustering algorithms
 
RuleML2015: FOWLA, a federated architecture for ontologies
RuleML2015: FOWLA, a federated architecture for ontologiesRuleML2015: FOWLA, a federated architecture for ontologies
RuleML2015: FOWLA, a federated architecture for ontologies
 
Semantics as a service at EMBL-EBI
Semantics as a service at EMBL-EBISemantics as a service at EMBL-EBI
Semantics as a service at EMBL-EBI
 
DB-IR-ranking
DB-IR-rankingDB-IR-ranking
DB-IR-ranking
 
20130622 okfn hackathon t2
20130622 okfn hackathon t220130622 okfn hackathon t2
20130622 okfn hackathon t2
 
OntoMath digital ecosystem
OntoMath digital ecosystemOntoMath digital ecosystem
OntoMath digital ecosystem
 
Content + Signals: The value of the entire data estate for machine learning
Content + Signals: The value of the entire data estate for machine learningContent + Signals: The value of the entire data estate for machine learning
Content + Signals: The value of the entire data estate for machine learning
 
Building a repository of biomedical ontologies with Neo4j
Building a repository of biomedical ontologies with Neo4jBuilding a repository of biomedical ontologies with Neo4j
Building a repository of biomedical ontologies with Neo4j
 
Importing life science at a into Neo4j
Importing life science at a into Neo4jImporting life science at a into Neo4j
Importing life science at a into Neo4j
 
DB and IR Integration
DB and IR IntegrationDB and IR Integration
DB and IR Integration
 
Linked Open (Geo)Data and the Distributed Ontology Language – a perfect match
Linked Open (Geo)Data and the Distributed Ontology Language – a perfect matchLinked Open (Geo)Data and the Distributed Ontology Language – a perfect match
Linked Open (Geo)Data and the Distributed Ontology Language – a perfect match
 
A Mathematical Approach to Ontology Authoring and Documentation
A Mathematical Approach to Ontology Authoring and DocumentationA Mathematical Approach to Ontology Authoring and Documentation
A Mathematical Approach to Ontology Authoring and Documentation
 
Konstantin Vorontsov - BigARTM: Open Source Library for Regularized Multimoda...
Konstantin Vorontsov - BigARTM: Open Source Library for Regularized Multimoda...Konstantin Vorontsov - BigARTM: Open Source Library for Regularized Multimoda...
Konstantin Vorontsov - BigARTM: Open Source Library for Regularized Multimoda...
 
Semantic Meta-Mining of Knowledge Discovery Processes
Semantic Meta-Mining of Knowledge Discovery ProcessesSemantic Meta-Mining of Knowledge Discovery Processes
Semantic Meta-Mining of Knowledge Discovery Processes
 

Similar a Mid-Ontology Learning from Linked Data @JIST2011

Bioinformatics databases: Current Trends and Future Perspectives
Bioinformatics databases: Current Trends and Future PerspectivesBioinformatics databases: Current Trends and Future Perspectives
Bioinformatics databases: Current Trends and Future PerspectivesUniversity of Malaya
 
euclid_linkedup WWW tutorial (Besnik Fetahu)
euclid_linkedup WWW tutorial (Besnik Fetahu)euclid_linkedup WWW tutorial (Besnik Fetahu)
euclid_linkedup WWW tutorial (Besnik Fetahu)Besnik Fetahu
 
Open Web Data for Education - Linked Data technologies for connecting open ed...
Open Web Data for Education - Linked Data technologies for connecting open ed...Open Web Data for Education - Linked Data technologies for connecting open ed...
Open Web Data for Education - Linked Data technologies for connecting open ed...Mathieu d'Aquin
 
Thoughts on Knowledge Graphs & Deeper Provenance
Thoughts on Knowledge Graphs  & Deeper ProvenanceThoughts on Knowledge Graphs  & Deeper Provenance
Thoughts on Knowledge Graphs & Deeper ProvenancePaul Groth
 
Profile-based Dataset Recommendation for RDF Data Linking
Profile-based Dataset Recommendation for RDF Data Linking  Profile-based Dataset Recommendation for RDF Data Linking
Profile-based Dataset Recommendation for RDF Data Linking Mohamed BEN ELLEFI
 
Keynote speech - Carole Goble - Jisc Digital Festival 2015
Keynote speech - Carole Goble - Jisc Digital Festival 2015Keynote speech - Carole Goble - Jisc Digital Festival 2015
Keynote speech - Carole Goble - Jisc Digital Festival 2015Jisc
 
RARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research ObjectsRARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research ObjectsCarole Goble
 
The FAIRDOM Commons for Systems Biology
The FAIRDOM Commons for Systems BiologyThe FAIRDOM Commons for Systems Biology
The FAIRDOM Commons for Systems BiologyFAIRDOM
 
Data Provenance and Scientific Workflow Management
Data Provenance and Scientific Workflow ManagementData Provenance and Scientific Workflow Management
Data Provenance and Scientific Workflow ManagementNeuroMat
 
Integration of research literature and data (InFoLiS)
Integration of research literature and data (InFoLiS)Integration of research literature and data (InFoLiS)
Integration of research literature and data (InFoLiS)Philipp Zumstein
 
Research Objects: more than the sum of the parts
Research Objects: more than the sum of the partsResearch Objects: more than the sum of the parts
Research Objects: more than the sum of the partsCarole Goble
 
Presentation of LUCERO at EURECOM
Presentation of LUCERO at EURECOMPresentation of LUCERO at EURECOM
Presentation of LUCERO at EURECOMMathieu d'Aquin
 
Using Linked Data in Learning Analytics tutorial - Introduction and basics of...
Using Linked Data in Learning Analytics tutorial - Introduction and basics of...Using Linked Data in Learning Analytics tutorial - Introduction and basics of...
Using Linked Data in Learning Analytics tutorial - Introduction and basics of...Mathieu d'Aquin
 
Bibliography (Microsoft Word, 61k)
Bibliography (Microsoft Word, 61k)Bibliography (Microsoft Word, 61k)
Bibliography (Microsoft Word, 61k)butest
 
Results Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object FrameworksResults Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object FrameworksCarole Goble
 
NAISTビッグデータシンポジウム - 情報 松本先生
NAISTビッグデータシンポジウム - 情報 松本先生NAISTビッグデータシンポジウム - 情報 松本先生
NAISTビッグデータシンポジウム - 情報 松本先生ysuzuki-naist
 
The Neuroscience Information Framework: A Scalable Platform for Information E...
The Neuroscience Information Framework: A Scalable Platform for Information E...The Neuroscience Information Framework: A Scalable Platform for Information E...
The Neuroscience Information Framework: A Scalable Platform for Information E...Neuroscience Information Framework
 
Linked Open Data: Combining Data for the Social Sciences and Humanities (and ...
Linked Open Data: Combining Data for the Social Sciences and Humanities (and ...Linked Open Data: Combining Data for the Social Sciences and Humanities (and ...
Linked Open Data: Combining Data for the Social Sciences and Humanities (and ...Richard Zijdeman
 

Similar a Mid-Ontology Learning from Linked Data @JIST2011 (20)

Bioinformatics databases: Current Trends and Future Perspectives
Bioinformatics databases: Current Trends and Future PerspectivesBioinformatics databases: Current Trends and Future Perspectives
Bioinformatics databases: Current Trends and Future Perspectives
 
euclid_linkedup WWW tutorial (Besnik Fetahu)
euclid_linkedup WWW tutorial (Besnik Fetahu)euclid_linkedup WWW tutorial (Besnik Fetahu)
euclid_linkedup WWW tutorial (Besnik Fetahu)
 
Open Web Data for Education - Linked Data technologies for connecting open ed...
Open Web Data for Education - Linked Data technologies for connecting open ed...Open Web Data for Education - Linked Data technologies for connecting open ed...
Open Web Data for Education - Linked Data technologies for connecting open ed...
 
Thoughts on Knowledge Graphs & Deeper Provenance
Thoughts on Knowledge Graphs  & Deeper ProvenanceThoughts on Knowledge Graphs  & Deeper Provenance
Thoughts on Knowledge Graphs & Deeper Provenance
 
Profile-based Dataset Recommendation for RDF Data Linking
Profile-based Dataset Recommendation for RDF Data Linking  Profile-based Dataset Recommendation for RDF Data Linking
Profile-based Dataset Recommendation for RDF Data Linking
 
Keynote speech - Carole Goble - Jisc Digital Festival 2015
Keynote speech - Carole Goble - Jisc Digital Festival 2015Keynote speech - Carole Goble - Jisc Digital Festival 2015
Keynote speech - Carole Goble - Jisc Digital Festival 2015
 
RARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research ObjectsRARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research Objects
 
The FAIRDOM Commons for Systems Biology
The FAIRDOM Commons for Systems BiologyThe FAIRDOM Commons for Systems Biology
The FAIRDOM Commons for Systems Biology
 
CV
CVCV
CV
 
Data Provenance and Scientific Workflow Management
Data Provenance and Scientific Workflow ManagementData Provenance and Scientific Workflow Management
Data Provenance and Scientific Workflow Management
 
UKON 2014
UKON 2014UKON 2014
UKON 2014
 
Integration of research literature and data (InFoLiS)
Integration of research literature and data (InFoLiS)Integration of research literature and data (InFoLiS)
Integration of research literature and data (InFoLiS)
 
Research Objects: more than the sum of the parts
Research Objects: more than the sum of the partsResearch Objects: more than the sum of the parts
Research Objects: more than the sum of the parts
 
Presentation of LUCERO at EURECOM
Presentation of LUCERO at EURECOMPresentation of LUCERO at EURECOM
Presentation of LUCERO at EURECOM
 
Using Linked Data in Learning Analytics tutorial - Introduction and basics of...
Using Linked Data in Learning Analytics tutorial - Introduction and basics of...Using Linked Data in Learning Analytics tutorial - Introduction and basics of...
Using Linked Data in Learning Analytics tutorial - Introduction and basics of...
 
Bibliography (Microsoft Word, 61k)
Bibliography (Microsoft Word, 61k)Bibliography (Microsoft Word, 61k)
Bibliography (Microsoft Word, 61k)
 
Results Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object FrameworksResults Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object Frameworks
 
NAISTビッグデータシンポジウム - 情報 松本先生
NAISTビッグデータシンポジウム - 情報 松本先生NAISTビッグデータシンポジウム - 情報 松本先生
NAISTビッグデータシンポジウム - 情報 松本先生
 
The Neuroscience Information Framework: A Scalable Platform for Information E...
The Neuroscience Information Framework: A Scalable Platform for Information E...The Neuroscience Information Framework: A Scalable Platform for Information E...
The Neuroscience Information Framework: A Scalable Platform for Information E...
 
Linked Open Data: Combining Data for the Social Sciences and Humanities (and ...
Linked Open Data: Combining Data for the Social Sciences and Humanities (and ...Linked Open Data: Combining Data for the Social Sciences and Humanities (and ...
Linked Open Data: Combining Data for the Social Sciences and Humanities (and ...
 

Último

Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKJago de Vreede
 
AI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAnitaRaj43
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)Samir Dash
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 

Último (20)

Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
AI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by Anitaraj
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 

Mid-Ontology Learning from Linked Data @JIST2011

  • 1. 大学共同利用機関法人 情報・システム研究機構 国立情報学研究所 National Institute of Informatics Mid-Ontology Learning from Linked Data Lihua Zhao and Ryutaro Ichise JIST2011, 12.05.2011, Hangzhou
  • 2. Outline
       Introduction
       Mid-Ontology Learning Approach
       Experimental Evaluation
       Related Work
       Conclusion and Future Work
  • 3. Introduction
       Linked Open Data
          295 data sets, 31 billion RDF triples (as of Sep. 2011)
          7 domains (cross-domain, geographic, media, life sciences, government, user-generated content, and publications)
          Interlinked instances (owl:sameAs)
  • 4. Introduction
       Challenging Problem
          Each data set has a specific ontology schema
             DBpedia: http://dbpedia.org/property/population
             Geonames: http://www.geonames.org/ontology#population
          Time-consuming to learn all the ontology schemas
             DBpedia: 320 classes and thousands of properties
          Heterogeneity of ontology schemas
             http://dbpedia.org/property/populationTotal
             http://dbpedia.org/property/population
  • 5. Introduction
       Objective
       Collected data based on “http://dbpedia.org/resource/Berlin”.
          Predicate                                       Object
          http://dbpedia.org/property/name                Berlin
          http://dbpedia.org/property/population          3439100
          http://dbpedia.org/property/plz                 10001-14199
          http://dbpedia.org/ontology/postalCode          10001-14199
          http://dbpedia.org/ontology/populationTotal     3439100
          ......                                          ......
          http://www.geonames.org/ontology#alternateName  Berlin
          http://www.geonames.org/ontology#alternateName  Berlyn@af
          http://www.geonames.org/ontology#population     3426354
          ......                                          ......
          http://www.w3.org/2004/02/skos/core#prefLabel   Berlin (Germany)
          http://data.nytimes.com/elements/first_use      2004-09-12
          http://data.nytimes.com/elements/latest_use     2010-06-13
  • 6. Introduction
       Simple ontology for various data sets: Mid-Ontology
          Investigation on linked instances
             owl:sameAs links identical or related instances
             Scale down the data set
          Automatic ontology learning
             Integrate ontologies from diverse domain data sets
             Automate the ontology construction process
             Adapt to linked open data sets
  • 7. Mid-Ontology Learning Approach
  • 8. Data Collection
       We scale down the data sets by collecting only linked instances, from which we can extract related information.
       Extract data linked with owl:sameAs
          Select a core data set (inward & outward links)
          Collect all instances that have owl:sameAs
       Remove noisy instances of the core data set
          Noisy instances: without any meaningful triple
       Collect predicates and objects
          Collect <predicate, object> (PO) pairs from collected instances
          Collect PO pairs from linked instances (other data sets)
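The collection step on this slide could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the triples are already available in memory as `(subject, predicate, object)` tuples, and the function name `collect_po_pairs` is hypothetical. The noisy-instance filter is left out because the slide does not define what counts as a meaningful triple.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

def collect_po_pairs(triples, core_instance):
    """Collect <predicate, object> pairs for a core instance and for every
    instance connected to it by an inward or outward owl:sameAs link."""
    linked = {core_instance}
    for s, p, o in triples:
        if p == OWL_SAME_AS:
            if s == core_instance:
                linked.add(o)   # outward link: core owl:sameAs other
            elif o == core_instance:
                linked.add(s)   # inward link: other owl:sameAs core
    # Keep every PO pair of the linked instances, except the links themselves.
    return [(p, o) for s, p, o in triples if s in linked and p != OWL_SAME_AS]

# Toy data modelled on the Berlin example from the slides.
DB = "http://dbpedia.org/resource/Berlin"
GEO = "http://sws.geonames.org/2950159/"
NYT = "http://data.nytimes.com/N50987186835223032381"
triples = [
    (DB, OWL_SAME_AS, GEO),
    (NYT, OWL_SAME_AS, DB),
    (DB, "http://dbpedia.org/property/population", "3439100"),
    (GEO, "http://www.geonames.org/ontology#population", "3426354"),
    (NYT, "http://www.w3.org/2004/02/skos/core#prefLabel", "Berlin (Germany)"),
    # Unlinked instance, added only to show that it is skipped.
    ("http://dbpedia.org/resource/Paris", "http://dbpedia.org/property/name", "Paris"),
]
po_pairs = collect_po_pairs(triples, DB)
```

Only instances reachable through a single owl:sameAs hop from the core instance are collected here; whether the original system follows longer chains is not stated on the slide.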
  • 9. An Example of Collected Data
       dbpedia:Berlin owl:sameAs http://sws.geonames.org/2950159/
       http://data.nytimes.com/N50987186835223032381 owl:sameAs dbpedia:Berlin
       Collected data based on “http://dbpedia.org/resource/Berlin”.
          Predicate                                       Object
          http://dbpedia.org/property/name                Berlin
          http://dbpedia.org/property/population          3439100
          http://dbpedia.org/property/plz                 10001-14199
          http://dbpedia.org/ontology/postalCode          10001-14199
          http://dbpedia.org/ontology/populationTotal     3439100
          ......                                          ......
          http://www.geonames.org/ontology#alternateName  Berlin
          http://www.geonames.org/ontology#alternateName  Berlyn@af
          http://www.geonames.org/ontology#population     3426354
          ......                                          ......
          http://www.w3.org/2004/02/skos/core#prefLabel   Berlin (Germany)
          http://data.nytimes.com/elements/first_use      2004-09-12
          http://data.nytimes.com/elements/latest_use     2010-06-13
  • 10. Mid-Ontology Learning Approach
  • 11. Predicate Grouping
        We group related predicates from different ontology schemas, because many similar or related predicates actually refer to the same thing.
           Group predicates by exact matching
           Prune groups by similarity matching
           Refine groups using extracted relations
  • 12. Predicate Grouping
        We group related predicates from different ontology schemas, because many similar or related predicates actually refer to the same thing.
           Group predicates by exact matching
              One predicate may have various objects
              Different predicates may have the same object value
           Prune groups by similarity matching
           Refine groups using extracted relations
  • 13. Group Predicates by Exact Matching
        Create initial groups (G_i) of PO pairs
           e.g. G_i.predicates = { db-prop:name, geo-onto:alternateName }
                G_i.objects = { Berlin, Berlyn@af }
        Collected data based on “http://dbpedia.org/resource/Berlin”.
           Predicate                                       Object
           http://dbpedia.org/property/name                Berlin
           http://dbpedia.org/property/population          3439100
           http://dbpedia.org/property/plz                 10001-14199
           http://dbpedia.org/ontology/postalCode          10001-14199
           http://dbpedia.org/ontology/populationTotal     3439100
           ......                                          ......
           http://www.geonames.org/ontology#alternateName  Berlin
           http://www.geonames.org/ontology#alternateName  Berlyn@af
           http://www.geonames.org/ontology#population     3426354
           ......                                          ......
           http://www.w3.org/2004/02/skos/core#prefLabel   Berlin (Germany)
           http://data.nytimes.com/elements/first_use      2004-09-12
           http://data.nytimes.com/elements/latest_use     2010-06-13
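One way to read the exact-matching step is that two PO pairs fall into the same initial group whenever they share a predicate or an object value, which reproduces the example group on this slide. The sketch below follows that reading using a small union-find. The function name `group_by_exact_matching` and the `db-onto:` prefix are illustrative assumptions, and the slides do not say how the original system implements the grouping.

```python
from collections import defaultdict

def group_by_exact_matching(po_pairs):
    """Build initial groups: PO pairs that share a predicate or an object
    value (directly or through a chain of such pairs) end up together."""
    parent = list(range(len(po_pairs)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # ("p", predicate) or ("o", object) -> index of first pair
    for idx, (pred, obj) in enumerate(po_pairs):
        for key in (("p", pred), ("o", obj)):
            if key in first_seen:
                parent[find(idx)] = find(first_seen[key])
            else:
                first_seen[key] = idx

    groups = defaultdict(lambda: {"predicates": set(), "objects": set()})
    for idx, (pred, obj) in enumerate(po_pairs):
        group = groups[find(idx)]
        group["predicates"].add(pred)
        group["objects"].add(obj)
    return list(groups.values())

# Abbreviated PO pairs from the Berlin table.
berlin_pairs = [
    ("db-prop:name", "Berlin"),
    ("db-prop:population", "3439100"),
    ("db-onto:populationTotal", "3439100"),
    ("geo-onto:alternateName", "Berlin"),
    ("geo-onto:alternateName", "Berlyn@af"),
    ("geo-onto:population", "3426354"),
]
initial_groups = group_by_exact_matching(berlin_pairs)
```

With this input, `db-prop:name` and `geo-onto:alternateName` share the object `Berlin`, and the second `geo-onto:alternateName` pair joins through the shared predicate, so the group matches the slide's example. `geo-onto:population` stays alone because its value differs from DBpedia's, which is the kind of case the later similarity step is meant to catch.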
  • 14. Predicate Grouping
       Group related predicates from different ontology schemas, because many similar or related predicates actually refer to the same thing.
         - Group predicates by exact matching
         - Prune groups by similarity matching
             Exact matching may ignore:
             - Terms of predicates or objects written in different languages
             - Semantically identical or related predicates
         - Refine groups using extracted relations
  • 15. Prune Groups by Similarity Matching
       Ontology similarity matching at the concept level
         - String-based similarity measure: StrSim(O(G_i), O(G_j))
             - O(G_i): objects in G_i
             - Prefix, suffix, Levenshtein distance, and n-gram
         - Knowledge-based similarity measure: WNSim(T(G_i), T(G_j))
             - T(G_i): pre-processed terms of predicates in G_i
             - Natural language processing: tokenizing terms, removing stop words, and stemming
             - WordNet-based similarity measures: LCH, RES, HSO, JCN, LESK, PATH, WUP, LIN, and VECTOR
  • 16. Prune Groups by Similarity Matching
       Similarity between initial groups {G_1, G_2, ..., G_k}:

         $$\mathrm{Sim}(G_i, G_j) = \frac{\mathrm{StrSim}(O(G_i), O(G_j)) + \mathrm{WNSim}(T(G_i), T(G_j))}{2}$$

       Prune initial groups G_i:
         - If Sim(G_i, G_j) is higher than the predefined similarity threshold, we merge G_i and G_j.
         - If an initial group G_i has not been merged and has only one PO pair, we remove G_i.
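The two pruning rules on slide 16 can be sketched as below. This is an illustration under stated assumptions: groups are plain lists of PO pairs, merging is transitive, and the similarity function and threshold are passed in by the caller. `toy_sim`, the `nyt:` and `db-onto:` prefixes, and the threshold value 0.5 are hypothetical stand-ins; the slides do not give the actual threshold.

```python
def prune_groups(groups, sim, threshold):
    """Merge groups whose similarity exceeds the threshold, then drop
    groups that were never merged and hold only one PO pair."""
    n = len(groups)
    parent = list(range(n))
    merged = [False] * n

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if sim(groups[i], groups[j]) > threshold:
                parent[find(i)] = find(j)
                merged[i] = merged[j] = True

    result = {}
    for i, group in enumerate(groups):
        if not merged[i] and len(group) == 1:
            continue  # unmerged group with a single PO pair: removed
        result.setdefault(find(i), []).extend(group)
    return list(result.values())


def toy_sim(g1, g2):
    """Hypothetical stand-in for Sim: 1.0 if one predicate's local name
    contains another's, else 0.0. Not the measure used in the slides."""
    names1 = {p.split(":")[-1].lower() for p, _ in g1}
    names2 = {p.split(":")[-1].lower() for p, _ in g2}
    hit = any(a in b or b in a for a in names1 for b in names2)
    return 1.0 if hit else 0.0


initial_groups = [
    [("db-prop:population", "3439100"), ("db-onto:populationTotal", "3439100")],
    [("geo-onto:population", "3426354")],
    [("nyt:first_use", "2004-09-12")],
]
pruned = prune_groups(initial_groups, toy_sim, threshold=0.5)
print(pruned)
```

In this toy run the two population groups are merged into one group of three PO pairs, and the single-pair `nyt:first_use` group is removed because nothing merged with it.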
  • 17. An Example of Similarity Calculation

         Group  Predicate                                      Object
         G_i    http://dbpedia.org/property/population         3439100
                http://dbpedia.org/ontology/populationTotal    3439100
         G_j    http://www.geonames.org/ontology#population    3426354

       Example of string-based similarity measures on pairwise objects:

         Pairwise objects        prefix  suffix  Levenshtein distance  n-gram
         "3439100", "3426354"    0.29    0       0                     0.29

       Example of WordNet-based similarity measures on pairwise terms:

         Pairwise terms          LCH  RES  HSO  JCN   LESK  PATH  WUP   LIN  VECTOR
         population, population  1    1    1    1     1     1     1     1    1
         population, total       0.4  0    0    0.06  0.03  0.11  0.33  0    0.06

         $$\mathrm{Sim}(G_i, G_j) = \frac{0.145 + 0.5825}{2} = 0.36375$$
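The arithmetic on slide 17 can be checked with a few lines of Python. The string-based value 0.145 is the plain mean of the four string scores. The slide does not say how 0.5825 is obtained from the WordNet table; the sketch below shows one aggregation that happens to reproduce it (average only the non-zero scores for "population, total", then average that with the score of 1 for "population, population"). That aggregation is an inference from the numbers, not something the slides state.

```python
from statistics import mean

# String-based scores for "3439100" vs "3426354":
# prefix, suffix, Levenshtein distance, n-gram.
str_scores = [0.29, 0.0, 0.0, 0.29]
str_sim = mean(str_scores)

# WordNet-based scores for the term pair (population, total).
wn_population_total = [0.4, 0.0, 0.0, 0.06, 0.03, 0.11, 0.33, 0.0, 0.06]

# Assumption: only non-zero scores are averaged. This reproduces the
# slide's 0.5825 but is not documented in the slides.
pair_score = mean(s for s in wn_population_total if s > 0)
wn_sim = mean([1.0, pair_score])  # (population, population) scores 1.0

sim = (str_sim + wn_sim) / 2
print(round(str_sim, 4), round(wn_sim, 4), round(sim, 5))
```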
  • 18. Predicate Grouping
       Group related predicates from different ontology schemas, because many similar or related predicates actually refer to the same thing.
         - Group predicates by exact matching
         - Prune groups by similarity matching
         - Refine groups using extracted relations
             - Divide pruned groups according to rdfs:domain and rdfs:range.
             - Keep groups with high frequency.
  • 19. Mid-Ontology Learning Approach
  • 20. Mid-Ontology Construction
       Select terms for the Mid-Ontology
         - Collect all the terms of predicates in each refined group G_i.
         - Collect all the pre-processed terms of P(G_i) (the predicates in G_i).
         - Choose the one term that has the highest frequency and is the longest,
           e.g. "area" and "areaCode" are totally different.
       Construct relations
         - mo-prop:hasMembers links Mid-Ontology classes to the integrated predicates.
       Construct the Mid-Ontology
         - The Mid-Ontology is constructed automatically from the selected terms and mo-prop:hasMembers.
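The term-selection rule on slide 20 can be read in more than one way. The sketch below assumes one reading: pick the most frequent term, and break ties in favour of the longer term. The function name and the sample term lists are hypothetical, and the pre-processing (tokenizing, stop-word removal, stemming) is assumed to have happened already.

```python
from collections import Counter


def select_term(terms):
    """Return the most frequent term; among equally frequent terms,
    prefer the longest one (one possible reading of slide 20)."""
    counts = Counter(terms)
    return max(counts, key=lambda term: (counts[term], len(term)))


# Hypothetical pre-processed terms for a population group.
print(select_term(["population", "population", "total", "pop", "latest"]))

# With equal frequencies the longer, more specific term is chosen.
print(select_term(["area", "areaCode"]))
```

Preferring the longer term on a tie keeps a class built from `areaCode` predicates from being labelled simply `area`, which would suggest a different meaning.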
  • 21. Experimental Evaluation
       Evaluate the Mid-Ontology approach from four different aspects:
         - Evaluation of data reduction
         - Evaluation of ontology quality
         - Evaluation with a SPARQL example
         - Analysis of the Mid-Ontology approach
  • 22. Implementation Environment
         - Linux Ubuntu 10.10, 16 GB memory, 1 TB disk, Core i7 CPU 880 3.07 GHz
         - Java, NetBeans 6.9
         - Virtuoso
             - High-performance server for RDF storage
             - SPARQL query endpoint
         - WordNet::Similarity
             - Implemented in Perl
             - Knowledge-based similarity measures
  • 23. Experimental Data
         - DBpedia: cross-domain, 3.5 million things, 8.9 million URIs
         - Geonames: geographical domain, 7 million URIs
         - NYTimes: media domain, 10,467 subject news
       DBpedia is chosen as the core data set because of its wealth of inward and outward links to other data sets.
  • 24. Evaluation of Data Reduction
       Evaluate the effectiveness of data reduction during the data collection phase by comparing the number of instances.
       Number of distinct instances during the data collection phase:

         Data set   Before reduction   owl:sameAs retrieval   Noisy data removal
         DBpedia    8,955,728          135,749 (1.52%)        88,506 (0.99%)
         Geonames   7,479,714          128,961 (1.72%)        82,054 (1.10%)
         NYTimes    10,467             9,226 (88.14%)         8,535 (81.54%)

       Evaluation analysis
         - The data sets are dramatically scaled down by keeping only linked instances that share related information.
         - Noisy instances, which may affect the quality of the Mid-Ontology, were successfully removed,
           e.g. instances with only a db-prop:hasPhotosCollection (broken link) and an owl:sameAs link.
  • 25. Evaluation of Ontology Quality
       Evaluate the quality of the Mid-Ontology by validating whether the predicates in each class share related information.
       Accuracy of the Mid-Ontology:

         $$\mathrm{ACC}(MO) = \frac{\sum_{i=1}^{n} \frac{|\text{Correct Predicates in } C_i|}{|C_i|}}{n}$$

         n: the number of classes
         |C_i|: the number of predicates in class C_i

       Cardinality:

         $$\mathrm{Cardinality} = \frac{|\text{Number of Predicates}|}{|\text{Number of Classes}|}$$
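The two measures on slide 25 translate directly into code. In the sketch below the function names and the two-class sample input are hypothetical; the class and predicate counts used for the cardinality check (29 classes with 180 predicates, and 11 classes with 300 predicates) are the figures reported in the deck's evaluation table.

```python
def accuracy(classes):
    """ACC(MO): mean over classes of (correct predicates / predicates).

    `classes` is a list of (num_correct, num_predicates) tuples,
    one tuple per Mid-Ontology class.
    """
    return sum(correct / total for correct, total in classes) / len(classes)


def cardinality(num_predicates, num_classes):
    """Average number of predicates per class."""
    return num_predicates / num_classes


# Hypothetical ontology with two classes: one fully correct (6 of 6),
# one with 3 correct predicates out of 4.
print(accuracy([(6, 6), (3, 4)]))

# Counts reported for the Mid-Ontology with and without pruning/refining.
print(round(cardinality(180, 29), 2))
print(round(cardinality(300, 11), 2))
```

Because accuracy is averaged per class rather than per predicate, a small class with one wrong predicate lowers the score as much as a large class with proportionally many wrong predicates.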
  • 26. Evaluation of Ontology Quality
       Improvement achieved by our approach
         - MO_no_p_r: with exact matching only (without the pruning and refining processes)
         - MO: with both the pruning and refining processes

         MO          Number of classes   Number of predicates   Cardinality   Accuracy
         MO_no_p_r   11                  300                    27.27         68.78%
         MO          29                  180                    6.21          90.10%

       Evaluation analysis
         - Significantly improved the accuracy
         - Decreased the cardinality (fewer predicates and more classes)
         - Successfully removed unrelated predicates
  • 27. Evaluation with a SPARQL Example
       Evaluate the effectiveness of information retrieval with the Mid-Ontology constructed with our approach.
       Predicates grouped in mo-onto:population:

         <rdf:Description rdf:about="mid-onto:population">
           <mo-prop:hasMembers rdf:resource="http://dbpedia.org/property/population"/>
           <mo-prop:hasMembers rdf:resource="http://dbpedia.org/property/popLatest"/>
           <mo-prop:hasMembers rdf:resource="http://dbpedia.org/property/populationTotal"/>
           <mo-prop:hasMembers rdf:resource="http://dbpedia.org/ontology/populationTotal"/>
           <mo-prop:hasMembers rdf:resource="http://dbpedia.org/property/einwohner"/>
           <mo-prop:hasMembers rdf:resource="http://www.geonames.org/ontology#population"/>
         </rdf:Description>
  • 28. Evaluation with a SPARQL Example
       SPARQL: find places with a population of more than 10 million.

         SELECT DISTINCT ?places
         WHERE {
           mid-onto:population mo-prop:hasMembers ?prop.
           ?places ?prop ?population.
           FILTER (xsd:integer(?population) > 10000000).
         }

         Single property for population                  Number of results
         http://dbpedia.org/property/population          177
         http://dbpedia.org/property/popLatest           1
         http://dbpedia.org/property/populationTotal     107
         http://dbpedia.org/ontology/populationTotal     129
         http://dbpedia.org/property/einwohner           1
         http://www.geonames.org/ontology#population     244

       Evaluation analysis
         - The query finds 517 places with mid-onto:population.
         - Each single predicate returns fewer results under the same condition.
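The effect of the query on slide 28 can be illustrated without a triple store. The sketch below mimics the query over a small in-memory list of triples: it collects the distinct subjects that have any member predicate of the class with a value above the limit. All instance names (`ex:CityA` and so on), the `ex:area` predicate, and the `db-onto:` prefix are made up for illustration. The per-predicate counts in the table sum to 659, more than the 517 distinct places reported, which is what one would expect if some places are reached through several predicates.

```python
MEMBERS = {
    "db-prop:population",
    "db-onto:populationTotal",
    "geo-onto:population",
}

# Hypothetical triples: (subject, predicate, object).
triples = [
    ("ex:CityA", "db-prop:population", "12000000"),
    ("ex:CityA", "db-onto:populationTotal", "12000000"),
    ("ex:CityB", "geo-onto:population", "15000000"),
    ("ex:CityC", "db-prop:population", "500000"),
    ("ex:CityD", "ex:area", "99999999"),  # not a member predicate
]


def places_over(triples, predicates, minimum):
    """Distinct subjects with one of `predicates` whose value exceeds `minimum`."""
    return {s for s, p, o in triples if p in predicates and int(o) > minimum}


with_class = places_over(triples, MEMBERS, 10_000_000)
single = {p: places_over(triples, {p}, 10_000_000) for p in MEMBERS}

print(sorted(with_class))
print({p: len(found) for p, found in sorted(single.items())})
```

Querying through the class returns the union of the per-predicate result sets, so it is never smaller than any single-predicate result and never larger than their sum.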
  • 29. Analysis of Mid-Ontology Approach
       Analyze whether we can successfully identify how data sets are connected.
       Sample classes in the Mid-Ontology:

         DBpedia             DBpedia & Geonames    DBpedia & Geonames & NYTimes
         mo-onto:birthdate   mo-onto:population    mo-onto:name
         mo-onto:deathdate   mo-onto:prominence    mo-onto:long
         mo-onto:motto       mo-onto:postal

       Evaluation analysis
         - Predicates in DBpedia are heterogeneous.
         - Linked instances between DBpedia and Geonames are about places.
         - Linked instances among DBpedia, Geonames, and NYTimes are about events, persons, or places.
  • 30. Possible Application
       Find missing owl:sameAs links, e.g. find a missing owl:sameAs link with mo-onto:population:

         http://dbpedia.org/resource/Cyclades   db-prop:population       "119549"
         http://dbpedia.org/resource/Cyclades   db-prop:name             "Cyclades"
         http://sws.geonames.org/259819/        geo-onto:population      "119549"
         http://sws.geonames.org/259819/        geo-onto:alternateName   "Cyclades"
  • 31. Possible Application
       Find missing owl:sameAs links, e.g. find a missing owl:sameAs link with mo-onto:population:

         http://dbpedia.org/resource/Cyclades   db-prop:population       "119549"
         http://dbpedia.org/resource/Cyclades   db-prop:name             "Cyclades"
         http://sws.geonames.org/259819/        geo-onto:population      "119549"
         http://sws.geonames.org/259819/        geo-onto:alternateName   "Cyclades"

       Add the owl:sameAs links:

         http://dbpedia.org/resource/Cyclades   owl:sameAs   http://sws.geonames.org/259819/
         http://sws.geonames.org/259819/        owl:sameAs   http://dbpedia.org/resource/Cyclades
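The application on slide 31 can be sketched as a simple matching rule. The slides only show the Cyclades example, so the rule below is an assumption: two instances from different data sets become owl:sameAs candidates when they agree on the values of at least `min_shared` Mid-Ontology classes. The predicate-to-class mapping, the function names, and the default of two shared classes are all hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping from source predicates to Mid-Ontology classes.
PREDICATE_TO_CLASS = {
    "db-prop:population": "mo-onto:population",
    "geo-onto:population": "mo-onto:population",
    "db-prop:name": "mo-onto:name",
    "geo-onto:alternateName": "mo-onto:name",
}


def profile(triples):
    """Map each subject to its set of (Mid-Ontology class, value) pairs."""
    result = defaultdict(set)
    for subject, predicate, value in triples:
        mid_class = PREDICATE_TO_CLASS.get(predicate)
        if mid_class is not None:
            result[subject].add((mid_class, value))
    return result


def candidate_same_as(triples_a, triples_b, min_shared=2):
    """Pairs of subjects sharing at least `min_shared` (class, value) pairs."""
    profiles_a, profiles_b = profile(triples_a), profile(triples_b)
    return [
        (a, b)
        for a, values_a in profiles_a.items()
        for b, values_b in profiles_b.items()
        if len(values_a & values_b) >= min_shared
    ]


dbpedia = [
    ("http://dbpedia.org/resource/Cyclades", "db-prop:population", "119549"),
    ("http://dbpedia.org/resource/Cyclades", "db-prop:name", "Cyclades"),
]
geonames = [
    ("http://sws.geonames.org/259819/", "geo-onto:population", "119549"),
    ("http://sws.geonames.org/259819/", "geo-onto:alternateName", "Cyclades"),
]
candidates = candidate_same_as(dbpedia, geonames)
print(candidates)
```

Requiring agreement on more than one class is a precaution in the sketch: a shared population value alone could be a coincidence, whereas a shared name and population together make a stronger candidate.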
  • 32. Related Work
         - Construct an intermediate-layer ontology from geospatial, zoology, and genetics data resources. [Parundekar et al., 2010]
             - Limited to a specific domain
         - Construct an intermediate-level ontology by enriching an upper ontology (by adding new classes and properties). [Damova et al., 2010]
             - Still too large
         - Analysis of basic properties of the SameAs network, the Pay-Level-Domain network, and the Class-Level Similarity network. [Ding et al., 2010]
             - Only frequent types are considered when analyzing how data are connected
  • 33. Conclusion and Future Work
       Conclusion
         - Learning all the heterogeneous ontology schemas in the linked open data sets is not feasible.
         - An automatic Mid-Ontology learning approach can solve the heterogeneity problem by integrating related predicates.
         - The Mid-Ontology has high accuracy and is effective for searching across various data sets.
         - A simple Mid-Ontology can be constructed without learning the entire ontology schema.
       Future work
         - Billion Triple Challenge (BTC) data set
         - Crawl links at two or three depths without a core data set
  • 34. Questions?
       Lihua Zhao, lihua@nii.ac.jp
       Ryutaro Ichise, ichise@nii.ac.jp