Digital Enterprise Research Institute                                                                            www.deri.ie




                       Approximate Semantic Matching of
                            Heterogeneous Events
                          Souleiman Hasan, Sean O’Riain, Edward Curry
                   Digital Enterprise Research Institute (DERI), National University of Ireland, Galway (NUIG)


             In Proceedings of the 6th ACM International Conference on Distributed Event-Based
                                Systems (DEBS 2012), Berlin, Germany, 2012.




 Stefan.Decker@deri.org
 http://www.StefanDecker.org/

 Copyright 2010 Digital Enterprise Research Institute. All rights reserved.
Outline
            Introduction                                Experiments
                   Smart Environments                       Wikipedia
                   Motivational Scenario                    Freebase

                   Related Work                       Conclusions
            Proposal                                  Q&A
                   Approximate Semantic Matching




Smart Environments
            Smart Homes, Grids, Cities…
            Internet-of-Things, Sensor Web…
            By 2020, 50 billion devices connected to mobile networks (OECD, 2012)

        Non-technical users
        High heterogeneity
        Trend for dynamic data-driven decision making
            Event/Situation of Interest
                   New free parking space near me
                   Soccer match played in Berlin
                   …


Motivational Scenario- Enterprise

            Situation of Interest
                   Company CO2 emissions performance
            [Diagram: CIO, CSO, Helpdesk and Maintenance Personnel observing events from the Building and the Data Center]
                   Energy usage by global IT department
                   PUE of the Data Center in Dublin
                   kWhs used by server 172.16.0.8
            Various terms used
                   energy consumption, energy usage…
                   room, space, zone…
            Dynamic Environments
                   New events from equipment joining and leaving

Requirements
            Handling of semantically heterogeneous events
            Handling of dynamic environments, where event types
             change as sources join and leave
            Low cost of rules management
            Usability
            Precision




Event Processing

            Situation of Interest
                   When a floor is empty and its energy usage for an hour is above
                   threshold w.r.t. budget, then it is an excessive usage
            User
                   Non-technical users with natural language needs
                   Separated from the engine
            Developer
                   Translation of the situation of interest into an event processing rule
            EVENT PROCESSING RULE
                   INSERT INTO ExcessiveEnergyUsageByFloor
                   SELECT a.floor as floor
                   FROM PATTERN
                   [(a=FloorEmptySensor -> every b=DeviceEnergyUsageSensor
                   (a.floor=b.floor))]
                   .WIN:TIME(1 hour)
                   GROUP BY a.floor
                   WHERE (b.usage) > GetAcceptableThreshold(a.budgetValue)
            Rules tied to vocabulary
            High cost in case of heterogeneity or change
            CEP Engine
                   UI, EPL Interface and Parser, Pattern Matcher, Single Event Matcher
                   Rules Repository, Execution Repository, Templates Repository
            Example events (sources: ERP, BMS)
                   PC NO XDG26359, Floor: 1st, usage: 3 kWh
                   VM: vmdgsit01.deri.ie, Floor: 1st, usage: 15 kWh

Exact Event Processing Paradigm
         Requirement                    How the paradigm addresses it
         Semantic Heterogeneity         Does not scale out to highly
                                        heterogeneous environments
         Dynamic Environment            Does not scale out to highly dynamic
                                        environments
         Rule Management                High cost under large heterogeneity and
                                        dynamism
         Usability                      Low
         Precision                      100% (typically)




Decoupling in Event Systems
            Space: producers and consumers don’t know each other
            Time: participants don’t need to be actively involved in the interaction at the
               same time
            Synchronization: event producers and consumers don’t get
               blocked when sending/receiving events

            [Diagram: Event Producer and Event Consumer decoupled along three dimensions: Space, Time, Synchronization]




Decoupling in Event Systems
            Principle
                                   “Removal of explicit dependencies between participants”
                                                                      (Eugster et al., 2003)
            Outcome
                   Scalability

            [Diagram: Event Producer and Event Consumer decoupled in Space, Time and Synchronization]




Semantic Coupling
            Current event-based systems keep an explicit semantic
             dependency between participants
            Limited scalability in highly heterogeneous and dynamic
             environments

            [Diagram: Event Producer and Event Consumer decoupled in Space, Time and Synchronization, but still coupled in the Semantic dimension (event types, properties, values)]
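
            The semantic coupling above can be illustrated with a minimal sketch of an exact matcher. The statement format (property, value) and the function name exact_match are assumptions made for illustration only; they are not taken from any particular event engine.

```python
# Hypothetical event and subscription, each a set of (property, value) statements.
event = {
    ("type", "Football Match"),
    ("team", "Spain National Football Team"),
    ("location", "Johannesburg"),
}
subscription = {
    ("type", "Soccer Match"),
    ("team", "Spain"),
    ("place", "South Africa"),
}


def exact_match(subscription, event):
    """True only if every subscription statement appears verbatim in the event."""
    return subscription <= event


# The subscriber only gets a match when it uses the producer's exact vocabulary.
print(exact_match(subscription, event))
print(exact_match({("type", "Football Match")}, event))
```

            The first call fails even though a human reader would consider the subscription relevant, because "Soccer Match", "Spain" and "place" differ from the producer's terms.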

Current Approaches
            Ontology-based
                   (Petrovic et al., 2003), (Zhang & Ye, 2008)…
                   Do not “remove explicit dependency”
                   Hard to achieve ontology agreement a priori at a large scale of
                    heterogeneity and dynamism
                   Medium usability, typically 100% precision
            Fuzzy sets
                   (Liu & Jacobsen, 2002)
                   Address only numerical event values vs. string values in
                    subscriptions
                   Medium usability, high precision



Proposed Approach
            Approximate semantic matching of events
                   Inputs
                        Event: type(s), properties, values
                        Subscription: type(s), properties, values
                   Steps
                        1. Types & properties: possible mappings
                        2. Values: possible mappings
                        3. Pick best overall mapping
                        4. Post-matching event processing


Background
            Semantic Similarity
                   f: Terms × Terms → [0,1]
                   term1, term2 are Terms
                        f(term1, term2)=0 absolute semantic mismatch
                        f(term1, term2)=1 exact match
                   E.g. Football Match and Soccer Match are similar
            Relatedness: a general case of similarity
                   E.g. Football Match and Referee are related but not similar
            Thesaurus-based: e.g. WordNet-based
            Distributional semantics-based: e.g. Wikipedia ESA
                   The more Wikipedia articles two terms both occur in, the more
                    related they are
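
            A minimal sketch of the distributional idea, assuming a three-article toy corpus in place of Wikipedia. The ARTICLES data and the function names are hypothetical, and raw word counts are used where a real ESA implementation would use TF-IDF weights over the full article collection.

```python
import math
from collections import Counter

# Toy stand-in for Wikipedia: article title -> text (all hypothetical).
ARTICLES = {
    "Association football": "football match soccer match referee team goal",
    "FIFA World Cup": "football soccer team final referee stadium",
    "Data center": "server energy usage cooling power",
}


def concept_vector(term):
    """Map a term to a vector over articles: how often its words occur in each."""
    words = term.lower().split()
    vec = {}
    for title, text in ARTICLES.items():
        counts = Counter(text.split())
        weight = sum(counts[w] for w in words)
        if weight:
            vec[title] = weight
    return vec


def relatedness(term1, term2):
    """Cosine similarity of the two concept vectors, a value in [0, 1]."""
    v1, v2 = concept_vector(term1), concept_vector(term2)
    dot = sum(v1[k] * v2.get(k, 0) for k in v1)
    norm = math.sqrt(sum(x * x for x in v1.values())) * math.sqrt(
        sum(x * x for x in v2.values())
    )
    return dot / norm if norm else 0.0


print(relatedness("soccer", "referee"))  # terms sharing articles score high
print(relatedness("soccer", "server"))   # terms sharing no article score 0.0
```

            The function satisfies the range of f above: terms that never share an article get 0, and terms with identical article distributions approach 1.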

Proposed Approach Instantiation

            Event
                   type        Football Match
                   name        2010 FIFA World Cup Final
                   referee     Howard Webb
                   team        Spain National Football Team
                   team        Netherlands National Football Team
                   location    Johannesburg
                   location    FNB stadium

            Subscription
                   Event   type    "Soccer Match"
                   Event   team    "Spain"
                   Event   place   "South Africa"



Proposed Approach Instantiation

            Event properties: type, name, referee, team, location
            Subscription properties: type, place, team

            [Figure: precision vs. recall curves (both axes 0 to 1) for the similarity measures Lin, Jiang&Conrath, Leacock&Chodorow, Lesk, Path, Resnik, Gloss Vector and WuPalmer]

Proposed Approach Instantiation

            ps: a subscription property, pe: an event property
            Determine top m correspondence candidates
                   RankSim_Jiang&Conrath(ps, pe)
            Measure properties relatedness
                   fP(ps, pe) = Min(1, m - RankSim_Jiang&Conrath(ps, pe) + 1) * WikipediaESA(ps, pe)
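
            A minimal sketch of the top-m gating in fP. The SIM and ESA score tables are hypothetical stand-ins for Jiang&Conrath similarity and Wikipedia ESA, and the function names are illustrative. One assumption is added beyond the slide's formula: the gate is clamped at 0, so that candidates ranked below m + 1 do not receive a negative score.

```python
# Hypothetical scores for the subscription property "place" against event properties.
SIM = {"location": 0.9, "name": 0.4, "type": 0.3, "team": 0.2, "referee": 0.1}
ESA = {"location": 0.8, "name": 0.3, "type": 0.2, "team": 0.2, "referee": 0.1}


def rank_candidates(scores):
    """Return {event_property: rank}, where rank 1 is the most similar."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {pe: i + 1 for i, pe in enumerate(ordered)}


def f_p(pe, m, sim=SIM, esa=ESA):
    """Relatedness of the subscription property to event property pe."""
    rank = rank_candidates(sim)[pe]
    # Min(1, m - rank + 1) is 1 inside the top m and 0 at rank m + 1.
    # Assumption: clamp at 0 so lower ranks are discarded rather than negative.
    gate = max(0, min(1, m - rank + 1))
    return gate * esa[pe]


for pe in SIM:
    print(pe, f_p(pe, m=2))
```

            With m = 2 only "location" and "name" keep their ESA score; every other candidate is zeroed out before the more expensive relatedness score matters.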




Proposed Approach Instantiation

            Candidate property mappings (event ↔ subscription)
                   Top 1 (90%)
                        type ↔ type
                        location ↔ place
                        team ↔ team
                   Top 2 (40%)
                        type ↔ type
                        name ↔ place
                        referee ↔ team

Proposed Approach Instantiation

            Event values
                   Football Match
                   Howard Webb
                   Spain National Football Team
                   Johannesburg
                   FNB stadium
                   Netherlands National Football Team
            Subscription values
                   Soccer Match
                   South Africa
                   Spain
            Measure values relatedness
                   fV(vs, ve) = WikipediaESA(vs, ve)

Proposed Approach Instantiation

            Values relatedness examples (event value ↔ subscription value)
                   Spain National Football Team ↔ Spain: 95%
                   Netherlands National Football Team ↔ Spain: 30%

Proposed Approach Instantiation

            Calculate statements relatedness
                   fSTMT = fP(ps, pe) * fV(vs, ve)




Proposed Approach Instantiation

            Determine correspondent event statement
                   Corr_e: the event statement with maximal fSTMT
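
            A minimal sketch of combining property and value relatedness into fSTMT and picking the correspondent event statement. The 0.95 and 0.30 value scores are the ones shown on the earlier slide; every other number, and the names F_P, F_V, f_stmt and best_correspondent, are hypothetical.

```python
# Property relatedness fP(ps, pe); hypothetical values.
F_P = {("team", "team"): 1.0, ("team", "referee"): 0.3, ("team", "location"): 0.1}

# Value relatedness fV(vs, ve); the first two scores come from the slides.
F_V = {
    ("Spain", "Spain National Football Team"): 0.95,
    ("Spain", "Netherlands National Football Team"): 0.30,
    ("Spain", "Howard Webb"): 0.2,
    ("Spain", "Johannesburg"): 0.1,
}

event = [
    ("team", "Spain National Football Team"),
    ("team", "Netherlands National Football Team"),
    ("referee", "Howard Webb"),
    ("location", "Johannesburg"),
]


def f_stmt(sub_stmt, ev_stmt):
    """fSTMT = fP(ps, pe) * fV(vs, ve); unknown pairs count as 0."""
    (ps, vs), (pe, ve) = sub_stmt, ev_stmt
    return F_P.get((ps, pe), 0.0) * F_V.get((vs, ve), 0.0)


def best_correspondent(sub_stmt, event):
    """Event statement that maximizes fSTMT for the given subscription statement."""
    return max(event, key=lambda ev_stmt: f_stmt(sub_stmt, ev_stmt))


best = best_correspondent(("team", "Spain"), event)
print(best, f_stmt(("team", "Spain"), best))
```

            Multiplying the two scores means a statement only ranks highly when both its property and its value are related to the subscription statement.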




Proposed Approach Instantiation

            Post-matching event processing
                   Rank within a window
                   Complex Event Processing
                   …

Experiments Overview
            Methodology
                   Prepare an event set that reflects the required semantic heterogeneity
                    (Wikipedia events)
                   Prepare a gold standard set of subscriptions that stress multiple
                    aspects of semantic coupling
                   Validate the suitability of semantic approximation from a precision
                    perspective
                   Use a different event set with the same subscriptions to validate low
                    maintenance cost (Freebase events)
            Evaluation Criteria
                   Average interpolated Precision-Recall Curve on 11 recall points
                   Maximal F1 Score over the average curve
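
            A minimal sketch of the two evaluation criteria, assuming the standard definition of 11-point interpolated precision (the best precision at any recall at or above each of the levels 0.0, 0.1, ..., 1.0). The function names and the sample ranking are hypothetical.

```python
def interpolated_precision_11pt(ranked_relevance, total_relevant):
    """ranked_relevance: booleans in rank order, True if that result is relevant."""
    points = []  # (recall, precision) after each rank position
    hits = 0
    for k, rel in enumerate(ranked_relevance, start=1):
        hits += rel
        points.append((hits / total_relevant, hits / k))
    curve = []
    for i in range(11):
        level = i / 10
        candidates = [p for rec, p in points if rec >= level]
        curve.append(max(candidates) if candidates else 0.0)
    return curve


def average_curves(curves):
    """Average several 11-point curves, one per subscription, point by point."""
    return [sum(col) / len(col) for col in zip(*curves)]


def max_f1(curve):
    """Maximal F1 over the 11 (recall, precision) points of a curve."""
    best = 0.0
    for i, p in enumerate(curve):
        r = i / 10
        if p + r > 0:
            best = max(best, 2 * p * r / (p + r))
    return best


# Hypothetical ranking for one subscription with two relevant events.
curve = interpolated_precision_11pt([True, False, True], total_relevant=2)
print(curve)
print(max_f1(average_curves([curve])))
```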


Experiment 1- Wikipedia Events
                                              Event Set Statistics
       Source                                                Structured Wikipedia infoboxes, DBpedia,
                                                             31 August 2011
       Collection                                            Triples directly associated with instances of
                                                             the dbpedia-owl:Event class
       Data model                                            RDF
       Total # of events                                     20,156
       Total # of distinct event types                       4,950
       Total # of distinct event properties                  1,459
       Total # of distinct event values                      500,717
       Total # of triples                                    1,502,599
       Average # of distinct types per event                 7.42
       Average # of distinct properties per event            30.52
       Average # of distinct values per event                54.16
       Average # of triples per event                        64.67


Experiment 1- Wikipedia Events
            Example Event Types
                   Football Match
                   Race
                   Music Festival
                   Space Mission
                   Election
                   10th-Century BC Conflicts
                   Academic Conference
                   Aviation Accident
                   …




Experiment 1- Subscription Set
            Manually created gold standard set of subscriptions
       ID   Description              Subscription                             # of       # of      Event type      Event           Literals and
                                                                              relevant   needed    approximation   properties      resources
                                                                              events     exact                     approximation   approximation
                                                                                         rules


       1    Football matches         event type "Football Match"              1          1              NO              NO              NO
            played by Spain in the   event team "Spain national football
            FNB stadium              team"
                                     event stadium "FNB Stadium"
       2    Football matches         event type "Football Match"              2          2              NO              YES             NO
            played in the FNB        event place "FNB Stadium"
            stadium
       3    Events taking place in   event type "Event"                       219        5              NO              YES           Syntactic
            Wembley stadium          event place "Wembley Stadium"
       4    Charity events taking    event type "Charity"                     29         6              YES             YES           Semantic
            place in Wembley         event place "Wembley Stadium"                                                                   + Syntactic
            stadium
       5    Charity Rock events      event type "Charity"                     2          2              YES             YES           Semantic
            taking place in          event type "Rock"                                                                               + Syntactic
            Wembley stadium          event place "Wembley Stadium"
       6    Football matches         event type "Football Match"              505        603            NO              YES          Background
            played in the UK         event stadium "United Kingdom"                                                                  Knowledge
       7    Football matches         event type "Football Match"              20         123,774        NO              YES          Background
            played by a South        event team "South America"                                                                      Knowledge
            American team in         event stadium "Europe"
            Europe




Experiment 1- Subscription Set

            Manually created gold standard set of subscriptions
       ID   Description              Subscription                             # of       # of      Event type      Event           Literals and
                                                                              relevant   needed    approximation   properties      resources
                                                                              events     exact                     approximation   approximation
                                                                                         rules

       1    Football matches         event type "Football Match"              1          1              NO              NO              NO
            played by Spain in the   event team "Spain national football
            FNB stadium              team"
                                     event stadium "FNB Stadium"
       2    Football matches         event type "Football Match"              2          2              NO              YES             NO
            played in the FNB        event place "FNB Stadium"
            stadium
       3    Events taking place in   event type "Event"                       219        5              NO              YES           Syntactic
            Wembley stadium          event place "Wembley Stadium"
       4    Charity events taking    event type "Charity"                     29         6              YES             YES           Semantic
            place in Wembley         event place "Wembley Stadium"                                                                   + Syntactic
                stadium                  event type "Event"
       Subscription
       5        Charity Rock events      event place "Wembley Stadium"
                                         event type "Charity"           2    2       YES        YES        Semantic
                taking place in          event type "Rock"                                                 + Syntactic
                Wembley stadium          ?event rdf:type dbpedia-owl:Event.
                                         event place "Wembley Stadium"
       SPARQL pattern 1
       6        Football matches         ?event dbpprop:stadium
                                         event type "Football Match"    505 dbpedia:Wembley_Stadium.
                                                                             603     NO         YES        Background
                played in the UK         event stadium "United Kingdom"                                    Knowledge
                                         ?event rdf:type dbpedia-owl:Event.
       SPARQL pattern 2
       7  Football matches               event type "Football Match"    20   123,774 NO         YES        Background
                played by a South        ?event dbpedia-owl:location
                                         event team "South America"                     dbpedia:Wembley_Stadium.
                                                                                                           Knowledge
                American team in         event stadium "Europe"
       …        Europe                   …



                                                                               27 of 34
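The "# of needed exact rules" column can be illustrated with a small sketch. It assumes a toy in-memory triple representation rather than a real SPARQL engine. The two patterns mirror SPARQL patterns 1 and 2 shown for subscription 3; the event identifiers (`ex:event_a`, `ex:event_b`) and the helper names are hypothetical and are not taken from the slides or from the authors' system.

```python
# Toy illustration: why one information need can require several exact rules.
# Events are sets of (subject, predicate, object) triples. The event IDs are
# made up; the predicates and objects follow the SPARQL patterns on the slide.
EVENTS = {
    "ex:event_a": {
        ("ex:event_a", "rdf:type", "dbpedia-owl:Event"),
        ("ex:event_a", "dbpprop:stadium", "dbpedia:Wembley_Stadium"),
    },
    "ex:event_b": {
        ("ex:event_b", "rdf:type", "dbpedia-owl:Event"),
        ("ex:event_b", "dbpedia-owl:location", "dbpedia:Wembley_Stadium"),
    },
}

# An exact rule is a list of (predicate, object) pairs that must all be present.
PATTERN_1 = [
    ("rdf:type", "dbpedia-owl:Event"),
    ("dbpprop:stadium", "dbpedia:Wembley_Stadium"),
]
PATTERN_2 = [
    ("rdf:type", "dbpedia-owl:Event"),
    ("dbpedia-owl:location", "dbpedia:Wembley_Stadium"),
]


def matches_exactly(event_id, triples, pattern):
    """True if every (predicate, object) pair of the pattern occurs for the event."""
    return all((event_id, pred, obj) in triples for pred, obj in pattern)


def matched_events(patterns):
    """Return the sorted IDs of events matched by at least one exact pattern."""
    return sorted(
        event_id
        for event_id, triples in EVENTS.items()
        if any(matches_exactly(event_id, triples, p) for p in patterns)
    )


if __name__ == "__main__":
    print(matched_events([PATTERN_1]))
    print(matched_events([PATTERN_1, PATTERN_2]))
```

With only the first pattern, the event that states its venue through `dbpedia-owl:location` is missed; covering both events exactly takes both rules. This is the kind of rule multiplication that the table counts.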
Experiment 1- Results

            Need for a hybrid matcher that combines both
            [Figure: precision vs. recall curves for the subscriptions "Events taking place in Wembley stadium" and "Football matches played in the UK"; series: Jiang&Conrath, Wikipedia ESA]
            [Figure: frequency vs. semantic similarity or relatedness score (log scale); series: Jiang&Conrath, Wikipedia ESA]

                                                                                                 28 of 34
Experiment 1- Results




            Hybrid matcher outperforms a single similarity or
             relatedness measure matcher.
                     Matcher                                     Jiang&Conrath              Wikipedia ESA            Hybrid
              Maximal F1 Score                                       70.06%                    44.26%                75.45%
              Recall                                                  80%                        80%                  90%
              Precision                                              62.31%                    30.59%                64.94%
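
The F1 row can be cross-checked against the precision and recall rows, assuming the standard definition of F1 as the harmonic mean of precision and recall (the slides do not spell out the formula). A minimal sketch; the function and variable names are illustrative:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported maximal F1) as fractions, copied from the table.
REPORTED = {
    "Jiang&Conrath": (0.6231, 0.80, 0.7006),
    "Wikipedia ESA": (0.3059, 0.80, 0.4426),
    "Hybrid": (0.6494, 0.90, 0.7545),
}

if __name__ == "__main__":
    for matcher, (precision, recall, reported_f1) in REPORTED.items():
        computed = f1_score(precision, recall)
        print(f"{matcher}: computed {computed:.4f}, reported {reported_f1:.4f}")
```

The computed values agree with the reported ones to within the rounding of the table entries.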
            [Figure: precision vs. recall curves; series: Jiang&Conrath, Wikipedia ESA, Hybrid]




                                                                         29 of 34
Experiment 2- Freebase Event Set


                                            Event Set Statistics
     Source                                            Freebase events dump 1 December 2011,
                                                       triples current
     Collection                                        Triples directly associated to instances of
                                                       "fbase:time.event" class
     Data model                                        RDF
     Total # of events                                 84,529
     Total # of distinct event types                   858
     Total # of distinct event properties              1,242
     Total # of distinct event values                  1,199,627
     Total # of triples                                1,859,338
     Average # of distinct types per event             3.33
     Average # of distinct properties per event        10.67
     Average # of distinct values per event            21.66
     Average # of triples per event                    21.99


                                                 30 of 34
Experiment 2- Subscription Set



            Same as in Experiment 1.
       ID   Description              Subscription                           # of       # of      Event type      Event           Literals and
                                                                            relevant   needed    approximation   properties      resources
                                                                            events     exact                     approximation   approximation
                                                                                       rules


       1    Football matches         event type "Football Match"            1          1              YES             YES             NO
            played by Spain in the   event team "Spain national football
            FNB stadium              team"
                                     event stadium "FNB Stadium"
       2    Football matches         event type "Football Match"            8          2              YES             YES             NO
            played in the FNB        event place "FNB Stadium"
            stadium
       3    Events taking place in   event type "Event"                     29         5              NO              YES             NO
            Wembley stadium          event place "Wembley Stadium"
       4    Charity events taking    event type "Charity"                   0          -                -               -               -
            place in Wembley         event place "Wembley Stadium"
            stadium
       5    Charity Rock events      event type "Charity"                   0          -                -               -               -
            taking place in          event type "Rock"
            Wembley stadium          event place "Wembley Stadium"
       6    Football matches         event type "Football Match"            34         1,398          YES             YES          Background
            played in the UK         event stadium "United Kingdom"                                                                Knowledge
       7    Football matches         event type "Football Match"            2          219,600        YES             YES          Background
            played by a South        event team "South America"                                                                    Knowledge
            American team in         event stadium "Europe"
            Europe




                                                                           31 of 34
Experiment 2- Results




            Hybrid matcher gives similar results in Freebase as in
             DBpedia
                     Matcher                                  Jiang&Conrath                 Wikipedia ESA              Hybrid
              Maximal F1 Score                                    44.60%                       70.73%                  76.33%
              Recall                                               60%                           80%                    80%
              Precision                                           35.49%                       63.39%                  72.98%
            [Figure: precision vs. recall curves; series: Jiang&Conrath, Wikipedia ESA, Hybrid]




                                                                           32 of 34
Conclusions




            Approximate semantic matcher addresses the maintainability cost of
             subscriptions/rules in heterogeneous and dynamic environments
            Approximate semantic matcher is suitable when less than
             100% precision is acceptable
                                              Exact Matcher    Approximate Semantic Matcher
       Number of Required Subscriptions       345,000          7
       Maximal F1-Score                       100%             75.89%

            A hybrid matcher outperforms a single similarity or
             relatedness measure matcher.
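
The slides do not state how the two summary figures were obtained. One reading, offered here only as an assumption, is that the exact-matcher count sums the "# of needed exact rules" columns of both experiments and that the approximate-matcher F1 averages the hybrid matcher's maximal F1 over the two experiments. The arithmetic below checks that reading against the reported values; all variable names are illustrative.

```python
# Assumption: the summary figures aggregate the per-experiment tables.
# Values are copied from the subscription and results tables in the slides.
needed_exact_rules_exp1 = [1, 2, 5, 6, 2, 603, 123_774]  # subscriptions 1-7
needed_exact_rules_exp2 = [1, 2, 5, 1_398, 219_600]      # 4 and 5 list "-" (no relevant events)

# 124,393 + 221,006 = 345,399
total_exact_rules = sum(needed_exact_rules_exp1) + sum(needed_exact_rules_exp2)

hybrid_max_f1_percent = [75.45, 76.33]  # Experiment 1, Experiment 2
mean_hybrid_f1 = sum(hybrid_max_f1_percent) / len(hybrid_max_f1_percent)

if __name__ == "__main__":
    print(f"total exact rules: {total_exact_rules:,}")
    print(f"rounded to thousands: {round(total_exact_rules, -3):,}")
    print(f"mean hybrid maximal F1: {mean_hybrid_f1:.2f}%")
```

Under this reading the sum rounds to the reported 345,000 and the mean matches the reported 75.89%, compared with 7 approximate subscriptions.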



                                          33 of 34
Future Work




            The subscription set needs to be extended to be more
             representative.
            Approximate semantic matcher generates “uncertain”
             results whose impact on further event processing
             functions such as CEP needs to be studied




                                        34 of 34

Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 

Approximate Semantic Matching of Heterogeneous Events

  • 1. Digital Enterprise Research Institute www.deri.ie Approximate Semantic Matching of Heterogeneous Events Souleiman Hasan, Sean O’Riain, Edward Curry Digital Enterprise Research Institute (DERI), National University of Ireland, Galway (NUIG) In Proceedings of the 6th ACM International Conference on Distributed Event-Based Systems (DEBS 2012), Berlin, Germany, 2012. Stefan.Decker@deri.org http://www.StefanDecker.org/ Copyright 2010 Digital Enterprise Research Institute. All rights reserved.
  • 2. Outline
      - Introduction: Smart Environments, Motivational Scenario, Related Work
      - Proposal: Approximate Semantic Matching
      - Experiments: Wikipedia, Freebase
      - Conclusions
      - Q&A
  • 3. Smart Environments
      - Smart Homes, Grids, Cities…
      - Internet-of-Things, Sensor Web…
      - By 2020, 50 billion devices connected to mobile networks (OECD, 2012)
      - Non-technical users
      - High heterogeneity
      - Trend for dynamic data-driven decision making
      - Example events/situations of interest: "Soccer match played in Berlin", "New free parking space near me", …
  • 4. Motivational Scenario - Enterprise
      - Roles: CIO, CSO, Helpdesk, Maintenance Personnel
      - Situations of interest: "Company CO2 emissions performance", "Energy usage by global IT department", "PUE of the Data Center in Dublin", "kWhs used by server 172.16.0.8"
      - Various terms used: energy consumption, energy usage…; room, space, zone…
      - Dynamic environments: new events from equipment joining and leaving
      - Event sources: Building, Data Center
  • 5. Requirements
      - Handling of semantically heterogeneous events
      - Handling of dynamic environments with event types by sources joining and leaving
      - Low cost of rules management
      - Usability
      - Precision
  • 6. Event Processing
      - Situation of interest: "When a floor is empty and its energy usage for an hour is above threshold w.r.t budget then it is an excessive usage"
      - Actors: User; Developer (translation of the situation into a rule)
      - Annotations: non-technical users with natural language needs; separated from the engine UI; rules tied to vocabulary; high cost in case of heterogeneity or change
      - CEP engine diagram labels: EPL Interface and Parser; Rules Repository; Execution; Pattern Matcher; Repository; Single Event Matcher; Templates Repository
      - Event processing rule:
          INSERT INTO ExcessiveEnergyUsageByFloor
          SELECT a.floor as floor
          FROM PATTERN [(a=FloorEmptySensor -> every b=DeviceEnergyUsageSensor (a.floor=b.floor))]
          .WIN:TIME(1 hour)
          GROUP BY a.floor
          WHERE (b.usage) > GetAcceptableThreshold(a.budgetValue)
      - Example source events (ERP, BMS): "PC NO XDG26359, Floor: 1st, usage: 3 kWh"; "VM: vmdgsit01.deri.ie, Floor: 1st, usage: 15 kWh"
  • 7. Exact Event Processing Paradigm
      Requirement: how the paradigm addresses it
      - Semantic heterogeneity: does not scale out to highly heterogeneous environments
      - Dynamic environment: does not scale out to highly dynamic environments
      - Rule management: high cost under large heterogeneity and dynamicity
      - Usability: low
      - Precision: 100% (typically)
  • 8. Decoupling in Event Systems
      - Space: producers and consumers don't know each other
      - Time: participants don't need to be actively involved in the interaction at the same time
      - Synchronization: event producers and consumers don't get blocked to send/receive events
      [Diagram: event producer and event consumer decoupled in space, time, and synchronization]
  • 9. Decoupling in Event Systems
      - Principle: "Removal of explicit dependencies between participants" (Eugster et al., 2003)
      - Outcome: scalability
      [Diagram: event producer and event consumer decoupled in space, time, and synchronization]
  • 10. Semantic Coupling
      - Current event-based systems keep an explicit semantic dependency between participants
      - Limited scalability in highly heterogeneous and dynamic environments
      [Diagram: event producer and event consumer decoupled in space, time, and synchronization, but still coupled semantically (event types, properties, values)]
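
The semantic coupling described on this slide can be illustrated with a minimal sketch of an exact matcher. This is not code from the paper: the function name `exact_match` and the dictionary-based event model (one value per property) are assumptions made only for illustration. The event and subscription terms are taken from the football example used later in the slides.

```python
def exact_match(subscription, event):
    """Return True only if every (property, value) pair in the
    subscription appears verbatim in the event."""
    return all(event.get(prop) == value for prop, value in subscription.items())


# Simplified event: a flat dict with one value per property (an assumption).
event = {
    "type": "Football Match",
    "team": "Spain National Football Team",
    "location": "Johannesburg",
}

# A subscriber who happens to share the producer's vocabulary.
same_vocabulary = {"type": "Football Match", "location": "Johannesburg"}

# A subscriber who uses different terms for the same situation.
other_vocabulary = {"type": "Soccer Match", "place": "South Africa"}

print(exact_match(same_vocabulary, event))   # True
print(exact_match(other_vocabulary, event))  # False
```

The second subscription fails even though a human reader would consider it relevant, which is the dependency on agreed types, properties, and values that the slide calls semantic coupling.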
  • 11. Current Approaches
      - Ontology-based: (Petrovic et al., 2003), (Zhang & Ye, 2008)…
          - Does not "remove explicit dependency"
          - Hard to achieve ontology agreement a priori at large scale of heterogeneity and dynamism
          - Medium usability, 100% precision typically
      - Fuzzy sets: (Liu & Jacobsen, 2002)
          - Address only event numerical values vs. string values subscriptions
          - Medium usability, high precision
  • 12. Proposed Approach
      - Approximate semantic matching of events
      - Inputs: an event and a subscription, each with type(s), properties, and values
      - Steps:
          1. Types & properties possible mappings
          2. Values possible mappings
          3. Pick best overall mapping
          4. Post-matching event processing
  • 13. Background
      - Semantic similarity
          - f: Terms × Terms → [0,1], where term1 and term2 are Terms
          - f(term1, term2) = 0: absolute semantic mismatch
          - f(term1, term2) = 1: exact match
          - E.g. Football Match and Soccer Match are similar
      - Relatedness: a general case of similarity
          - E.g. Football Match and Referee are related but not similar
      - Thesaurus-based: e.g. WordNet-based
      - Distributional semantics-based: e.g. Wikipedia ESA
          - The more Wikipedia articles two terms occur in together, the more related they are
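
A toy sketch of a distributional relatedness function f: Terms × Terms → [0,1] may make the idea concrete. It follows the general shape of ESA (each term becomes a vector over articles, and relatedness is the cosine of the two vectors), but everything else is an assumption for illustration: the three "articles" in `ARTICLES` are invented, and raw word counts stand in for the TF-IDF weights a real ESA index would use.

```python
import math

# Hypothetical miniature "Wikipedia": article title -> article text.
ARTICLES = {
    "Association football": "football soccer match referee team goal stadium",
    "Referee": "referee match official rules football",
    "Parking": "parking car space street",
}


def concept_vector(term):
    """Map a term to a sparse vector over articles: the weight of an
    article is how often the term's words occur in its text."""
    vector = {}
    for title, text in ARTICLES.items():
        words = text.split()
        count = sum(words.count(word) for word in term.lower().split())
        if count:
            vector[title] = count
    return vector


def relatedness(term1, term2):
    """Cosine similarity of the two concept vectors, clamped to [0, 1]."""
    v1, v2 = concept_vector(term1), concept_vector(term2)
    dot = sum(weight * v2.get(title, 0) for title, weight in v1.items())
    norm1 = math.sqrt(sum(w * w for w in v1.values()))
    norm2 = math.sqrt(sum(w * w for w in v2.values()))
    if norm1 == 0 or norm2 == 0:
        return 0.0  # a term unseen in every article is related to nothing
    return min(1.0, dot / (norm1 * norm2))


print(relatedness("Football Match", "Soccer Match"))
print(relatedness("Football Match", "Parking Space"))
```

With this toy corpus, "Football Match" scores much higher against "Soccer Match" than against "Parking Space", because the first pair shares articles and the second pair shares none.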
  • 14. Proposed Approach Instantiation
      - Event:
          - type: Football Match
          - name: 2010 FIFA World Cup Final
          - referee: Howard Webb
          - team: Spain National Football Team
          - team: Netherlands National Football Team
          - location: Johannesburg, FNB stadium
      - Subscription:
          - Event type "Soccer Match"
          - Event team "Spain"
          - Event place "South Africa"
  • 15. Proposed Approach Instantiation
      - Event properties: type, name, referee, team, location
      - Subscription properties: type, place, team
      [Figure: precision vs. recall curves for the measures Lin, Jiang&Conrath, Leacock&Chodorow, Lesk, Path, Resnik, Gloss Vector, and WuPalmer]
  • 16. Proposed Approach Instantiation
      - Event properties: type, name, referee, team, location
      - Subscription properties: type, place, team
      - Determine top m correspondence candidates: $\mathrm{Rank}_{\mathrm{Sim}_{\text{Jiang\&Conrath}}}(p_s, p_e)$
      - Measure properties relatedness: $f_P = \min\bigl(1,\; m - \mathrm{Rank}_{\mathrm{Sim}_{\text{Jiang\&Conrath}}}(p_s, p_e) + 1\bigr) \cdot \mathrm{WikipediaESA}(p_s, p_e)$
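
The hybrid property measure can be sketched as follows. The parenthesization of the formula on the slide is ambiguous, so this sketch assumes the reading fP = min(1, m − rank + 1) · WikipediaESA(ps, pe), where rank is the position of pe among the event properties ordered by Jiang&Conrath similarity to ps. Clamping the gate at 0 for candidates ranked below m is a further assumption, as are the function name and all scores in the `SIM` and `REL` tables, which are invented stand-ins for the two real measures.

```python
def property_relatedness(p_s, p_e, event_props, sim, rel, m):
    """Hybrid score for a subscription property p_s and event property p_e.

    sim: similarity measure used only for ranking (Jiang&Conrath on the slide)
    rel: relatedness measure used for the score (Wikipedia ESA on the slide)
    m:   number of top-ranked candidates that are kept
    """
    ranked = sorted(event_props, key=lambda p: sim(p_s, p), reverse=True)
    rank = ranked.index(p_e) + 1          # 1-based rank of p_e
    gate = max(0, min(1, m - rank + 1))   # 1 inside the top m, else 0 (assumed clamp)
    return gate * rel(p_s, p_e)


# Hypothetical scores, not taken from the paper.
SIM = {("place", "location"): 0.9, ("place", "name"): 0.4, ("place", "type"): 0.3,
       ("place", "team"): 0.2, ("place", "referee"): 0.1}
REL = {("place", "location"): 0.8, ("place", "name"): 0.35, ("place", "type"): 0.25,
       ("place", "team"): 0.3, ("place", "referee"): 0.2}


def sim(a, b):
    return SIM.get((a, b), 0.0)


def rel(a, b):
    return REL.get((a, b), 0.0)


EVENT_PROPS = ["type", "name", "referee", "team", "location"]

for p_e in EVENT_PROPS:
    print(p_e, property_relatedness("place", p_e, EVENT_PROPS, sim, rel, m=2))
```

With m = 2, only `location` and `name` keep a non-zero score for the subscription property `place`; the similarity measure prunes the candidates and the relatedness measure scores the survivors.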
  • 17. Proposed Approach Instantiation
      - Top 1 mapping (90%): type ↔ type, location ↔ place, team ↔ team
      - Top 2 mapping (40%): type ↔ type, name ↔ place, referee ↔ team
  • 18. Proposed Approach Instantiation
      - Event values: Football Match, Howard Webb, Spain National Football Team, Netherlands National Football Team, Johannesburg, FNB stadium
      - Subscription values: Soccer Match, Spain, South Africa
      - Measure values relatedness: $f_V = \mathrm{WikipediaESA}(v_s, v_e)$
  • 19. Proposed Approach Instantiation
      - Event values: Football Match, Howard Webb, Spain National Football Team, Netherlands National Football Team, Johannesburg, FNB stadium
      - Subscription values: Soccer Match, Spain, South Africa
      - Spain National Football Team ↔ Spain: 95%
      - Netherlands National Football Team ↔ Spain: 30%
  • 20. Proposed Approach Instantiation
      - Event properties: type, name, referee, team, location; subscription properties: type, place, team
      - Event values: Football Match, Howard Webb, Spain National Football Team, Netherlands National Football Team, Johannesburg, FNB stadium; subscription values: Soccer Match, Spain, South Africa
      - Calculate statements relatedness: $f_{STMT} = f_P(p_s, p_e) \cdot f_V(v_s, v_e)$
  • 21. Proposed Approach Instantiation
      - Event properties: type, name, referee, team, location; subscription properties: type, place, team
      - Event values: Football Match, Howard Webb, Spain National Football Team, Netherlands National Football Team, Johannesburg, FNB stadium; subscription values: Soccer Match, Spain, South Africa
      - Determine the correspondent event statement for each subscription statement by max $f_{STMT}$
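
The last two steps (statement relatedness as the product fP · fV, then choosing the event statement with the maximum score) can be sketched as below. The function name and the `F_P` and `F_V` lookup tables are hypothetical stand-ins for the real measures. The 0.95 and 0.30 value scores echo the Spain examples on slide 19; every other number is invented for illustration.

```python
def best_correspondence(sub_stmt, event_stmts, f_p, f_v):
    """Return (score, event_statement) maximizing f_STMT = f_P * f_V
    for one subscription statement (p_s, v_s)."""
    p_s, v_s = sub_stmt
    scored = [(f_p(p_s, p_e) * f_v(v_s, v_e), (p_e, v_e)) for p_e, v_e in event_stmts]
    return max(scored, key=lambda pair: pair[0])


# Hypothetical property scores.
F_P = {("team", "team"): 1.0, ("team", "referee"): 0.3}
# Value scores: the first two echo slide 19, the third is invented.
F_V = {
    ("Spain", "Spain National Football Team"): 0.95,
    ("Spain", "Netherlands National Football Team"): 0.30,
    ("Spain", "Howard Webb"): 0.10,
}


def f_p(p_s, p_e):
    return F_P.get((p_s, p_e), 0.0)


def f_v(v_s, v_e):
    return F_V.get((v_s, v_e), 0.0)


event_statements = [
    ("referee", "Howard Webb"),
    ("team", "Spain National Football Team"),
    ("team", "Netherlands National Football Team"),
]

score, statement = best_correspondence(("team", "Spain"), event_statements, f_p, f_v)
print(score, statement)
```

Here the subscription statement (team, "Spain") is paired with the event statement (team, "Spain National Football Team"). How the per-statement scores are combined into one overall mapping score is not spelled out on the slides, so the sketch stops at the per-statement choice.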
  • 22. Proposed Approach Instantiation
      - Post-matching event processing:
          - Rank within a window
          - Complex Event Processing
          - …
  • 23. Experiments Overview
      - Methodology
          - Prepare an event set that reflects the required semantic heterogeneity (Wikipedia events)
          - Prepare a gold standard set of subscriptions that stress multiple aspects of semantic coupling
          - Validate the suitability of semantic approximation from a precision perspective
          - Use a different event set with the same subscriptions to validate low maintainability cost (Freebase events)
      - Evaluation criteria
          - Average interpolated precision-recall curve on 11 recall points
          - Maximal F1 score over the average curve
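
The two evaluation criteria can be sketched with the standard information-retrieval definitions: interpolated precision at recall level r is the highest precision observed at any recall ≥ r, and F1 = 2PR / (P + R). The function names and the toy ranked list are assumptions for illustration, and the slides do not say how ties or empty result lists were handled in the actual experiments.

```python
def interpolated_curve(ranked_relevance, total_relevant):
    """11-point interpolated precision for one subscription.

    ranked_relevance: booleans, True where the event at that rank is relevant
    total_relevant:   number of relevant events in the gold standard
    """
    points = []  # (recall, precision) after each rank
    hits = 0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        hits += is_relevant
        points.append((hits / total_relevant, hits / rank))
    curve = []
    for i in range(11):
        level = i / 10
        candidates = [p for r, p in points if r >= level]
        curve.append(max(candidates) if candidates else 0.0)
    return curve


def average_curves(curves):
    """Point-wise mean of several 11-point curves."""
    return [sum(values) / len(values) for values in zip(*curves)]


def max_f1(curve):
    """Return (f1, recall, precision) of the best point on an 11-point curve."""
    best = (0.0, 0.0, 0.0)
    for i, precision in enumerate(curve):
        recall = i / 10
        if precision + recall > 0:
            f1 = 2 * precision * recall / (precision + recall)
            best = max(best, (f1, recall, precision))
    return best


# Toy ranked result list: relevant, irrelevant, relevant, irrelevant.
curve = interpolated_curve([True, False, True, False], total_relevant=2)
print(curve)
print(max_f1(average_curves([curve])))
```

In the toy run, precision stays at 1.0 up to recall 0.5 and drops to 2/3 beyond it, so the best F1 is reached at recall 1.0.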
  • 24. Experiment 1 - Wikipedia Events
      Event set statistics:
      - Source: structured Wikipedia infoboxes, DBpedia 31 August 2011
      - Collection: triples directly associated to instances of the dbpedia-owl:Event class
      - Data model: RDF
      - Total # of events: 20,156
      - Total # of distinct event types: 4,950
      - Total # of distinct event properties: 1,459
      - Total # of distinct event values: 500,717
      - Total # of triples: 1,502,599
      - Average # of distinct types per event: 7.42
      - Average # of distinct properties per event: 30.52
      - Average # of distinct values per event: 54.16
      - Average # of triples per event: 64.67
  • 25. Experiment 1 - Wikipedia Events
      Example event types: Football Match, Race, Music Festival, Space Mission, Election, 10th-Century BC Conflicts, Academic Conference, Aviation Accident, …
  • 26. Experiment 1 - Subscription Set
      Manually created gold standard set of subscriptions. Approximation columns are listed as event type / event properties / literals and resources.
      1. Football matches played by Spain in the FNB stadium
         Subscription: event type "Football Match"; event team "Spain national football team"; event stadium "FNB Stadium"
         # of relevant events: 1; # of needed exact rules: 1; approximation: NO / NO / NO
      2. Football matches played in the FNB stadium
         Subscription: event type "Football Match"; event place "FNB Stadium"
         # of relevant events: 2; # of needed exact rules: 2; approximation: NO / YES / NO
      3. Events taking place in Wembley stadium
         Subscription: event type "Event"; event place "Wembley Stadium"
         # of relevant events: 219; # of needed exact rules: 5; approximation: NO / YES / Syntactic
      4. Charity events taking place in Wembley stadium
         Subscription: event type "Charity"; event place "Wembley Stadium"
         # of relevant events: 29; # of needed exact rules: 6; approximation: YES / YES / Semantic + Syntactic
      5. Charity Rock events taking place in Wembley stadium
         Subscription: event type "Charity"; event type "Rock"; event place "Wembley Stadium"
         # of relevant events: 2; # of needed exact rules: 2; approximation: YES / YES / Semantic + Syntactic
      6. Football matches played in the UK
         Subscription: event type "Football Match"; event stadium "United Kingdom"
         # of relevant events: 505; # of needed exact rules: 603; approximation: NO / YES / Background Knowledge
      7. Football matches played by a South American team in Europe
         Subscription: event type "Football Match"; event team "South America"; event stadium "Europe"
         # of relevant events: 20; # of needed exact rules: 123,774; approximation: NO / YES / Background Knowledge
  • 27. Experiment 1 - Subscription Set
      Manually created gold standard set of subscriptions (same table as slide 26), with a callout for subscription 3:
      - Description: Events taking place in Wembley stadium
      - Template: event type "Event"; event place "Wembley Stadium"
      - # of relevant events: 219; # of needed exact rules: 5; approximation: NO / YES / Syntactic
      - SPARQL pattern 1:
          ?event rdf:type dbpedia-owl:Event.
          ?event dbpprop:stadium dbpedia:Wembley_Stadium.
      - SPARQL pattern 2:
          ?event rdf:type dbpedia-owl:Event.
          ?event dbpedia-owl:location dbpedia:Wembley_Stadium.
      - …
  • 28. Experiment 1 - Results
      - Need for a hybrid matcher that combines both
      [Figure: precision vs. recall for "Events taking place in Wembley stadium", Jiang&Conrath vs. Wikipedia ESA]
      [Figure: precision vs. recall for "Football matches played in the UK", Jiang&Conrath vs. Wikipedia ESA]
      [Figure: frequency (0% to 45%) of semantic similarity or relatedness scores on a log scale (0, 2^-25 … 1), Jiang&Conrath vs. Wikipedia ESA]
  • 29. Experiment 1 - Results
      - Hybrid matcher outperforms a single similarity or relatedness measure matcher.
      Matcher: maximal F1 score / recall / precision
      - Jiang&Conrath: 70.06% / 80% / 62.31%
      - Wikipedia ESA: 44.26% / 80% / 30.59%
      - Hybrid: 75.45% / 90% / 64.94%
      [Figure: precision vs. recall for Jiang&Conrath, Wikipedia ESA, and Hybrid]
  • 30. Experiment 2 - Freebase Event Set
      Event set statistics:
      - Source: Freebase events dump 1 December 2011, triples current
      - Collection: triples directly associated to instances of the "fbase:time.event" class
      - Data model: RDF
      - Total # of events: 84,529
      - Total # of distinct event types: 858
      - Total # of distinct event properties: 1,242
      - Total # of distinct event values: 1,199,627
      - Total # of triples: 1,859,338
      - Average # of distinct types per event: 3.33
      - Average # of distinct properties per event: 10.67
      - Average # of distinct values per event: 21.66
      - Average # of triples per event: 21.99
  • 31. Experiment 2 - Subscription Set
      Same subscriptions as in Experiment 1. Approximation columns are listed as event type / event properties / literals and resources.
      1. Football matches played by Spain in the FNB stadium
         Subscription: event type "Football Match"; event team "Spain national football team"; event stadium "FNB Stadium"
         # of relevant events: 1; # of needed exact rules: 1; approximation: YES / YES / NO
      2. Football matches played in the FNB stadium
         Subscription: event type "Football Match"; event place "FNB Stadium"
         # of relevant events: 8; # of needed exact rules: 2; approximation: YES / YES / NO
      3. Events taking place in Wembley stadium
         Subscription: event type "Event"; event place "Wembley Stadium"
         # of relevant events: 29; # of needed exact rules: 5; approximation: NO / YES / NO
      4. Charity events taking place in Wembley stadium
         Subscription: event type "Charity"; event place "Wembley Stadium"
         # of relevant events: 0; # of needed exact rules: -; approximation: - / - / -
      5. Charity Rock events taking place in Wembley stadium
         Subscription: event type "Charity"; event type "Rock"; event place "Wembley Stadium"
         # of relevant events: 0; # of needed exact rules: -; approximation: - / - / -
      6. Football matches played in the UK
         Subscription: event type "Football Match"; event stadium "United Kingdom"
         # of relevant events: 34; # of needed exact rules: 1,398; approximation: YES / YES / Background Knowledge
      7. Football matches played by a South American team in Europe
         Subscription: event type "Football Match"; event team "South America"; event stadium "Europe"
         # of relevant events: 2; # of needed exact rules: 219,600; approximation: YES / YES / Background Knowledge
  • 32. Experiment 2 - Results
      - Hybrid matcher gives similar results in Freebase as in DBpedia
      Matcher: maximal F1 score / recall / precision
      - Jiang&Conrath: 44.60% / 60% / 35.49%
      - Wikipedia ESA: 70.73% / 80% / 63.39%
      - Hybrid: 76.33% / 80% / 72.98%
      [Figure: precision vs. recall for Jiang&Conrath, Wikipedia ESA, and Hybrid]
  • 33. Conclusions
      - Approximate semantic matcher addresses subscriptions/rules maintainability cost in heterogeneous and dynamic environments
      - Approximate semantic matcher is suitable when less than 100% precision is acceptable
          - Exact matcher: 345,000 required subscriptions; maximal F1 score 100%
          - Approximate semantic matcher: 7 required subscriptions; maximal F1 score 75.89%
      - A hybrid matcher outperforms a single similarity or relatedness measure matcher.
  • 34. Future Work
      - Need to enhance the subscription set to make it more representative.
      - The approximate semantic matcher generates "uncertain" results, whose impact on further event processing functions such as CEP needs to be studied.