The Impact of
       Data Caching on
Query Execution for Linked Data

                                          Olaf Hartig
                          http://olafhartig.de/foaf.rdf#olaf
                                                @olafhartig

    Database and Information Systems Research Group
                       Humboldt-Universität zu Berlin
Can we query the Web of Data
 as if it were a single,
 giant database?


                    SELECT DISTINCT ?i ?label
                    WHERE {
                      ?prof rdf:type <http://res ... data/dbprofs#DBProfessor> ;
                            foaf:topic_interest ?i .
                      OPTIONAL {
                        ?i rdfs:label ?label
                        FILTER( LANG(?label)="en" || LANG(?label)="")
                      }
                    }
                    ORDER BY ?label

 Our approach: Link Traversal Based Query Execution
                                                 [ISWC'09]
Olaf Hartig - The Impact of Data Caching on Query Execution for Linked Data       2
Main Idea
 ●   Intertwine query evaluation with traversal of data links
 ●   We alternate between:
     ●   Evaluate parts of the query (triple patterns)
         on a continuously augmented set of data
     ●   Look up URIs in intermediate
         solutions and add retrieved data
         to the query-local dataset




 [Figure: the query-local dataset]
 Example query (graph pattern shown in the figure):
     <http://bob.name>  knows    ?acq .
     ?acq               project  ?prj .
     ?prj               name     ?prjName .
 ●   Look up the URI http://bob.name from the query; the response is
     labelled “Descriptor object” in the figure. The retrieved data is
     added to the query-local dataset, which now contains:
         http://bob.name  knows  http://alice.name
 ●   Evaluating the triple pattern (http://bob.name knows ?acq) on the
     query-local dataset yields the intermediate solution:
         ?acq = http://alice.name
 ●   Look up the URI http://alice.name from the intermediate solution;
     the retrieved data augments the query-local dataset with:
         http://alice.name  project  http://.../AlicesPrj
 ●   Evaluating the triple pattern (?acq project ?prj) on the augmented
     dataset yields the intermediate solution:
         ?acq = http://alice.name,  ?prj = http://.../AlicesPrj
 ●   Evaluating the triple pattern (?prj name ?prjName) yields:
         ?prj = http://.../AlicesPrj,  ?prjName = “…”
 ●   The resulting solution for the whole query:
         ?acq                ?prj                   ?prjName
         http://alice.name   http://.../AlicesPrj   “…”
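The alternation described under Main Idea (evaluate a triple pattern, then look up the URIs found in the intermediate solutions) can be sketched in Python. This is a minimal sketch under stated assumptions, not the author's implementation: the `WEB` dictionary is a hypothetical stand-in for HTTP lookups, the project name literal is invented, `http://.../AlicesPrj` is kept elided as in the slides, and the triple patterns are simply evaluated in the order given.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]

# Hypothetical stand-in for the Web: URI -> triples returned by a lookup.
# "http://.../AlicesPrj" is kept elided as in the slides; the literal is invented.
WEB: Dict[str, Set[Triple]] = {
    "http://bob.name": {("http://bob.name", "knows", "http://alice.name")},
    "http://alice.name": {("http://alice.name", "project", "http://.../AlicesPrj")},
    "http://.../AlicesPrj": {("http://.../AlicesPrj", "name", "Alice's project")},
}

QUERY: List[Triple] = [
    ("http://bob.name", "knows", "?acq"),
    ("?acq", "project", "?prj"),
    ("?prj", "name", "?prjName"),
]


def is_uri(term: str) -> bool:
    return term.startswith("http")


def look_up(uri: str, dataset: Set[Triple], seen: Set[str]) -> None:
    """Look up a URI at most once and add the retrieved triples to the dataset."""
    if uri not in seen:
        seen.add(uri)
        dataset.update(WEB.get(uri, set()))


def match(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Return the binding extended so that pattern matches triple, else None."""
    extended = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if extended.setdefault(p, t) != t:
                return None  # variable already bound to a different term
        elif p != t:
            return None  # constant in the pattern does not match
    return extended


def execute(patterns: List[Triple]) -> List[Binding]:
    dataset: Set[Triple] = set()  # the query-local dataset
    seen: Set[str] = set()
    # Starting point: look up the URIs mentioned in the query itself.
    for pattern in patterns:
        for term in pattern:
            if is_uri(term):
                look_up(term, dataset, seen)
    solutions: List[Binding] = [{}]
    for pattern in patterns:
        # Evaluate one triple pattern on the current query-local dataset.
        extended_solutions: List[Binding] = []
        for binding in solutions:
            for triple in dataset:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    extended_solutions.append(extended)
        # Look up URIs in the intermediate solutions; this augments the dataset.
        for binding in extended_solutions:
            for value in binding.values():
                if is_uri(value):
                    look_up(value, dataset, seen)
        solutions = extended_solutions
    return solutions


print(execute(QUERY))
```

With this toy data the call returns a single solution that binds `?acq`, `?prj` and `?prjName`, matching the final table of the example.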
Characteristics
 ●   Link traversal based query execution:
     ●   Evaluation on a continuously augmented dataset
     ●   Discovery of potentially relevant data during execution
     ●   Discovery driven by intermediate solutions

 ●   Main advantage:
     ●   No need to know all data sources in advance

 ●   Limitations:
     ●   Query has to contain a URI as a starting point
     ●   Ignores data that is not reachable* by the query execution

 *   Formal definition in [LDOW'11a]
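The first limitation can be illustrated with a small Python check. It is only a sketch: the helper name `seed_uris` is hypothetical, and treating every term that starts with `http` as a URI is a simplifying assumption.

```python
from typing import List, Tuple

TriplePattern = Tuple[str, str, str]


def seed_uris(patterns: List[TriplePattern]) -> List[str]:
    """URIs mentioned in the triple patterns, in order of first appearance."""
    seeds: List[str] = []
    for pattern in patterns:
        for term in pattern:
            if term.startswith("http") and term not in seeds:
                seeds.append(term)
    return seeds


with_seed = [("http://bob.name", "knows", "?acq"), ("?acq", "project", "?prj")]
without_seed = [("?x", "knows", "?y")]

print(seed_uris(with_seed))     # ['http://bob.name']
print(seed_uris(without_seed))  # []
```

A query such as `without_seed` offers no URI to look up, so link traversal has nowhere to start.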
The Issue
 Query (graph pattern shown in the figure):
     <http://bob.name>  knows     ?acq .
     ?acq               interest  ?i .
     ?i                 label     ?iLabel .
 ●   Execution starts with a lookup of http://bob.name, the same URI
     the earlier example query started from; the retrieved data adds
     to the query-local dataset:
         http://bob.name  knows  http://alice.name
 ●   Solutions are then built for the variables ?acq, ?i and ?iLabel.
The Issue
 Query
                 ?acq interest
                                              ?i
               s
            ow




                                          label
          kn




    http://bob.name
                                       ?iLabel

                                                                                 query-local
                                                                                  dataset


                                         Query
       http://bob.name
                                     ?prjName
           s
        ow




                                           me
      kn




                                        na




   ?acq                                                                       query-local
               project            ?prj
                                                                               dataset

Olaf Hartig - The Impact of Data Caching on Query Execution for Linked Data                    27
Reusing the Query-Local Dataset
 [Figure: the same two queries as before. The query-local dataset populated by the first query
  (containing the "knows" link between http://bob.name and http://alice.name) is kept and
  reused for the second query, where ?acq is shown bound to http://alice.name.]

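The reuse idea can be sketched as a small in-memory store that is shared across queries. This is only an illustration under assumptions: the class name `QueryLocalDataset`, the `fetch` callable, and the toy `WEB` dictionary (including the invented "project" triple) are hypothetical and not taken from the slides.

```python
class QueryLocalDataset:
    """Triples grouped by the URI whose lookup produced them (hypothetical sketch)."""

    def __init__(self, fetch):
        self._fetch = fetch      # assumed callable: URI -> iterable of (s, p, o) triples
        self._descriptors = {}   # URI -> frozenset of triples retrieved for that URI
        self.lookups = 0
        self.hits = 0

    def lookup(self, uri):
        """Return the triples for `uri`, retrieving them only on a cache miss."""
        self.lookups += 1
        if uri in self._descriptors:
            self.hits += 1
        else:
            self._descriptors[uri] = frozenset(self._fetch(uri))
        return self._descriptors[uri]

    def triples(self):
        """All triples currently available for query evaluation."""
        return set().union(*self._descriptors.values())

    def hit_rate(self):
        return self.hits / self.lookups if self.lookups else 0.0


# Toy stand-in for the Web of Data (invented example data).
WEB = {
    "http://bob.name": [("http://bob.name", "knows", "http://alice.name")],
    "http://alice.name": [("http://alice.name", "project", "http://example.org/prj")],
}


def fetch(uri):
    return WEB.get(uri, [])


shared = QueryLocalDataset(fetch)
shared.lookup("http://bob.name")  # first query: miss, data is retrieved
shared.lookup("http://bob.name")  # second query: hit, data is reused
```

With a fresh dataset per query, both lookups would be misses; sharing the dataset turns the second one into a hit and makes the previously retrieved triples available to the second query.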
Hypothesis




        Re-using the query-local dataset (a.k.a. data caching)
                            may benefit
             query performance + result completeness




Contributions
 ●   Systematic analysis of the impact of data caching
     ●   Theoretical foundation*
     ●   Conceptual analysis*
     ●   Empirical evaluation of the potential impact
 * see [LDOW'11a]




 ●   Out of scope: Caching strategies (replacement, invalidation)


Experiment – Scenario



                                                                              ●   Information about the
                                                                                  distributed social
                                                                                  network of FOAF
                                                                                  profiles
                                                                              ●   5 types of queries
                                                                              ●   Experiment Setup:
                                                                                  ●   20 persons
                                                                                  ●   Sequential use
                                                                                  ➔   100 queries
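A minimal sketch of how such a sequence could be assembled, assuming the queries are grouped per person and follow the template order suggested by the labels used in the result charts (ContactInfoDanBri is Query No. 61, IncomingDanBri is Query No. 65). The placeholder person names and the position of "DanBri" as the 13th person are assumptions made for illustration.

```python
# Template order inferred from the query labels (Query No. 61 to 65); an assumption.
TEMPLATES = ["ContactInfo", "UnsetProps", "2ndDegree1", "2ndDegree2", "Incoming"]


def build_sequence(persons):
    """Return a list of (query number, query label), five queries per person."""
    sequence = []
    for person in persons:
        for template in TEMPLATES:
            sequence.append((len(sequence) + 1, template + person))
    return sequence


# Hypothetical person identifiers; only "DanBri" appears in the slides.
persons = [f"Person{i:02d}" for i in range(1, 21)]
persons[12] = "DanBri"  # assumed to be the 13th person, which yields No. 61 to 65

sequence = build_sequence(persons)
```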

Experiment – Single Query
 ●   no reuse experiment:
     ●   No data caching
 ●   reuse per query experiment:
     ●   Reuse of query-local dataset for 3 executions of each query
     ●   Third execution measured
 [Figure: two bar charts comparing "no reuse" and "reuse per query" for the queries
  ContactInfoDanBri (Query No. 61), UnsetPropsDanBri (Query No. 62),
  2ndDegree1DanBri (Query No. 63), 2ndDegree2DanBri (Query No. 64) and
  IncomingDanBri (Query No. 65).
  Left chart: number of query results (axis from 0 to 60).
  Right chart: query execution time in seconds (logarithmic axis from 0.01 to 100).]

Experiment – Complete Sequence
 ●   reuse all queries experiment:
     ●   Reuse of the query-local dataset for the complete
         sequence of all 100 queries
 [Figure: two bar charts comparing "no reuse", "reuse per query" and "reuse all queries"
  for the queries ContactInfoDanBri (Query No. 61), UnsetPropsDanBri (Query No. 62),
  2ndDegree1DanBri (Query No. 63), 2ndDegree2DanBri (Query No. 64) and
  IncomingDanBri (Query No. 65).
  Left chart: number of query results (axis from 0 to 60).
  Right chart: query execution time in seconds (logarithmic axis from 0.01 to 100).]

Outlook



 ●   Requirements of a data cache:
     ●   Replacement mechanism
     ●   Coherency mechanism




Cache Replacement
 ●   Cache full → remove descriptor objects
 ●   Replacement strategy
     ●   Primary goal: maximize hit rate
     ●   Recency-based
     ●   Frequency-based
     ●   Function-based
     ●   Randomized
 ●   Replacement process
     ●   Watermarks: high and low
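
A recency-based strategy combined with high and low watermarks could look roughly like the following sketch. It assumes that cache size is counted in descriptor objects and that least-recently-used eviction is the chosen strategy; the class `WatermarkLRUCache` is hypothetical and only one of the strategies listed above.

```python
from collections import OrderedDict


class WatermarkLRUCache:
    """LRU cache that evicts down to the low watermark once the high one is exceeded."""

    def __init__(self, high_watermark, low_watermark):
        if not 0 <= low_watermark < high_watermark:
            raise ValueError("require 0 <= low_watermark < high_watermark")
        self.high = high_watermark
        self.low = low_watermark
        self._items = OrderedDict()  # URI -> descriptor object, oldest first

    def get(self, uri):
        """Return the cached descriptor and mark it as recently used, or None."""
        if uri not in self._items:
            return None
        self._items.move_to_end(uri)
        return self._items[uri]

    def put(self, uri, descriptor):
        self._items[uri] = descriptor
        self._items.move_to_end(uri)
        if len(self._items) > self.high:
            # Replacement process: remove least recently used entries in one batch.
            while len(self._items) > self.low:
                self._items.popitem(last=False)

    def __len__(self):
        return len(self._items)

    def __contains__(self, uri):
        return uri in self._items


cache = WatermarkLRUCache(high_watermark=4, low_watermark=2)
for uri in ["a", "b", "c", "d"]:
    cache.put(uri, object())
cache.get("a")            # "a" becomes the most recently used entry
cache.put("e", object())  # exceeds the high watermark, evicts "b", "c" and "d"
```

Evicting in batches down to the low watermark avoids running the replacement process on every single insertion once the cache is full.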



Studying Cache Replacement?

                      “Web cache replacement in its general
                       form seems to be a solved topic.”
                                           S. Podlipnig and L. Böszörményi: Survey of
                                           Web Cache Replacement Strategies, 2003

 ●   6 quad indexes* in main memory
     ●   Size grows linearly with the number of quads
 ●   Example (after the reuse all queries experiment, 100 queries):
     ●   905 descriptor objects with 745,756 triples overall
     ●   ca. 103 MB
 ➔   Available main memory is hardly a limiting factor
 * as introduced in [LDOW'11b]
Cache Coherency
 ●   Data items in the cache may become inconsistent
 ●   Strong cache consistency
     ●   Server validation
     ●   Client validation
 ●   Weak cache consistency
     ●   Time to live (TTL)
     ●   Adaptive TTL




Client Validation
 ●   Polling every time
 ●   Enables strong cache consistency
 ●   Conditional GET
     ●   Request with If-Modified-Since header
     ●   Possible response: 304 Not Modified
 ●   Not supported by most Linked Data servers
     ●   Experiment based on the CKAN catalog of linked datasets
     ●   Supported for only 41 out of 154 example resources (26.6%) from 110 datasets

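A conditional GET of this kind can be sketched with Python's standard library. The function name `conditional_get`, the `opener` parameter (there to make the sketch testable without network access) and the `Accept` value are assumptions for illustration; `urllib` reports a 304 response as an `HTTPError`, which the sketch treats as "cached copy still valid".

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def conditional_get(url, cached_body, last_retrieved, opener=urlopen):
    """Revalidate a cached object; return (body, was_modified).

    `last_retrieved` must be a timezone-aware datetime in UTC.
    """
    request = Request(url, headers={
        "Accept": "application/rdf+xml",  # assumed content type for the sketch
        "If-Modified-Since": format_datetime(last_retrieved, usegmt=True),
    })
    try:
        with opener(request) as response:
            # 200 OK: the server sent a new representation.
            return response.read(), True
    except HTTPError as error:
        if error.code == 304:
            # 304 Not Modified: keep using the cached copy.
            return cached_body, False
        raise


# Usage (performs a real HTTP request, therefore left commented out):
# body, modified = conditional_get(
#     "http://olafhartig.de/foaf.rdf", cached_body=b"...",
#     last_retrieved=datetime(2011, 3, 29, tzinfo=timezone.utc))
```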
Time to Live (TTL)
 ●   TTL field: lifetime estimate for each object
 ●   Supported by HTTP response headers:
     ●   Expires (37.0% of the tested resources)
     ●   Cache-Control: max-age (37.7%)
 ●   When the TTL elapses, the object is invalid
 ●   Accessing an invalid object → retrieve the object again
     ●   Conditional GET (26.6%)
 ●   Alternative (due to lack of support in Linked Data servers):
     ●   Assume a default TTL for each object
     ●   Ordinary GET

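Deriving the expiry time from the response headers, with a default TTL as the fallback, might look like the sketch below. The function name `expiry_time`, the one-hour `DEFAULT_TTL` and the plain dictionary of headers with canonical names are assumptions; the slides do not state a default value.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

DEFAULT_TTL = timedelta(hours=1)  # assumed default, not taken from the slides


def expiry_time(headers, retrieved_at, default_ttl=DEFAULT_TTL):
    """Return the point in time at which a cached object becomes invalid."""
    # Cache-Control: max-age takes precedence over Expires in HTTP.
    match = re.search(r"max-age\s*=\s*(\d+)", headers.get("Cache-Control", ""))
    if match:
        return retrieved_at + timedelta(seconds=int(match.group(1)))

    expires = headers.get("Expires")
    if expires is not None:
        try:
            return parsedate_to_datetime(expires)
        except (TypeError, ValueError):
            # HTTP treats an unparsable Expires value as already expired.
            return retrieved_at

    # No TTL information from the server: assume a default TTL.
    return retrieved_at + default_ttl


def is_valid(expires_at, now):
    return now < expires_at
```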
Adaptive TTL
 ●   Assumption:
     ●   The older an object, the less likely it is to be modified
 ●   TTL is a percentage of the age:
     ●   Threshold = 10% ; age = 30 days → TTL = 3 days
     ●   Last verification: yesterday → invalidation in 2 days
 ●   HTTP-based implementation (35.1%):
     ●   Calculation of age: use Last-Modified response header
     ●   Verification with conditional GET
 ●   Alternative (due to lack of support in Linked Data servers):
     ●   Assume Last-Modified is time of first retrieval
     ●   Verification by comparing a response to the current version
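
The arithmetic of the adaptive TTL example can be reproduced with a few lines. The function names and the concrete dates are illustrative assumptions; only the 10% threshold, the 30-day age and the resulting 3-day TTL come from the slide.

```python
from datetime import datetime, timedelta


def adaptive_ttl(last_modified, verified_at, threshold=0.10):
    """TTL as a fraction (threshold) of the object's age at verification time."""
    age = verified_at - last_modified
    return age * threshold


def invalidation_time(last_modified, verified_at, threshold=0.10):
    return verified_at + adaptive_ttl(last_modified, verified_at, threshold)


# Object last modified 30 days before its last verification, verified yesterday.
last_modified = datetime(2011, 3, 1)
verified_at = last_modified + timedelta(days=30)
now = verified_at + timedelta(days=1)

ttl = adaptive_ttl(last_modified, verified_at)                      # about 3 days
remaining = invalidation_time(last_modified, verified_at) - now    # about 2 days
```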
Summary
 ●   Systematic analysis of the impact of data caching
     ●   Theoretical foundation
     ●   Conceptual analysis
     ●   Empirical evaluation
 ●   Main findings:
     ●   Additional results possible (for semantically similar queries)
     ●   Impact on performance may be positive but also negative
 ●   Future work:
     ●   Analysis of caching strategies in our context
     ●   Main issue: invalidation


Backup Slides




Contributions
 ●   Theoretical foundation (extension of the original definition)
     ●   Reachability by a $D_{seed}$-initialized execution of a BGP query $b$
     ●   $D_{seed}$-dependent solution for a BGP query $b$
     ●   Reachability $R(B)$ for a serial execution of $B = b_1, \ldots, b_n$
     ➔   Each solution for $b_{cur}$ is also an $R(B)$-dependent solution for $b_{cur}$
 ●   Conceptual analysis of the impact of data caching
     ●   Performance factor: $p(b_{cur}, B) = c(b_{cur}, [\,]) - c(b_{cur}, B)$
     ●   Serendipity factor: $s(b_{cur}, B) = b(b_{cur}, B) - b(b_{cur}, [\,])$
 ●   Empirical verification of the potential impact

 ●   Out of scope: Caching strategies (replacement, invalidation)
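
As a rough illustration of the two factors, the sketch below plugs in the averaged values from the result table on the "Experiment – Complete Sequence" backup slide ("no reuse" versus "reuse all queries"). It assumes that the cost c is the query execution time in seconds and the benefit b is the number of query results; this section does not define c and b, and the factors are defined per query, so applying them to averages is only indicative.

```python
def performance_factor(cost_without_cache, cost_with_cache):
    """p(b_cur, B) = c(b_cur, []) - c(b_cur, B); positive means the cache saved cost."""
    return cost_without_cache - cost_with_cache


def serendipity_factor(benefit_with_cache, benefit_without_cache):
    """s(b_cur, B) = b(b_cur, B) - b(b_cur, []); positive means additional results."""
    return benefit_with_cache - benefit_without_cache


# Averages over all 100 queries: no reuse vs. reuse all queries.
p = performance_factor(cost_without_cache=64.95, cost_with_cache=37.91)       # seconds
s = serendipity_factor(benefit_with_cache=44.87, benefit_without_cache=27.37)  # results
```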
Query Template Contact
 # <PERSON> is the template placeholder for a person URI.
 PREFIX foaf: <http://xmlns.com/foaf/0.1/>

 SELECT * WHERE {
   <PERSON> foaf:knows ?p .

   OPTIONAL { ?p foaf:name ?name }
   OPTIONAL { ?p foaf:firstName ?firstName }
   OPTIONAL { ?p foaf:givenName ?givenName }
   OPTIONAL { ?p foaf:givenname ?givenname }
   OPTIONAL { ?p foaf:familyName ?familyName }
   OPTIONAL { ?p foaf:family_name ?family_name }
   OPTIONAL { ?p foaf:lastName ?lastName }
   OPTIONAL { ?p foaf:surname ?surname }

   OPTIONAL { ?p foaf:birthday ?birthday }

   OPTIONAL { ?p foaf:img ?img }

   OPTIONAL { ?p foaf:phone ?phone }
   OPTIONAL { ?p foaf:aimChatID ?aimChatID }
   OPTIONAL { ?p foaf:icqChatID ?icqChatID }
   OPTIONAL { ?p foaf:jabberID ?jabberID }
   OPTIONAL { ?p foaf:msnChatID ?msnChatID }
   OPTIONAL { ?p foaf:skypeID ?skypeID }
   OPTIONAL { ?p foaf:yahooChatID ?yahooChatID }
 }

Query Template UnsetProps
 PREFIX foaf: <http://xmlns.com/foaf/0.1/>
 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
 PREFIX vs:   <http://www.w3.org/2003/06/sw-vocab-status/ns#>

 SELECT DISTINCT ?result ?resultLabel WHERE
 {
   ?result rdfs:isDefinedBy <http://xmlns.com/foaf/0.1/> .
   ?result rdfs:domain foaf:Person .

   OPTIONAL { <PERSON> ?result ?var0 }
   FILTER ( !bound(?var0) )

   <PERSON> foaf:knows ?var2 .
   ?var2 ?result ?var3 .
   ?result rdfs:label ?resultLabel .
   ?result vs:term_status ?var1 .
 }
 ORDER BY ?var1

Query Template Incoming
 PREFIX foaf: <http://xmlns.com/foaf/0.1/>

 SELECT DISTINCT ?result WHERE
 {
   ?result foaf:knows <PERSON> .

   OPTIONAL
   {
     ?result foaf:knows ?var1 .
     FILTER ( <PERSON> = ?var1 )
     <PERSON> foaf:knows ?result .
   }
   FILTER ( !bound(?var1) )
 }

Query Template 2ndDegree1
 PREFIX foaf: <http://xmlns.com/foaf/0.1/>

 SELECT DISTINCT ?result WHERE
 {
   <PERSON> foaf:knows ?p1 .
   <PERSON> foaf:knows ?p2 .
   FILTER ( ?p1 != ?p2 )

   ?p1 foaf:knows ?result .
   FILTER ( <PERSON> != ?result )
   ?p2 foaf:knows ?result .

   OPTIONAL {
     <PERSON> ?knows ?result .
     FILTER ( ?knows = foaf:knows )
   }
   FILTER ( !bound(?knows) )
 }

Query Template 2ndDegree2
 PREFIX foaf: <http://xmlns.com/foaf/0.1/>

 SELECT DISTINCT ?result WHERE
 {
   <PERSON> foaf:knows ?p1 .
   <PERSON> foaf:knows ?p2 .
   FILTER ( ?p1 != ?p2 )

   ?result foaf:knows ?p1 .
   FILTER ( <PERSON> != ?result )
   ?result foaf:knows ?p2 .

   OPTIONAL {
     <PERSON> ?knows ?result .
     FILTER ( ?knows = foaf:knows )
   }
   FILTER ( !bound(?knows) )
 }

Experiment – Single Query
 Experiment           Avg.¹ number of query   Avg.¹ hit rate   Avg.¹ query execution
                      results (std. dev.)     (std. dev.)      time (std. dev.)
 no reuse             27.37 (140.49)          0.849 (0.205)    64.95 s (124.50)
 reuse per query      26.71 (148.77)          1 (0)            0.02 s (0.07)
 ¹ Averaged over all 100 queries

 ●   In the ideal case for $B_{upper} = [\, b_{cur}, b_{cur} \,]$:
     ●   $p_{upper}(b_{cur}, B_{upper}) = c(b_{cur}, [\,]) - c(b_{cur}, B_{upper}) = c(b_{cur}, [\,])$
     ●   $s_{upper}(b_{cur}, B_{upper}) = b(b_{cur}, B_{upper}) - b(b_{cur}, [\,]) = 0$
 ●   Summary (measurement errors aside):
     ●   Same number of query results
     ●   Significant improvements in query performance

Experiment – Complete Sequence
 Experiment                          Avg.¹ number of query   Avg.¹ hit rate   Avg.¹ query execution
                                     results (std. dev.)     (std. dev.)      time (std. dev.)
 no reuse                            27.37 (140.49)          0.849 (0.205)    64.95 s (124.50)
 reuse per query                     26.71 (148.77)          1 (0)            0.02 s (0.07)
 reuse all queries                   44.87 (178.36)          0.991 (0.053)    37.91 s (112.94)
 reuse all queries (random orders)   118.18 (867.07)         0.992 (0.016)    20.61 s (216.61)
 ¹ Averaged over all 100 queries

 ●   Summary:
     ●   Data cache may provide for additional query results
     ●   Impact on performance may be positive but also negative
 ●   Executing the query sequence in a random order results in
     measurements similar to the given order.

These slides have been created by
                                       Olaf Hartig

                                              http://olafhartig.de


                     This work is licensed under a
       Creative Commons Attribution-Share Alike 3.0 License
           (http://creativecommons.org/licenses/by-sa/3.0/)




Comparative analysis of relative and exact search for web information retrievaleSAT Journals
 
Combining IR with Relevance Feedback for Concept Location
Combining IR with Relevance Feedback for Concept LocationCombining IR with Relevance Feedback for Concept Location
Combining IR with Relevance Feedback for Concept LocationSonia Haiduc
 
IRJET- Information Retrieval and De-duplication for Tourism Recommender System
IRJET- Information Retrieval and De-duplication for Tourism Recommender SystemIRJET- Information Retrieval and De-duplication for Tourism Recommender System
IRJET- Information Retrieval and De-duplication for Tourism Recommender SystemIRJET Journal
 
Directed versus undirected network analysis of student essays
Directed versus undirected network analysis of student essaysDirected versus undirected network analysis of student essays
Directed versus undirected network analysis of student essaysRoy Clariana
 

La actualidad más candente (18)

Topic detecton by clustering and text mining
Topic detecton by clustering and text miningTopic detecton by clustering and text mining
Topic detecton by clustering and text mining
 
A Scalable Approach for Efficiently Generating Structured Dataset Topic Profiles
A Scalable Approach for Efficiently Generating Structured Dataset Topic ProfilesA Scalable Approach for Efficiently Generating Structured Dataset Topic Profiles
A Scalable Approach for Efficiently Generating Structured Dataset Topic Profiles
 
Ir
IrIr
Ir
 
TEXT CLUSTERING USING INCREMENTAL FREQUENT PATTERN MINING APPROACH
TEXT CLUSTERING USING INCREMENTAL FREQUENT PATTERN MINING APPROACHTEXT CLUSTERING USING INCREMENTAL FREQUENT PATTERN MINING APPROACH
TEXT CLUSTERING USING INCREMENTAL FREQUENT PATTERN MINING APPROACH
 
EDF2012 Peter Boncz - LOD benchmarking SRbench
EDF2012   Peter Boncz - LOD benchmarking SRbenchEDF2012   Peter Boncz - LOD benchmarking SRbench
EDF2012 Peter Boncz - LOD benchmarking SRbench
 
Z04506138145
Z04506138145Z04506138145
Z04506138145
 
ICDE 2015 - LDV: Light-weight Database Virtualization
ICDE 2015 - LDV: Light-weight Database VirtualizationICDE 2015 - LDV: Light-weight Database Virtualization
ICDE 2015 - LDV: Light-weight Database Virtualization
 
BoTLRet: A Template-based Linked Data Information Retrieval
 BoTLRet: A Template-based Linked Data Information Retrieval BoTLRet: A Template-based Linked Data Information Retrieval
BoTLRet: A Template-based Linked Data Information Retrieval
 
Latest trends in AI and information Retrieval
Latest trends in AI and information Retrieval Latest trends in AI and information Retrieval
Latest trends in AI and information Retrieval
 
Predicting query performance and explaining results to assist Linked Data con...
Predicting query performance and explaining results to assist Linked Data con...Predicting query performance and explaining results to assist Linked Data con...
Predicting query performance and explaining results to assist Linked Data con...
 
Fasta
FastaFasta
Fasta
 
Profile-based Dataset Recommendation for RDF Data Linking
Profile-based Dataset Recommendation for RDF Data Linking  Profile-based Dataset Recommendation for RDF Data Linking
Profile-based Dataset Recommendation for RDF Data Linking
 
Comparative analysis of relative and exact search for web information retrieval
Comparative analysis of relative and exact search for web information retrievalComparative analysis of relative and exact search for web information retrieval
Comparative analysis of relative and exact search for web information retrieval
 
Sub1579
Sub1579Sub1579
Sub1579
 
Combining IR with Relevance Feedback for Concept Location
Combining IR with Relevance Feedback for Concept LocationCombining IR with Relevance Feedback for Concept Location
Combining IR with Relevance Feedback for Concept Location
 
IRJET- Information Retrieval and De-duplication for Tourism Recommender System
IRJET- Information Retrieval and De-duplication for Tourism Recommender SystemIRJET- Information Retrieval and De-duplication for Tourism Recommender System
IRJET- Information Retrieval and De-duplication for Tourism Recommender System
 
Blast 2013 1
Blast 2013 1Blast 2013 1
Blast 2013 1
 
Directed versus undirected network analysis of student essays
Directed versus undirected network analysis of student essaysDirected versus undirected network analysis of student essays
Directed versus undirected network analysis of student essays
 

Similar a The Impact of Data Caching of on Query Execution for Linked Data

Fedbench - A Benchmark Suite for Federated Semantic Data Processing
Fedbench - A Benchmark Suite for Federated Semantic Data ProcessingFedbench - A Benchmark Suite for Federated Semantic Data Processing
Fedbench - A Benchmark Suite for Federated Semantic Data ProcessingPeter Haase
 
Seaform Slides in VLDB 2010 PhD Workshop
Seaform Slides in VLDB 2010 PhD WorkshopSeaform Slides in VLDB 2010 PhD Workshop
Seaform Slides in VLDB 2010 PhD WorkshopHao Wu
 
LODOP - Multi-Query Optimization for Linked Data Profiling Queries
LODOP - Multi-Query Optimization for Linked Data Profiling QueriesLODOP - Multi-Query Optimization for Linked Data Profiling Queries
LODOP - Multi-Query Optimization for Linked Data Profiling QueriesAnja Jentzsch
 
Machine Learned Relevance at A Large Scale Search Engine
Machine Learned Relevance at A Large Scale Search EngineMachine Learned Relevance at A Large Scale Search Engine
Machine Learned Relevance at A Large Scale Search EngineSalford Systems
 
Panda Provenance
Panda ProvenancePanda Provenance
Panda ProvenanceVlad Vega
 
Linked Data Query Processing Strategies
Linked Data Query Processing StrategiesLinked Data Query Processing Strategies
Linked Data Query Processing StrategiesThanh Tran
 
Metabolomic Data Analysis Workshop and Tutorials (2014)
Metabolomic Data Analysis Workshop and Tutorials (2014)Metabolomic Data Analysis Workshop and Tutorials (2014)
Metabolomic Data Analysis Workshop and Tutorials (2014)Dmitry Grapov
 
2011linked science4mccuskermcguinnessfinal
2011linked science4mccuskermcguinnessfinal2011linked science4mccuskermcguinnessfinal
2011linked science4mccuskermcguinnessfinalDeborah McGuinness
 
25.ranking on data manifold with sink points
25.ranking on data manifold with sink points25.ranking on data manifold with sink points
25.ranking on data manifold with sink pointsVenkatesh Neerukonda
 
Performance Analysis of MapReduce Implementations on High Performance Homolog...
Performance Analysis of MapReduce Implementations on High Performance Homolog...Performance Analysis of MapReduce Implementations on High Performance Homolog...
Performance Analysis of MapReduce Implementations on High Performance Homolog...Koichi Shirahata
 
Knowledge discoverylaurahollink
Knowledge discoverylaurahollinkKnowledge discoverylaurahollink
Knowledge discoverylaurahollinkSSSW
 
Standard Datasets in Information Retrieval
Standard Datasets in Information Retrieval Standard Datasets in Information Retrieval
Standard Datasets in Information Retrieval Jean Brenda
 
Efficient, Scalable, and Provenance-Aware Management of Linked Data
Efficient, Scalable, and Provenance-Aware Management of Linked DataEfficient, Scalable, and Provenance-Aware Management of Linked Data
Efficient, Scalable, and Provenance-Aware Management of Linked DataeXascale Infolab
 
Web Access Log Management
Web Access Log ManagementWeb Access Log Management
Web Access Log ManagementJay Patel
 
Executing Provenance-Enabled Queries over Web Data
Executing Provenance-Enabled Queries over Web DataExecuting Provenance-Enabled Queries over Web Data
Executing Provenance-Enabled Queries over Web DataeXascale Infolab
 
Paul Groth: Data Analysis in a Changing Discourse: The Challenges of Scholarl...
Paul Groth: Data Analysis in a Changing Discourse: The Challenges of Scholarl...Paul Groth: Data Analysis in a Changing Discourse: The Challenges of Scholarl...
Paul Groth: Data Analysis in a Changing Discourse: The Challenges of Scholarl...COST Action TD1210
 

Similar a The Impact of Data Caching of on Query Execution for Linked Data (20)

Fedbench - A Benchmark Suite for Federated Semantic Data Processing
Fedbench - A Benchmark Suite for Federated Semantic Data ProcessingFedbench - A Benchmark Suite for Federated Semantic Data Processing
Fedbench - A Benchmark Suite for Federated Semantic Data Processing
 
Seaform Slides in VLDB 2010 PhD Workshop
Seaform Slides in VLDB 2010 PhD WorkshopSeaform Slides in VLDB 2010 PhD Workshop
Seaform Slides in VLDB 2010 PhD Workshop
 
LODOP - Multi-Query Optimization for Linked Data Profiling Queries
LODOP - Multi-Query Optimization for Linked Data Profiling QueriesLODOP - Multi-Query Optimization for Linked Data Profiling Queries
LODOP - Multi-Query Optimization for Linked Data Profiling Queries
 
Machine Learned Relevance at A Large Scale Search Engine
Machine Learned Relevance at A Large Scale Search EngineMachine Learned Relevance at A Large Scale Search Engine
Machine Learned Relevance at A Large Scale Search Engine
 
UNIT V.pdf
UNIT V.pdfUNIT V.pdf
UNIT V.pdf
 
Linked Data and Sevices
Linked Data and SevicesLinked Data and Sevices
Linked Data and Sevices
 
Panda Provenance
Panda ProvenancePanda Provenance
Panda Provenance
 
Linked Data Query Processing Strategies
Linked Data Query Processing StrategiesLinked Data Query Processing Strategies
Linked Data Query Processing Strategies
 
Metabolomic Data Analysis Workshop and Tutorials (2014)
Metabolomic Data Analysis Workshop and Tutorials (2014)Metabolomic Data Analysis Workshop and Tutorials (2014)
Metabolomic Data Analysis Workshop and Tutorials (2014)
 
2011linked science4mccuskermcguinnessfinal
2011linked science4mccuskermcguinnessfinal2011linked science4mccuskermcguinnessfinal
2011linked science4mccuskermcguinnessfinal
 
25.ranking on data manifold with sink points
25.ranking on data manifold with sink points25.ranking on data manifold with sink points
25.ranking on data manifold with sink points
 
Distributed DBMS - Unit 6 - Query Processing
Distributed DBMS - Unit 6 - Query ProcessingDistributed DBMS - Unit 6 - Query Processing
Distributed DBMS - Unit 6 - Query Processing
 
Performance Analysis of MapReduce Implementations on High Performance Homolog...
Performance Analysis of MapReduce Implementations on High Performance Homolog...Performance Analysis of MapReduce Implementations on High Performance Homolog...
Performance Analysis of MapReduce Implementations on High Performance Homolog...
 
Knowledge discoverylaurahollink
Knowledge discoverylaurahollinkKnowledge discoverylaurahollink
Knowledge discoverylaurahollink
 
Standard Datasets in Information Retrieval
Standard Datasets in Information Retrieval Standard Datasets in Information Retrieval
Standard Datasets in Information Retrieval
 
Efficient, Scalable, and Provenance-Aware Management of Linked Data
Efficient, Scalable, and Provenance-Aware Management of Linked DataEfficient, Scalable, and Provenance-Aware Management of Linked Data
Efficient, Scalable, and Provenance-Aware Management of Linked Data
 
Web Access Log Management
Web Access Log ManagementWeb Access Log Management
Web Access Log Management
 
Executing Provenance-Enabled Queries over Web Data
Executing Provenance-Enabled Queries over Web DataExecuting Provenance-Enabled Queries over Web Data
Executing Provenance-Enabled Queries over Web Data
 
Paul Groth: Data Analysis in a Changing Discourse: The Challenges of Scholarl...
Paul Groth: Data Analysis in a Changing Discourse: The Challenges of Scholarl...Paul Groth: Data Analysis in a Changing Discourse: The Challenges of Scholarl...
Paul Groth: Data Analysis in a Changing Discourse: The Challenges of Scholarl...
 
SIGIR 2011
SIGIR 2011SIGIR 2011
SIGIR 2011
 

Más de Olaf Hartig

LDQL: A Query Language for the Web of Linked Data
LDQL: A Query Language for the Web of Linked DataLDQL: A Query Language for the Web of Linked Data
LDQL: A Query Language for the Web of Linked DataOlaf Hartig
 
A Context-Based Semantics for SPARQL Property Paths over the Web
A Context-Based Semantics for SPARQL Property Paths over the WebA Context-Based Semantics for SPARQL Property Paths over the Web
A Context-Based Semantics for SPARQL Property Paths over the WebOlaf Hartig
 
Tutorial "Linked Data Query Processing" Part 4 "Execution Process" (WWW 2013 ...
Tutorial "Linked Data Query Processing" Part 4 "Execution Process" (WWW 2013 ...Tutorial "Linked Data Query Processing" Part 4 "Execution Process" (WWW 2013 ...
Tutorial "Linked Data Query Processing" Part 4 "Execution Process" (WWW 2013 ...Olaf Hartig
 
Tutorial "Linked Data Query Processing" Part 3 "Source Selection Strategies" ...
Tutorial "Linked Data Query Processing" Part 3 "Source Selection Strategies" ...Tutorial "Linked Data Query Processing" Part 3 "Source Selection Strategies" ...
Tutorial "Linked Data Query Processing" Part 3 "Source Selection Strategies" ...Olaf Hartig
 
Tutorial "Linked Data Query Processing" Part 2 "Theoretical Foundations" (WWW...
Tutorial "Linked Data Query Processing" Part 2 "Theoretical Foundations" (WWW...Tutorial "Linked Data Query Processing" Part 2 "Theoretical Foundations" (WWW...
Tutorial "Linked Data Query Processing" Part 2 "Theoretical Foundations" (WWW...Olaf Hartig
 
Tutorial "Linked Data Query Processing" Part 1 "Introduction" (WWW 2013 Ed.)
Tutorial "Linked Data Query Processing" Part 1 "Introduction" (WWW 2013 Ed.)Tutorial "Linked Data Query Processing" Part 1 "Introduction" (WWW 2013 Ed.)
Tutorial "Linked Data Query Processing" Part 1 "Introduction" (WWW 2013 Ed.)Olaf Hartig
 
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...Olaf Hartig
 
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 2 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 2 (...Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 2 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 2 (...Olaf Hartig
 
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 1 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 1 (...Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 1 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 1 (...Olaf Hartig
 
An Overview on PROV-AQ: Provenance Access and Query
An Overview on PROV-AQ: Provenance Access and QueryAn Overview on PROV-AQ: Provenance Access and Query
An Overview on PROV-AQ: Provenance Access and QueryOlaf Hartig
 
(An Overview on) Linked Data Management and SPARQL Querying (ISSLOD2011)
(An Overview on) Linked Data Management and SPARQL Querying (ISSLOD2011)(An Overview on) Linked Data Management and SPARQL Querying (ISSLOD2011)
(An Overview on) Linked Data Management and SPARQL Querying (ISSLOD2011)Olaf Hartig
 
Zero-Knowledge Query Planning for an Iterator Implementation of Link Traversa...
Zero-Knowledge Query Planning for an Iterator Implementation of Link Traversa...Zero-Knowledge Query Planning for an Iterator Implementation of Link Traversa...
Zero-Knowledge Query Planning for an Iterator Implementation of Link Traversa...Olaf Hartig
 
A Main Memory Index Structure to Query Linked Data
A Main Memory Index Structure to Query Linked DataA Main Memory Index Structure to Query Linked Data
A Main Memory Index Structure to Query Linked DataOlaf Hartig
 
Towards a Data-Centric Notion of Trust in the Semantic Web (A Position Statem...
Towards a Data-Centric Notion of Trust in the Semantic Web (A Position Statem...Towards a Data-Centric Notion of Trust in the Semantic Web (A Position Statem...
Towards a Data-Centric Notion of Trust in the Semantic Web (A Position Statem...Olaf Hartig
 
Brief Introduction to the Provenance Vocabulary (for W3C prov-xg)
Brief Introduction to the Provenance Vocabulary (for W3C prov-xg)Brief Introduction to the Provenance Vocabulary (for W3C prov-xg)
Brief Introduction to the Provenance Vocabulary (for W3C prov-xg)Olaf Hartig
 
Querying Linked Data with SPARQL (2010)
Querying Linked Data with SPARQL (2010)Querying Linked Data with SPARQL (2010)
Querying Linked Data with SPARQL (2010)Olaf Hartig
 
Answers to usual issues in getting started with consuming Linked Data (2010)
Answers to usual issues in getting started with consuming Linked Data (2010)Answers to usual issues in getting started with consuming Linked Data (2010)
Answers to usual issues in getting started with consuming Linked Data (2010)Olaf Hartig
 
Linked Data on the Web
Linked Data on the WebLinked Data on the Web
Linked Data on the WebOlaf Hartig
 
Executing SPARQL Queries of the Web of Linked Data
Executing SPARQL Queries of the Web of Linked DataExecuting SPARQL Queries of the Web of Linked Data
Executing SPARQL Queries of the Web of Linked DataOlaf Hartig
 
Using Web Data Provenance for Quality Assessment
Using Web Data Provenance for Quality AssessmentUsing Web Data Provenance for Quality Assessment
Using Web Data Provenance for Quality AssessmentOlaf Hartig
 

Más de Olaf Hartig (20)

LDQL: A Query Language for the Web of Linked Data
LDQL: A Query Language for the Web of Linked DataLDQL: A Query Language for the Web of Linked Data
LDQL: A Query Language for the Web of Linked Data
 
A Context-Based Semantics for SPARQL Property Paths over the Web
A Context-Based Semantics for SPARQL Property Paths over the WebA Context-Based Semantics for SPARQL Property Paths over the Web
A Context-Based Semantics for SPARQL Property Paths over the Web
 
Tutorial "Linked Data Query Processing" Part 4 "Execution Process" (WWW 2013 ...
Tutorial "Linked Data Query Processing" Part 4 "Execution Process" (WWW 2013 ...Tutorial "Linked Data Query Processing" Part 4 "Execution Process" (WWW 2013 ...
Tutorial "Linked Data Query Processing" Part 4 "Execution Process" (WWW 2013 ...
 
Tutorial "Linked Data Query Processing" Part 3 "Source Selection Strategies" ...
Tutorial "Linked Data Query Processing" Part 3 "Source Selection Strategies" ...Tutorial "Linked Data Query Processing" Part 3 "Source Selection Strategies" ...
Tutorial "Linked Data Query Processing" Part 3 "Source Selection Strategies" ...
 
Tutorial "Linked Data Query Processing" Part 2 "Theoretical Foundations" (WWW...
Tutorial "Linked Data Query Processing" Part 2 "Theoretical Foundations" (WWW...Tutorial "Linked Data Query Processing" Part 2 "Theoretical Foundations" (WWW...
Tutorial "Linked Data Query Processing" Part 2 "Theoretical Foundations" (WWW...
 
Tutorial "Linked Data Query Processing" Part 1 "Introduction" (WWW 2013 Ed.)
Tutorial "Linked Data Query Processing" Part 1 "Introduction" (WWW 2013 Ed.)Tutorial "Linked Data Query Processing" Part 1 "Introduction" (WWW 2013 Ed.)
Tutorial "Linked Data Query Processing" Part 1 "Introduction" (WWW 2013 Ed.)
 
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
 
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 2 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 2 (...Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 2 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 2 (...
 
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 1 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 1 (...Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 1 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 1 (...
 
An Overview on PROV-AQ: Provenance Access and Query
An Overview on PROV-AQ: Provenance Access and QueryAn Overview on PROV-AQ: Provenance Access and Query
An Overview on PROV-AQ: Provenance Access and Query
 
(An Overview on) Linked Data Management and SPARQL Querying (ISSLOD2011)
(An Overview on) Linked Data Management and SPARQL Querying (ISSLOD2011)(An Overview on) Linked Data Management and SPARQL Querying (ISSLOD2011)
(An Overview on) Linked Data Management and SPARQL Querying (ISSLOD2011)
 
Zero-Knowledge Query Planning for an Iterator Implementation of Link Traversa...
Zero-Knowledge Query Planning for an Iterator Implementation of Link Traversa...Zero-Knowledge Query Planning for an Iterator Implementation of Link Traversa...
Zero-Knowledge Query Planning for an Iterator Implementation of Link Traversa...
 
A Main Memory Index Structure to Query Linked Data
A Main Memory Index Structure to Query Linked DataA Main Memory Index Structure to Query Linked Data
A Main Memory Index Structure to Query Linked Data
 
Towards a Data-Centric Notion of Trust in the Semantic Web (A Position Statem...
Towards a Data-Centric Notion of Trust in the Semantic Web (A Position Statem...Towards a Data-Centric Notion of Trust in the Semantic Web (A Position Statem...
Towards a Data-Centric Notion of Trust in the Semantic Web (A Position Statem...
 
Brief Introduction to the Provenance Vocabulary (for W3C prov-xg)
Brief Introduction to the Provenance Vocabulary (for W3C prov-xg)Brief Introduction to the Provenance Vocabulary (for W3C prov-xg)
Brief Introduction to the Provenance Vocabulary (for W3C prov-xg)
 
Querying Linked Data with SPARQL (2010)
Querying Linked Data with SPARQL (2010)Querying Linked Data with SPARQL (2010)
Querying Linked Data with SPARQL (2010)
 
Answers to usual issues in getting started with consuming Linked Data (2010)
Answers to usual issues in getting started with consuming Linked Data (2010)Answers to usual issues in getting started with consuming Linked Data (2010)
Answers to usual issues in getting started with consuming Linked Data (2010)
 
Linked Data on the Web
Linked Data on the WebLinked Data on the Web
Linked Data on the Web
 
Executing SPARQL Queries of the Web of Linked Data
Executing SPARQL Queries of the Web of Linked DataExecuting SPARQL Queries of the Web of Linked Data
Executing SPARQL Queries of the Web of Linked Data
 
Using Web Data Provenance for Quality Assessment
Using Web Data Provenance for Quality AssessmentUsing Web Data Provenance for Quality Assessment
Using Web Data Provenance for Quality Assessment
 

Último

Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxfnnc6jmgwh
 
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...Karmanjay Verma
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsYoss Cohen
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Jeffrey Haguewood
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesBernd Ruecker
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
React Native vs Ionic - The Best Mobile App Framework

The Impact of Data Caching on Query Execution for Linked Data

  • 1. The Impact of Data Caching on Query Execution for Linked Data Olaf Hartig http://olafhartig.de/foaf.rdf#olaf @olafhartig Database and Information Systems Research Group Humboldt-Universität zu Berlin
  • 2. Can we query the Web of Data as if it were a single, giant database? SELECT DISTINCT ?i ?label WHERE { ?prof rdf:type <http://res ... data/dbprofs#DBProfessor> ; foaf:topic_interest ?i . OPTIONAL { ?i rdfs:label ?label FILTER( LANG(?label)="en" || LANG(?label)="") } } ORDER BY ?label Our approach: Link Traversal Based Query Execution [ISWC'09] Olaf Hartig - The Impact of Data Caching on Query Execution for Linked Data 2
  • 3. Main Idea ● Intertwine query evaluation with traversal of data links ● We alternate between: ● Evaluating parts of the query (triple patterns) on a continuously augmented set of data ● Looking up URIs in intermediate solutions and adding the retrieved data to the query-local dataset
  • 4. Main Idea (same bullets as slide 3) [Diagram: the example query is a chain of three triple patterns, (http://bob.name, knows, ?acq), (?acq, project, ?prj) and (?prj, name, ?prjName), shown next to the query-local dataset]
  • 5-9. Main Idea [Diagram: the URI http://bob.name from the query is looked up, and the retrieved data (a “descriptor object”) is added to the query-local dataset]
  • 10. Main Idea [Diagram: the query-local dataset contains the triple (http://bob.name, knows, http://alice.name), which matches the first triple pattern]
  • 11. Main Idea [Diagram: intermediate solution ?acq = http://alice.name]
  • 12-16. Main Idea [Diagram: the URI http://alice.name from the intermediate solution is looked up, and the retrieved data is added to the query-local dataset]
  • 17. Main Idea [Diagram: the query-local dataset contains the triple (http://alice.name, project, http://.../AlicesPrj), which matches the second triple pattern]
  • 18-19. Main Idea [Diagram: intermediate solution ?acq = http://alice.name, ?prj = http://.../AlicesPrj]
  • 20. Main Idea [Diagram: the third triple pattern yields ?prj = http://.../AlicesPrj, ?prjName = “…“]
  • 21. Main Idea [Diagram: final solution ?acq = http://alice.name, ?prj = http://.../AlicesPrj, ?prjName = “…“]
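The walk-through on the Main Idea slides can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the Web is simulated by a hypothetical dictionary `WEB`, the project URI `http://example.org/AlicesPrj` and the project name are made-up stand-ins for the values elided on the slides, and the sketch evaluates one triple pattern at a time instead of interleaving evaluation and lookups as finely as a real engine would.

```python
from itertools import chain

# Hypothetical stand-in for the Web: URI -> triples returned when the URI is looked up.
WEB = {
    "http://bob.name": [("http://bob.name", "knows", "http://alice.name")],
    "http://alice.name": [("http://alice.name", "project", "http://example.org/AlicesPrj")],
    "http://example.org/AlicesPrj": [("http://example.org/AlicesPrj", "name", "Alice's Project")],
}

QUERY = [
    ("http://bob.name", "knows", "?acq"),
    ("?acq", "project", "?prj"),
    ("?prj", "name", "?prjName"),
]


def is_var(term):
    return term.startswith("?")


def match(pattern, triple, binding):
    """Return the binding extended so that pattern equals triple, or None."""
    extended = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if extended.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return extended


def lookup(term, dataset, seen):
    """Look up a URI at most once and add the retrieved triples to the dataset."""
    if term.startswith("http://") and term not in seen:
        seen.add(term)
        dataset.update(WEB.get(term, []))


def execute(patterns, dataset=None, seen=None):
    """Evaluate the patterns in order, traversing links found in intermediate solutions."""
    dataset = set() if dataset is None else dataset
    seen = set() if seen is None else seen
    # Seed the query-local dataset by looking up the URIs mentioned in the query.
    for term in chain.from_iterable(patterns):
        lookup(term, dataset, seen)
    solutions = [{}]
    for pattern in patterns:
        extended_solutions = []
        for binding in solutions:
            for triple in list(dataset):
                extended = match(pattern, triple, binding)
                if extended is not None:
                    extended_solutions.append(extended)
        # Look up URIs from the new intermediate solutions before the next pattern.
        for binding in extended_solutions:
            for value in binding.values():
                lookup(value, dataset, seen)
        solutions = extended_solutions
    return solutions


print(execute(QUERY))
```

Passing the same `dataset` and `seen` objects to a later call of `execute` corresponds to reusing the query-local dataset across queries.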
  • 22. Characteristics ● Link traversal based query execution: ● Evaluation on a continuously augmented dataset ● Discovery of potentially relevant data during execution ● Discovery driven by intermediate solutions ● Main advantage: ● No need to know all data sources in advance ● Limitations: ● Query has to contain a URI as a starting point ● Ignores data that is not reachable* by the query execution * formal definition in [LDOW'11a]
  • 23. The Issue [Diagram: a second query that connects http://bob.name and the variables ?acq, ?i and ?iLabel through knows, interest and label edges, shown with its own query-local dataset]
  • 24. The Issue [Diagram: the URI http://bob.name is looked up]
  • 25. The Issue [Diagram: the query-local dataset contains the triple (http://bob.name, knows, http://alice.name)]
  • 26. The Issue [Diagram: result table with the columns ?acq, ?i and ?iLabel]
  • 27. The Issue [Diagram: the new query and the earlier example query, each shown with its query-local dataset]
  • 28. Reusing the Query-Local Dataset [Diagram: the new query and the earlier example query with their query-local datasets]
  • 29. Reusing the Query-Local Dataset [Diagram: the new query is evaluated over the query-local dataset of the earlier example query, which contains the triple (http://alice.name, knows, http://bob.name)]
  • 30. Reusing the Query-Local Dataset [Diagram: intermediate solution ?acq = http://alice.name for the new query]
  • 31. Hypothesis Re-using the query-local dataset (a.k.a. data caching) may benefit query performance + result completeness
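To make the idea of data caching concrete, the following sketch wraps URI lookups in a small cache that survives across queries and counts hits and misses. The class `DescriptorCache`, the `fetch` callback and the hit-rate definition (share of lookups answered from the cache) are assumptions for illustration; the slides do not spell out how the reported hit rate is computed.

```python
class DescriptorCache:
    """Keeps retrieved data per URI so that later queries can reuse it."""

    def __init__(self, fetch):
        self._fetch = fetch      # hypothetical callback: URI -> retrieved triples
        self._store = {}
        self.hits = 0
        self.misses = 0

    def lookup(self, uri):
        if uri in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[uri] = self._fetch(uri)
        return self._store[uri]

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


if __name__ == "__main__":
    cache = DescriptorCache(lambda uri: [(uri, "knows", "http://alice.name")])
    cache.lookup("http://bob.name")   # first query: miss, data is retrieved
    cache.lookup("http://bob.name")   # second query: hit, no retrieval needed
    print(cache.hit_rate)
```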
  • 32. Contributions ● Systematic analysis of the impact of data caching ● Theoretical foundation* ● Conceptual analysis* ● Empirical evaluation of the potential impact * see [LDOW'11a] ● Out of scope: Caching strategies (replacement, invalidation)
  • 33. Experiment – Scenario ● Information about the distributed social network of FOAF profiles ● 5 types of queries ● Experiment Setup: ● 20 persons ● Sequential use ➔ 100 queries
  • 34-38. Experiment – Single Query ● no reuse experiment: ● No data caching ● reuse per query experiment: ● Reuse of query-local dataset for 3 executions of each query ● Third execution measured [Bar charts comparing no reuse and reuse per query: number of query results (0 to 60) and query execution time in seconds (logarithmic scale, 0.01 to 100) for ContactInfoDanBri (Query No. 61), UnsetPropsDanBri (Query No. 62), 2ndDegree1DanBri (Query No. 63), 2ndDegree2DanBri (Query No. 64) and IncomingDanBri (Query No. 65)]
  • 39-43. Experiment – Complete Sequence ● reuse all queries experiment: ● Reuse of the query-local dataset for the complete sequence of all 100 queries [Bar charts comparing no reuse, reuse per query and reuse all queries: number of query results (0 to 60) and query execution time in seconds (logarithmic scale, 0.01 to 100) for the same five queries, No. 61 to No. 65]
  • 44. Outlook ● Requirements of a data cache: ● Replacement mechanism ● Coherency mechanism
  • 45. Cache Replacement ● Cache full → remove descriptor objects ● Replacement strategy ● Primary goal: maximize hit rate ● Recency-based ● Frequency-based ● Function-based ● Randomized ● Replacement process ● Watermarks: high and low
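A recency-based strategy combined with high and low watermarks could look like the sketch below. The class name and the exact watermark semantics (start evicting once the size exceeds the high watermark, stop at the low watermark) are assumptions; the slide only names the ingredients.

```python
from collections import OrderedDict


class WatermarkLRUCache:
    """Recency-based (LRU) cache that evicts in batches between two watermarks."""

    def __init__(self, high, low):
        if not 0 <= low < high:
            raise ValueError("require 0 <= low < high")
        self.high = high
        self.low = low
        self._items = OrderedDict()  # least recently used entry comes first

    def get(self, uri):
        if uri not in self._items:
            return None
        self._items.move_to_end(uri)  # mark as most recently used
        return self._items[uri]

    def put(self, uri, descriptor):
        self._items[uri] = descriptor
        self._items.move_to_end(uri)
        if len(self._items) > self.high:
            # Replacement process: remove least recently used entries
            # until the cache has shrunk to the low watermark.
            while len(self._items) > self.low:
                self._items.popitem(last=False)

    def keys(self):
        return list(self._items)


if __name__ == "__main__":
    cache = WatermarkLRUCache(high=3, low=2)
    for uri in ["a", "b", "c"]:
        cache.put(uri, object())
    cache.get("a")           # "a" becomes the most recently used entry
    cache.put("d", object())
    print(cache.keys())
```

Evicting down to the low watermark in one go avoids running the replacement process on every single insertion once the cache is full.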
  • 46. Studying Cache Replacement?
  • 47. Studying Cache Replacement? “Web cache replacement in its general form seems to be a solved topic.” S. Podlipnig and L. Böszörmenyi: Survey of Web Cache Replacement Strategies, 2003
  • 48. Studying Cache Replacement? “Web cache replacement in its general form seems to be a solved topic.” S. Podlipnig and L. Böszörmenyi: Survey of Web Cache Replacement Strategies, 2003 ● 6 quad indexes* in main memory ● Size grows linearly in the number of quads ● Example (after reuse all queries experiment, 100 queries): ● 905 descriptor objects, overall number of 745,756 triples ● ca. 103 MB ➔ Available main memory is almost no limit * as introduced in [LDOW'11b]
  • 49. Cache Coherency ● Data items in the cache may become inconsistent ● Strong cache consistency ● Server validation ● Client validation ● Weak cache consistency ● Time to live (TTL) ● Adaptive TTL
  • 50. Client Validation ● Polling every time ● Enables strong cache consistency ● Conditional GET ● Request with If-Modified-Since header ● Possible response: 304 Not Modified
  • 51. Client Validation ● Polling every time ● Enables strong cache consistency ● Conditional GET ● Request with If-Modified-Since header ● Possible response: 304 Not Modified ● Not supported by most Linked Data servers ● Experiment based on the CKAN catalog of linked datasets ● 41 out of 154 example resources (26.6%) from 110 datasets
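Client validation with a conditional GET can be sketched with Python's standard library. The function names and the `opener` parameter (which exists only so the function can be exercised without network access) are assumptions for illustration. Note that `urllib` reports a 304 response by raising `HTTPError`.

```python
import urllib.error
import urllib.request
from email.utils import formatdate


def build_conditional_request(url, last_modified_epoch):
    """Build a GET request that carries an If-Modified-Since header."""
    request = urllib.request.Request(url)
    request.add_header(
        "If-Modified-Since", formatdate(last_modified_epoch, usegmt=True)
    )
    return request


def revalidate(url, last_modified_epoch, opener=urllib.request.urlopen):
    """Return (still_valid, new_body) for a cached copy of url.

    still_valid is True when the server answers 304 Not Modified;
    otherwise new_body holds the freshly retrieved representation.
    """
    request = build_conditional_request(url, last_modified_epoch)
    try:
        with opener(request) as response:
            return False, response.read()
    except urllib.error.HTTPError as error:
        if error.code == 304:
            return True, None
        raise
```

A server that ignores the header simply answers 200 with the full body, so the function still works, but every validation then costs a complete retrieval.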
  • 52. Time to Live (TTL) ● TTL field: lifetime estimation for each object ● Supported by HTTP response headers: ● Expires ● Cache-Control: max-age ● When TTL elapses, object is invalid ● Accessing an invalid object → re-retrieve the object ● Conditional GET
  • 53. Time to Live (TTL) ● TTL field: lifetime estimation for each object ● Supported by HTTP response headers: ● Expires (37.0%) ● Cache-Control: max-age (37.7%) ● When TTL elapses, object is invalid ● Accessing an invalid object → re-retrieve the object ● Conditional GET (26.6%) ● Alternative (due to lack of support in Linked Data servers): ● Assume a default TTL for each object ● Ordinary GET
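Deriving a TTL from the response headers, with the default TTL as a fallback, might look as follows. The function name, the plain-dict representation of headers (keys spelled exactly as shown) and the default of one hour are assumptions for illustration; `max-age` is checked first because it takes precedence over `Expires` in HTTP caching.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

DEFAULT_TTL_SECONDS = 3600  # hypothetical default, not taken from the slides


def ttl_seconds(headers, now, default_ttl=DEFAULT_TTL_SECONDS):
    """Estimate the lifetime of a retrieved object in seconds.

    headers is a dict of response headers, now a timezone-aware datetime.
    """
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.strip().isdigit():
            return int(value)
    expires = headers.get("Expires")
    if expires:
        try:
            remaining = parsedate_to_datetime(expires) - now
            return max(0, int(remaining.total_seconds()))
        except (TypeError, ValueError):
            pass  # unparsable date: fall through to the default TTL
    return default_ttl


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(ttl_seconds({"Cache-Control": "public, max-age=600"}, now))
    print(ttl_seconds({}, now))
```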
  • 54. Adaptive TTL ● Assumption: ● The older an object, the less likely it is to be modified ● TTL is a percentage of the age: ● Threshold = 10% ; age = 30 days → TTL = 3 days ● Last verification: yesterday → invalidation in 2 days ● HTTP-based implementation: ● Calculation of age: use Last-Modified response header ● Verification with conditional GET
  • 55. Adaptive TTL ● Assumption: ● The older an object, the less likely it is to be modified ● TTL is a percentage of the age: ● Threshold = 10% ; age = 30 days → TTL = 3 days ● Last verification: yesterday → invalidation in 2 days ● HTTP-based implementation: 35.1% ● Calculation of age: use Last-Modified response header ● Verification with conditional GET ● Alternative (due to lack of support in Linked Data servers): ● Assume Last-Modified is time of first retrieval ● Verification by comparing a response to the current version
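The arithmetic of the adaptive TTL example can be checked with a short sketch. The function names are illustrative, and the sketch assumes that the age is measured at the time of the last verification, which the slide leaves open.

```python
from datetime import datetime, timedelta


def adaptive_ttl(last_modified, verified_at, threshold_percent=10):
    """TTL as a percentage of the object's age at verification time."""
    age = verified_at - last_modified
    return age * threshold_percent / 100


def invalidation_time(last_modified, verified_at, threshold_percent=10):
    """Point in time at which the cached object is considered invalid."""
    return verified_at + adaptive_ttl(last_modified, verified_at, threshold_percent)


if __name__ == "__main__":
    now = datetime(2011, 3, 31)                   # arbitrary example date
    verified_at = now - timedelta(days=1)         # last verification: yesterday
    last_modified = verified_at - timedelta(days=30)
    print(adaptive_ttl(last_modified, verified_at))             # TTL of 3 days
    print(invalidation_time(last_modified, verified_at) - now)  # 2 days remaining
```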
  • 56. Summary ● Systematic analysis of the impact of data caching ● Theoretical foundation ● Conceptual analysis ● Empirical evaluation ● Main findings: ● Additional results possible (for semantically similar queries) ● Impact on performance may be positive but also negative ● Future work: ● Analysis of caching strategies in our context ● Main issue: invalidation
  • 57. Backup Slides
  • 58. Contributions ● Theoretical foundation (extension of the original definition) ● Reachability by a D_seed-initialized execution of a BGP query b ● D_seed-dependent solution for a BGP query b ● Reachability R(B) for a serial execution of B = b_1 , … , b_n ➔ Each solution for b_cur is also an R(B)-dependent solution for b_cur ● Conceptual analysis of the impact of data caching ● Performance factor: p( b_cur , B ) = c( b_cur , [ ] ) – c( b_cur , B ) ● Serendipity factor: s( b_cur , B ) = b( b_cur , B ) – b( b_cur , [ ] ) ● Empirical verification of the potential impact ● Out of scope: Caching strategies (replacement, invalidation)
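The two factors are plain differences, which the sketch below evaluates. It assumes that c denotes a cost such as query execution time and b a benefit such as the number of query results; the slides do not define either function. As input it uses the averages reported in the experiment table for no reuse and reuse all queries, so the output only illustrates the formulas and is not a per-query measurement.

```python
def performance_factor(cost_without_cache, cost_with_cache):
    """p(b_cur, B) = c(b_cur, []) - c(b_cur, B); positive means the cache saved cost."""
    return cost_without_cache - cost_with_cache


def serendipity_factor(benefit_with_cache, benefit_without_cache):
    """s(b_cur, B) = b(b_cur, B) - b(b_cur, []); positive means additional results."""
    return benefit_with_cache - benefit_without_cache


if __name__ == "__main__":
    # Averages from the experiment table: no reuse vs. reuse all queries.
    p = performance_factor(cost_without_cache=64.95, cost_with_cache=37.91)
    s = serendipity_factor(benefit_with_cache=44.87, benefit_without_cache=27.37)
    print(round(p, 2), round(s, 2))
```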
  • 59. Query Template Contact SELECT * WHERE { <PERSON> foaf:knows ?p . OPTIONAL { ?p foaf:name ?name } OPTIONAL { ?p foaf:firstName ?firstName } OPTIONAL { ?p foaf:givenName ?givenName } OPTIONAL { ?p foaf:givenname ?givenname } OPTIONAL { ?p foaf:familyName ?familyName } OPTIONAL { ?p foaf:family_name ?family_name } OPTIONAL { ?p foaf:lastName ?lastName } OPTIONAL { ?p foaf:surname ?surname } OPTIONAL { ?p foaf:birthday ?birthday } OPTIONAL { ?p foaf:img ?img } OPTIONAL { ?p foaf:phone ?phone } OPTIONAL { ?p foaf:aimChatID ?aimChatID } OPTIONAL { ?p foaf:icqChatID ?icqChatID } OPTIONAL { ?p foaf:jabberID ?jabberID } OPTIONAL { ?p foaf:msnChatID ?msnChatID } OPTIONAL { ?p foaf:skypeID ?skypeID } OPTIONAL { ?p foaf:yahooChatID ?yahooChatID } }
  • 60. Query Template UnsetProps SELECT DISTINCT ?result ?resultLabel WHERE { ?result rdfs:isDefinedBy <http://xmlns.com/foaf/0.1/> . ?result rdfs:domain foaf:Person . OPTIONAL { <PERSON> ?result ?var0 } FILTER ( !bound(?var0) ) <PERSON> foaf:knows ?var2 . ?var2 ?result ?var3 . ?result rdfs:label ?resultLabel . ?result vs:term_status ?var1 . } ORDER BY ?var1
  • 61. Query Template Incoming SELECT DISTINCT ?result WHERE { ?result foaf:knows <PERSON> . OPTIONAL { ?result foaf:knows ?var1 . FILTER ( <PERSON> = ?var1 ) <PERSON> foaf:knows ?result . } FILTER ( !bound(?var1) ) }
  • 62. Query Template 2ndDegree1 SELECT DISTINCT ?result WHERE { <PERSON> foaf:knows ?p1 . <PERSON> foaf:knows ?p2 . FILTER ( ?p1 != ?p2 ) ?p1 foaf:knows ?result . FILTER ( <PERSON> != ?result ) ?p2 foaf:knows ?result . OPTIONAL { <PERSON> ?knows ?result . FILTER ( ?knows = foaf:knows ) } FILTER ( !bound(?knows) ) }
  • 63. Query Template 2ndDegree2 SELECT DISTINCT ?result WHERE { <PERSON> foaf:knows ?p1 . <PERSON> foaf:knows ?p2 . FILTER ( ?p1 != ?p2 ) ?result foaf:knows ?p1 . FILTER ( <PERSON> != ?result ) ?result foaf:knows ?p2 . OPTIONAL { <PERSON> ?knows ?result . FILTER ( ?knows = foaf:knows ) } FILTER ( !bound(?knows) ) }
  • 64. Experiment – Single Query ● Averages over all 100 queries (std.dev. in parentheses): ● no reuse: 27.37 (140.49) query results, hit rate 0.849 (0.205), query execution time 64.95 s (124.50) ● reuse per query: 26.71 (148.77) query results, hit rate 1 (0), query execution time 0.02 s (0.07) ● In the ideal case for B_upper = [ b_cur , b_cur ] : ● p_upper( b_cur , B_upper ) = c( b_cur , [ ] ) – c( b_cur , B_upper ) = c( b_cur , [ ] ) ● s_upper( b_cur , B_upper ) = b( b_cur , B_upper ) – b( b_cur , [ ] ) = 0
  • 65. Experiment – Single Query ● Averages over all 100 queries (std.dev. in parentheses): ● no reuse: 27.37 (140.49) query results, hit rate 0.849 (0.205), query execution time 64.95 s (124.50) ● reuse per query: 26.71 (148.77) query results, hit rate 1 (0), query execution time 0.02 s (0.07) ● Summary (measurement errors aside): ● Same number of query results ● Significant improvements in query performance
  • 66. Experiment – Complete Sequence ● Averages over all 100 queries (std.dev. in parentheses): ● no reuse: 27.37 (140.49) query results, hit rate 0.849 (0.205), query execution time 64.95 s (124.50) ● reuse per query: 26.71 (148.77) query results, hit rate 1 (0), query execution time 0.02 s (0.07) ● reuse all queries: 44.87 (178.36) query results, hit rate 0.991 (0.053), query execution time 37.91 s (112.94) ● Summary: ● Data cache may provide for additional query results ● Impact on performance may be positive but also negative
  • 67. Experiment – Complete Sequence ● Averages (std.dev. in parentheses): ● no reuse: 27.37 (140.49) query results, hit rate 0.849 (0.205), query execution time 64.95 s (124.50) ● reuse per query: 26.71 (148.77) query results, hit rate 1 (0), query execution time 0.02 s (0.07) ● reuse all queries: 44.87 (178.36) query results, hit rate 0.991 (0.053), query execution time 37.91 s (112.94) ● reuse all queries (random orders): 118.18 (867.07) query results, hit rate 0.992 (0.016), query execution time 20.61 s (216.61) ● Executing the query sequence in a random order results in measurements similar to the given order.
  • 68. These slides have been created by Olaf Hartig http://olafhartig.de This work is licensed under a Creative Commons Attribution-Share Alike 3.0 License (http://creativecommons.org/licenses/by-sa/3.0/)