SlideShare a Scribd company logo
1 of 19
Download to read offline
Dependable Cardinality Forecasts for XQuery
                   Jens Teubner, ETH (formerly IBM Research)
                   Torsten Grust, U T¨bingen (formerly TUM)
                                       u
                     Sebastian Maneth, UNSW and NICTA
                          Sherif Sakr, UNSW and NICTA




c Systems Group — Department of Computer Science — ETH Z¨rich
                                                        u       August 26, 2008
Cardinality Estimation for XQuery
   The feature-richness and semantics of the language make
   cardinality estimation for XQuery notoriously hard.
                    for $d in doc ("forecast.xml")/descendant::day
                    let $day := $d/@t
                    let $ppcp := data ($d/descendant::ppcp)
                    return
                      if ($ppcp > 50)
                        then ("rain likely on", $day,
                              "chance of precipitation:", $ppcp)
                        else ("no rain on", $day)

         for iteration                                      sequence construction
         conditionals (if-then-else)                        ...
         existential quantification                     (XPath is not a focus of this work.)



  August 26, 2008       Systems Group — Department of Computer Science — ETH Z¨rich
                                                                              u                2
Cardinality Estimation for XQuery
   The feature-richness and semantics of the language make
   cardinality estimation for XQuery notoriously hard.
                    for $d in doc ("forecast.xml")/descendant::day card1 = ?
                    let $day := $d/@t
                    let $ppcp := data ($d/descendant::ppcp)
                    return
                      if ($ppcp > 50)                              card2 = ?
                        then ("rain likely on", $day,            card3 = ?
                              "chance of precipitation:", $ppcp)
                        else ("no rain on", $day) card4 = ?

         for iteration                                      sequence construction
         conditionals (if-then-else)                        ...
         existential quantification                     (XPath is not a focus of this work.)

     → Goal: Compute subexpression-level cardinalities cardi .

  August 26, 2008       Systems Group — Department of Computer Science — ETH Z¨rich
                                                                              u                2
Idea: Perform cardinality estimation on relational plan
      equivalents for XQuery.

                    relational cardinality estimation (System R)
                  + existing work on XPath estimation
                  + histograms for value predicates
                  = cardinality estimation for XQuery


           Build on Pathfinder’s XQuery-to-relational algebra compiler.1
            → tuple count ≡ XQuery item count
           Cardinality information for each subexpression.


      1
          http://www.pathfinder-xquery.org/
August 26, 2008       Systems Group — Department of Computer Science — ETH Z¨rich
                                                                            u       3
πiter:outer,pos:pos1,item

                                                                                                          pos1: ord,pos        outer

Example (Plan details not of interest today.)                                     2                             iter=inner

                                                                                        ·
                                                                                        ∪
                                                                                                               4
                                              1                3                                   πiter,pos:pos1,item

for $d in doc ("forecast.xml")/descendant::day            πiter,pos:pos1,item                          pos1: ord,pos      iter

let $day := $d/@t                                          pos1: ord,pos        iter                           ·
                                                                                                               ∪
                                                                                  πiter,pos:1,ord:1,
let $ppcp := data ($d/descendant::ppcp)                               ·
                                                                      ∪               item:"rain..."
                                                                                                                          ·
                                                                                                                          ∪
return                                      2                   πiter,pos:1,ord:1,
                                                                                               πiter,pos,item,ord:2             ∪·
  if ($ppcp > 50)                          3                       item:"no rain..."                             πiter,pos:1,ord:3,
                                                                                                                      item:"chance..."
    then ("rain likely on", $day,                   πiter,pos,item,ord:2                                                          πiter,pos,item,ord:4

          "chance of precipitation:", $ppcp)               iter1=iter                                    iter1=iter                     iter=iter1
    else ("no rain on", $day) 4                                               
                                                                                                   δ
                                                                                              πiter

                                                                                               σres

                                                                                          res:(item,item1)
      Pathfinder compiles XQuery                                                              iter1=iter                                  πiter1:iter,
                                                πiter1:iter,pos,item
      with arbitrary nesting.                                                 πiter1:iter,
                                                                                pos1:1,item1:50                item
                                                                                                                                            pos,item


                                                  pos: item    iter

      Maintain correspondence to                                          πiter                          pos: item      iter

                                                item:attribute::t(item)                      item:descendant::ppcp(item)
      original query if back-end is                                               πiter:inner,item                                            πinner,outer:iter,
                                                                                                                                                ord:pos

      not relational.                                                                 inner: iter,pos

                                                                          1           pos: item         iter

      Derive estimates based on an                                        item:descendant::day(item)


      inference rule set.                                                               docitem
                                                                          πiter,item:"forecast.xml"

                                                                                            iter
                                                                                              1
Relational XQuery Cardinality Estimation
   Apply System R-style estimation to relational XQuery plans, e.g.,

   Disjoint union:                               Cartesian product:
          ·
      |q1 ∪ q2 | = |q1 | + |q2 |                   |q1 × q2 | = |q1 | · |q2 |

   Equi-join:     
                  
                        |q1 | · |q2 |                         if there are indexes on
                   max {|a| , |b| }                           both join columns,
                  
                  
                  
                              idx       idx
                  
       |q1 q2 | =        |q1 | · |q2 |                         if there is only an index
          a=b
                             |a|idx                            on column a,
                  
                  
                  
                  
                  
                  
                  
                      |q1 | · |q2 | · 1/10                     otherwise
                  



       |c|idx : Number of unique values in index on column c.
  August 26, 2008      Systems Group — Department of Computer Science — ETH Z¨rich
                                                                             u             5
Relational XQuery Cardinality Estimation
   Apply System R-style estimation to relational XQuery plans, e.g.,

   Disjoint union:                               Cartesian product:
          ·
      |q1 ∪ q2 | = |q1 | + |q2 |                   |q1 × q2 | = |q1 | · |q2 |

   Equi-join:     
                         |q1 | · |q2 |                         if there are indexes on
                  
                  
                   max {|a| , |b| }
                  
                  
                  
                              idx       idx
                                                               both join columns,          ?
                  
       |q1 q2 | =        |q1 | · |q2 |                         if there is only an index
          a=b     
                  
                  
                  
                  
                  
                             |a|idx                            on column a,                ?
                  
                      |q1 | · |q2 | · 1/10                     otherwise
                  

          Our joins typically operate over computed relations.
       |c|idx : Number of unique values in index on column c.
  August 26, 2008      Systems Group — Department of Computer Science — ETH Z¨rich
                                                                             u             5
Abstract Domain Identifiers
   A simple form of data flow analysis provides the information
   needed.
             Introduce abstract domain identifiers α, β, . . . as placeholders
             for the active runtime domain for each column c.
             (Read c α as “column c contains values from domain α.”)
             Estimate the size α of each domain α, e.g.,2

                    dom      a: b1 ,...,bn   (q) ⊇ dom (q) ∪ aα ∧ α =! |q|                     .

             Identify inclusion relationships α                      β between domains, e.g.,

                       aα ∈ dom (q) ∧ aβ ∈ dom (σ··· (q)) ⇒ β                            α .

        2
            Operator   a: b1 ,...,bn   introduces a new key column (holding row numbers).
  August 26, 2008          Systems Group — Department of Computer Science — ETH Z¨rich
                                                                                 u                 6
Abstract Domain Identifiers
   Use abstract domain information for cardinality estimation.
   E.g., “foreign key” join:

                    aα ∈ dom (q1 )        bβ ∈ dom (q2 )                      α      β
                                              |q1 | · |q2 |
                                   |q1 q2 | =
                                      a=b          β

           Domain inclusion guarantees that each tuple in q1 finds
           (at least one) join partner in q2 .

   Other examples:
           |q1  q2 | = |q1 | − |q2 |        if q2 is a subset of q1 .
           |q1  q2 | = 0                    if q1 is a subset of q2 .
           |q1  q2 | = |q1 |                if q1 and q2 are disjoint.
  August 26, 2008      Systems Group — Department of Computer Science — ETH Z¨rich
                                                                             u           7
Interfacing with XPath—Projection Paths

   Track XPath navigation by means of projection paths3

                        a                        a        b
                                                                                a    b     c
                                                               c:child::*(b)    1    γa   γb
                                                 1        γa
                    b   c       d                2        γb
                                                                                1    γa   γc
                                                                                1    γa   γd
                                                 3        γd
                                e                                               3    γd   γe
                                                     q1
                                                                                     q2


             b⇒p ∈ path (q1 ) ⇒ c⇒p/child::* ∈ path (q2 )
             Step operator    makes XPath navigation explicit in relational
             plans (compiles to join on SQL back-ends).



        3
            A. Marian and J. Sim´ on. Projecting XML Documents. VLDB 2003.
                                e
  August 26, 2008           Systems Group — Department of Computer Science — ETH Z¨rich
                                                                                  u            8
Interfacing with XPath—Cardinality Inference
                                                                                        bα
                                                      bα
                        a                        a        b
                                                                                   a     b     c
                                                                 c:child::*(b)     1     γa   γb
                                                 1        γa
                    b   c       d                2        γb
                                                                                   1     γa   γc    α    α
                                                                                   1     γa   γd
                                                 3        γd
                                e                                                  3     γd   γe
                                                     q1
                                                                                         q2
   Cardinality:
                            fn:count (p/child::*)
       |q2 | = |q1 | ·          fn:count (p)                   = |q1 | · Prchild::* (p)

                                                                                 fanout (here: 4/3)
   Domain Sizes:
                               fn:count (p [ child::* ])
     α = α ·                        fn:count (p)                   = α · Pr[child::*] (p)

                                                                                       selectivity (here: 2/3)
   Any XPath estimator that provides Prp2 (p1 ) and Pr[p2 ] (p1 ) will do.
           Our prototype uses a simple Data Guide-based implementation.
  August 26, 2008           Systems Group — Department of Computer Science — ETH Z¨rich
                                                                                  u                       9
πiter:outer,pos:pos1,item

                                                                                                                  pos1: ord,pos        outer

Back to our Example Plan                                                                        2                       iter=inner

                                                                                                ·
                                                                                                ∪
                                                                                                                       4
                                                                                                           πiter,pos:pos1,item
                                                                        3
                                                                   πiter,pos:pos1,item                         pos1: ord,pos      iter
                      res:(item,item1)
                                                                    pos1: ord,pos        iter                          ·
                                                                                                                       ∪
                                                                                           πiter,pos:1,ord:1,
                                                                               ·
                                                                               ∪                                                  ·
                                                                                                                                  ∪
                         iter1=iter                                                          item:"rain..."
                                                                                                       πiter,pos,item,ord:2             ∪·
                                                                         πiter,pos:1,ord:1,
       πiter1:iter,                                                         item:"no rain..."                            πiter,pos:1,ord:3,
                                                                                                                              item:"chance..."
         pos1:1,item1:50                   item              πiter,pos,item,ord:2                                                         πiter,pos,item,ord:4

                                                                    iter1=iter                                   iter1=iter                     iter=iter1
  πiter                               pos: item   iter
                                                                                       
                                                                                                           δ
                           item:descendant::ppcp(item)                                                πiter

                                                                                                       σres
            πiter:inner,item
                                                                                                res:(item,item1)

                                                                                                      iter1=iter                                  πiter1:iter,
                                                         πiter1:iter,pos,item
                                                                                                                                                    pos,item
                                                                                       πiter1:iter,
                                                                                         pos1:1,item1:50               item
                                                           pos: item    iter
                                                                                   πiter                         pos: item      iter

                                                         item:attribute::t(item)                      item:descendant::ppcp(item)

                                                                                           πiter:inner,item                                           πinner,outer:iter,
                                                                                                                                                        ord:pos
                                                                                            inner: iter,pos

                                                                                   1        pos: item           iter

                                                                                   item:descendant::day(item)

                                                                                                docitem

 a:   Retrieve typed values for node identifiers                                   πiter,item:"forecast.xml"

      in column a (atomization).                                                                    iter
                                                                                                      1
πiter:outer,pos:pos1,item

                                                                                                               pos1: ord,pos         outer

Back to our Example Plan                                                                         2                    iter=inner

                                                                                                 ·
                                                                                                 ∪
                                                                                                                     4
                                                                                                            πiter,pos:pos1,item
                                                                         3
                                                                    πiter,pos:pos1,item                     pos1: ord,pos       iter
                      res:(item,item1)
                                                                     pos1: ord,pos        iter                       ·
                                                                                                                     ∪
                                                                                           πiter,pos:1,ord:1,
                                                                                ·
                                                                                ∪                                               ·
                                                                                                                                ∪
                         iter1=iter                                                          item:"rain..."
                                                                                                        πiter,pos,item,ord:2          ∪·
                                                                          πiter,pos:1,ord:1,
       πiter1:iter,                                                          item:"no rain..."                         πiter,pos:1,ord:3,
                                                                                                                            item:"chance..."
         pos1:1,item1:50                   item                πiter,pos,item,ord:2                                                     πiter,pos,item,ord:4

                                                                     iter1=iter                                iter1=iter                     iter=iter1
  πiter                               pos: item   iter
                                                         Projection path: δ
                                                                       

                           item:descendant::ppcp(item)    item⇒···/descendant::ppcp ∈ path
                                                                             πiter                                                                  ... (q)
                                                                                                        σres
            πiter:inner,item
                                                         Cardinality (uses fanout):
                                                                                res:(item,item1)


                                                               item:ax::nt(item) (q) = |q|·Prax::nt (· · · )
                                                           πiter1:iter,pos,item
                                                                                    iter1=iter   πiter1:iter,
                                                                                                  pos,item
                                                                                        πiter1:iter,
                                                                                          pos1:1,item1:50            item
                                                             pos: item   iter
                                                                                    πiter                      pos: item      iter

                                                            item:attribute::t(item)                    item:descendant::ppcp(item)

                                                                                            πiter:inner,item                                        πinner,outer:iter,
                                                                                                                                                      ord:pos
                                                                                             inner: iter,pos

                                                                                    1        pos: item       iter

                                                                                    item:descendant::day(item)

                                                                                                 docitem

 a:   Retrieve typed values for node identifiers                                    πiter,item:"forecast.xml"

      in column a (atomization).                                                                     iter
                                                                                                       1
πiter:outer,pos:pos1,item

                                                                                                                      pos1: ord,pos        outer

Back to our Example Plan                                                                                2                    iter=inner

                                                                                                        ·
                                                                                                        ∪
                                                                                                                            4
                                                                                                                   πiter,pos:pos1,item
                                                                                  3
                                                                             πiter,pos:pos1,item                   pos1: ord,pos       iter
                      res:(item,item1)
                                                                              pos1: ord,pos      iter                       ·
                                                                                                                            ∪
                                                                                                  πiter,pos:1,ord:1,
                                                                                      ·
                                                                                      ∪                                                ·
                                                                                                                                       ∪
                         iter1=iter                                                                 item:"rain..."
                                                                                                               πiter,pos,item,ord:2          ∪·
                                                                                   πiter,pos:1,ord:1,
       πiter1:iter,                                                                   item:"no rain..."                       πiter,pos:1,ord:3,
                                                                                                                                   item:"chance..."
         pos1:1,item1:50                   item                         πiter,pos,item,ord:2                                                  πiter,pos,item,ord:4

                                                                              iter1=iter                              iter1=iter                    iter=iter1
  πiter                               pos: item    iter
                                                          iterα   Projection path: δ
                                                                                

                           item:descendant::ppcp(item)             item⇒···/descendant::ppcp ∈ path
                                                                                      πiter                                                               ... (q)
                                                                                                               σres
            πiter:inner,item            iter   α
                                                                  Cardinality (uses fanout):
                                                                                         res:(item,item1)


                                                                        item:ax::nt(item) (q) = |q|·Prax::nt (· · · )
                                                                    πiter1:iter,pos,item
                                                                                             iter1=iter   πiter1:iter,
                                                                                                           pos,item
                                                                                               πiter1:iter,
                                                                                                 pos1:1,item1:50            item
                                                                  Domain iter πiter
                                                                      pos: item
                                                                                  inclusion: pos: item iter
                                                                  iterα ∈ dom
                                                                    item:attribute::t(item) ... (q) ; α                α
                                                                                             item:descendant::ppcp(item)

                                                                                                   πiter:inner,item                                       πinner,outer:iter,
                                                                                                                                                            ord:pos

                                                                  Domain size (uses step selectivity):
                                                                               inner: iter,pos


                                                                   α = α 1· Pr[ax::nt] (· · · )
                                                                               pos: item iter

                                                                                           item:descendant::day(item)

                                                                                                        docitem

 a:   Retrieve typed values for node identifiers                                           πiter,item:"forecast.xml"

      in column a (atomization).                                                                            iter
                                                                                                              1
πiter:outer,pos:pos1,item

                                                                                                                     pos1: ord,pos       outer

Back to our Example Plan                                                                               2                    iter=inner

                                                                                                       ·
                                                                                                       ∪
                                                                                                                           4
                                                                                                                  πiter,pos:pos1,item
                                                                                 3
                                                                            πiter,pos:pos1,item                   pos1: ord,pos       iter
                   res:(item,item1)
                                                                             pos1: ord,pos      iter                       ·
                                                                                                                           ∪
                                                                                                 π
 iter1α               iter1=iter                                “Foreign key” iter,pos:1,ord:1,
                                                                          ∪·           join:
                                                                                      item:"rain..."             ∪·

                                                iterα            q1       q2 rain..." α
                                                                         item:"no =
                                                                                              πiter,pos,item,ord:2
                                                                        πiter,pos:1,ord:1, |q1 |·|q2 |
                                                                                                            (since α
                                                                                                                          ∪·
                                                                                                                                                                α)
    πiter1:iter,                                                                                           πiter,pos:1,ord:3,
                                                                      iter1=iter                                                  item:"chance..."
        pos1:1,item1:50                 item                           πiter,pos,item,ord:2                                                  πiter,pos,item,ord:4

                                                                             iter1=iter                              iter1=iter                    iter=iter1
  πiter                            pos: item     iter
                                                        iterα   Projection path: δ
                                                                              

                        item:descendant::ppcp(item)              item⇒···/descendant::ppcp ∈ path
                                                                                    πiter                                                                ... (q)
                                                                                                              σres
           πiter:inner,item          iter   α
                                                                Cardinality (uses fanout):
                                                                                       res:(item,item1)


                                                                      item:ax::nt(item) (q) = |q|·Prax::nt (· · · )
                                                                  πiter1:iter,pos,item
                                                                                           iter1=iter   πiter1:iter,
                                                                                                         pos,item
                                                                                              πiter1:iter,
                                                                                                pos1:1,item1:50            item
                                                                Domain iter πiter
                                                                    pos: item
                                                                                inclusion: pos: item iter
                                                                iterα ∈ dom
                                                                  item:attribute::t(item) ... (q) ; α                α
                                                                                           item:descendant::ppcp(item)

                                                                                                  πiter:inner,item                                       πinner,outer:iter,
                                                                                                                                                           ord:pos

                                                                Domain size (uses step selectivity):
                                                                             inner: iter,pos


                                                                 α = α 1· Pr[ax::nt] (· · · )
                                                                             pos: item iter

                                                                                          item:descendant::day(item)

                                                                                                       docitem

 a:   Retrieve typed values for node identifiers                                          πiter,item:"forecast.xml"

      in column a (atomization).                                                                           iter
                                                                                                             1
πiter:outer,pos:pos1,item

                                                                        card2 : 4104                                   pos1: ord,pos       outer

Back to our Example Plan                                                                                 2                    iter=inner

                                                            card3 : 3540                                 ·
                                                                                                         ∪
                                                                                                                             4                     card4 : 564
                                                                                                                    πiter,pos:pos1,item
                                                                                   3
                                                                              πiter,pos:pos1,item                   pos1: ord,pos       iter
                   res:(item,item1)
                                                                               pos1: ord,pos      iter                       ·
                                                                                                                             ∪
                                                                                                   π
 iter1α               iter1=iter                                  “Foreign key” iter,pos:1,ord:1,
                                                                            ∪·           join:
                                                                                        item:"rain..."             ∪·

                                                iterα              q1       q2 rain..." α
                                                                           item:"no =
                                                                                                πiter,pos,item,ord:2
                                                                          πiter,pos:1,ord:1, |q1 |·|q2 |
                                                                                                              (since α
                                                                                                                            ∪·
                                                                                                                                                                  α)
    πiter1:iter,                                                                                             πiter,pos:1,ord:3,
                                                                        iter1=iter                                                  item:"chance..."
        pos1:1,item1:50                 item                             πiter,pos,item,ord:2                                                  πiter,pos,item,ord:4

                                                                               iter1=iter                              iter1=iter                    iter=iter1
  πiter                            pos: item     iter
                                                        iterα     Projection path: δ
                                                                                

                        item:descendant::ppcp(item)                item⇒···/descendant::ppcp ∈ path
                                                                                      πiter                                                                ... (q)
                                                                                                                σres
           πiter:inner,item          iter   α
                                                                  Cardinality (uses fanout):
                                                                                         res:(item,item1)


                                                                        item:ax::nt(item) (q) = |q|·Prax::nt (· · · )
                                                                    πiter1:iter,pos,item
                                                                                             iter1=iter   πiter1:iter,
                                                                                                           pos,item
                                                                                                πiter1:iter,
                                                                                                  pos1:1,item1:50            item
                                                                  Domain iter πiter
                                                                      pos: item
                                                                                  inclusion: pos: item iter
                                                                  iterα ∈ dom
                                                                    item:attribute::t(item) ... (q) ; α                α
                                                                                             item:descendant::ppcp(item)

                                                                                                    πiter:inner,item                                       πinner,outer:iter,
                                                card1 : 990                                                                                                  ord:pos

                                                                  Domain size (uses step selectivity):
                                                                               inner: iter,pos


                                                                   α = α 1· Pr[ax::nt] (· · · )
                                                                               pos: item iter

                                                                                            item:descendant::day(item)

                                                                                                         docitem

 a:   Retrieve typed values for node identifiers                                            πiter,item:"forecast.xml"

      in column a (atomization).                                                                             iter
                                                                                                               1
Forecasting New Zealand’s Weather
   The obtained cardinalities can be mapped back to predict item
   counts for corresponding XQuery expressions:4
           for $d in doc ("forecast.xml")/descendant::day card1 = 990/990
           let $day := $d/@t
           let $ppcp := data ($d/descendant::ppcp)
           return
             if ($ppcp > 50)                              card2 = 4104/3402
               then ("rain likely on", $day,            card3 = 3540/2370
                      "chance of precipitation:", $ppcp)
                else ("no rain on", $day) card4 = 564/1032


            Value statistics based on histograms (                        paper)
            Inaccuracy is mainly due to correlations in the data.
                    Rain in the morning likely means rain in the afternoon, too.

       4
           estimated/observed, based on data taken for New Zealand two weeks ago.
  August 26, 2008        Systems Group — Department of Computer Science — ETH Z¨rich
                                                                               u       11
More Realistic Queries: W3C XQuery Use Cases
                                                                 0.01           0.1            1            10
 Prototype implementation                     XMP Q4
 based on Pathfinder                          XMP Q5
                                              XMP Q6
 For each subexpression:                      XMP Q8
                                              XMP Q9
     estimated cardinality                    XMP Q10
                                              XMP Q11
     observed cardinality                     SGML Q3
                                              SGML Q4
 Diameter indicates data                      SGML Q8a
 point “stacking”                             SGML Q8b
                                              R Q2
 Plan root: filled circle                     R Q3
                                              R Q10
                                              R Q11
 Applicable to “real” queries                 R Q13
                                              R Q14
 Recovery from intermediate                   R Q15
 mis-estimations                              R Q17
      e.g., existential semantics                                0.01           0.1            1            10
                                                                    estimated cardinality/observed cardinality

  August 26, 2008    Systems Group — Department of Computer Science — ETH Z¨rich
                                                                           u                               12
Wrap-Up
  Cardinality estimation framework for XQuery
  Subexpression-level estimates for arbitrary XQuery expressions
  Based on Pathfinder’s XQuery-to-relational algebra compiler

                     relational cardinality estimation (System R)
                   + existing work on XPath estimation
                   + histograms for value predicates
                   = cardinality estimation for XQuery


  High-quality estimates for realistic XQuery workloads
      Robust with respect to intermediate errors
  Pluggable and extensible
      e.g., XPath estimation subsystem, positional predicates
 August 26, 2008         Systems Group — Department of Computer Science — ETH Z¨rich
                                                                               u       13

More Related Content

Viewers also liked

Beths Powerpoint
Beths PowerpointBeths Powerpoint
Beths PowerpointJeff Paul
 
50 Words Powerpoint Jayden
50 Words Powerpoint Jayden50 Words Powerpoint Jayden
50 Words Powerpoint Jaydenmrrobbo
 
ECAWA Keynote
ECAWA KeynoteECAWA Keynote
ECAWA Keynotemrrobbo
 
Bankruptcy Information
Bankruptcy InformationBankruptcy Information
Bankruptcy Informationcole collins
 
Tbfs uc commercial presentation
Tbfs uc commercial presentationTbfs uc commercial presentation
Tbfs uc commercial presentationraj638
 
Η συνάρτηση. Μια προξενήτρα αλλοιώτικη από τις άλλες! Function: The matchmaker!
Η συνάρτηση. Μια προξενήτρα αλλοιώτικη από τις άλλες! Function: The matchmaker!Η συνάρτηση. Μια προξενήτρα αλλοιώτικη από τις άλλες! Function: The matchmaker!
Η συνάρτηση. Μια προξενήτρα αλλοιώτικη από τις άλλες! Function: The matchmaker!Liana Lignou
 
Autobiography Anthony[1]
Autobiography Anthony[1]Autobiography Anthony[1]
Autobiography Anthony[1]aj_matatag
 
Gang announcements 2011 01
Gang announcements 2011 01Gang announcements 2011 01
Gang announcements 2011 01David Giard
 
Make-A-Wish Presentation
Make-A-Wish PresentationMake-A-Wish Presentation
Make-A-Wish Presentationronjm
 
Does the social game business model work for mobile apps?
Does the social game business model work for mobile apps?Does the social game business model work for mobile apps?
Does the social game business model work for mobile apps?Norman Liang
 
Verifica Di Resistenza Al Fuoco. Nuove Norme Tecniche 2008
Verifica Di Resistenza Al Fuoco. Nuove Norme Tecniche 2008Verifica Di Resistenza Al Fuoco. Nuove Norme Tecniche 2008
Verifica Di Resistenza Al Fuoco. Nuove Norme Tecniche 2008Eugenio Agnello
 
City University Food Thinkers
City University Food ThinkersCity University Food Thinkers
City University Food Thinkersjackthur
 
Transforming HR in an Uncertain Economy: Priorities and Processes That Delive...
Transforming HR in an Uncertain Economy: Priorities and Processes That Delive...Transforming HR in an Uncertain Economy: Priorities and Processes That Delive...
Transforming HR in an Uncertain Economy: Priorities and Processes That Delive...welshms
 
Jack Thurston talk at EU budget and CAP conference: Prague, 11 December 2008
Jack Thurston talk at EU budget and CAP conference: Prague, 11 December 2008Jack Thurston talk at EU budget and CAP conference: Prague, 11 December 2008
Jack Thurston talk at EU budget and CAP conference: Prague, 11 December 2008jackthur
 
2009 03 31 Healthstory Webinar Presentation
2009 03 31 Healthstory Webinar Presentation2009 03 31 Healthstory Webinar Presentation
2009 03 31 Healthstory Webinar PresentationNick van Terheyden
 
Gang announcements 2010 09
Gang announcements 2010 09Gang announcements 2010 09
Gang announcements 2010 09David Giard
 
GrouperEye Product Plan
GrouperEye Product PlanGrouperEye Product Plan
GrouperEye Product Planwilliamstj
 
Farm Subsidy Transparency
Farm Subsidy TransparencyFarm Subsidy Transparency
Farm Subsidy Transparencyjackthur
 

Viewers also liked (20)

Beths Powerpoint
Beths PowerpointBeths Powerpoint
Beths Powerpoint
 
50 Words Powerpoint Jayden
50 Words Powerpoint Jayden50 Words Powerpoint Jayden
50 Words Powerpoint Jayden
 
ECAWA Keynote
ECAWA KeynoteECAWA Keynote
ECAWA Keynote
 
Bankruptcy Information
Bankruptcy InformationBankruptcy Information
Bankruptcy Information
 
Tbfs uc commercial presentation
Tbfs uc commercial presentationTbfs uc commercial presentation
Tbfs uc commercial presentation
 
Η συνάρτηση. Μια προξενήτρα αλλοιώτικη από τις άλλες! Function: The matchmaker!
Η συνάρτηση. Μια προξενήτρα αλλοιώτικη από τις άλλες! Function: The matchmaker!Η συνάρτηση. Μια προξενήτρα αλλοιώτικη από τις άλλες! Function: The matchmaker!
Η συνάρτηση. Μια προξενήτρα αλλοιώτικη από τις άλλες! Function: The matchmaker!
 
Autobiography Anthony[1]
Autobiography Anthony[1]Autobiography Anthony[1]
Autobiography Anthony[1]
 
Gang announcements 2011 01
Gang announcements 2011 01Gang announcements 2011 01
Gang announcements 2011 01
 
GrouperEye
GrouperEyeGrouperEye
GrouperEye
 
Make-A-Wish Presentation
Make-A-Wish PresentationMake-A-Wish Presentation
Make-A-Wish Presentation
 
Does the social game business model work for mobile apps?
Does the social game business model work for mobile apps?Does the social game business model work for mobile apps?
Does the social game business model work for mobile apps?
 
Verifica Di Resistenza Al Fuoco. Nuove Norme Tecniche 2008
Verifica Di Resistenza Al Fuoco. Nuove Norme Tecniche 2008Verifica Di Resistenza Al Fuoco. Nuove Norme Tecniche 2008
Verifica Di Resistenza Al Fuoco. Nuove Norme Tecniche 2008
 
City University Food Thinkers
City University Food ThinkersCity University Food Thinkers
City University Food Thinkers
 
Transforming HR in an Uncertain Economy: Priorities and Processes That Delive...
Transforming HR in an Uncertain Economy: Priorities and Processes That Delive...Transforming HR in an Uncertain Economy: Priorities and Processes That Delive...
Transforming HR in an Uncertain Economy: Priorities and Processes That Delive...
 
Jack Thurston talk at EU budget and CAP conference: Prague, 11 December 2008
Jack Thurston talk at EU budget and CAP conference: Prague, 11 December 2008Jack Thurston talk at EU budget and CAP conference: Prague, 11 December 2008
Jack Thurston talk at EU budget and CAP conference: Prague, 11 December 2008
 
2009 03 31 Healthstory Webinar Presentation
2009 03 31 Healthstory Webinar Presentation2009 03 31 Healthstory Webinar Presentation
2009 03 31 Healthstory Webinar Presentation
 
Gang announcements 2010 09
Gang announcements 2010 09Gang announcements 2010 09
Gang announcements 2010 09
 
A Ri Zona Powerpoint
A Ri Zona PowerpointA Ri Zona Powerpoint
A Ri Zona Powerpoint
 
GrouperEye Product Plan
GrouperEye Product PlanGrouperEye Product Plan
GrouperEye Product Plan
 
Farm Subsidy Transparency
Farm Subsidy TransparencyFarm Subsidy Transparency
Farm Subsidy Transparency
 

Similar to Dependable Cardinality Forecast for XQuery

A Short Course in Data Stream Mining
A Short Course in Data Stream MiningA Short Course in Data Stream Mining
A Short Course in Data Stream MiningAlbert Bifet
 
Ml mle_bayes
Ml  mle_bayesMl  mle_bayes
Ml mle_bayesPhong Vo
 
Internet of Things Data Science
Internet of Things Data ScienceInternet of Things Data Science
Internet of Things Data ScienceAlbert Bifet
 
Intro to threp
Intro to threpIntro to threp
Intro to threpHong Wu
 
Distance Based Indexing
Distance Based IndexingDistance Based Indexing
Distance Based Indexingketan533
 

Similar to Dependable Cardinality Forecast for XQuery (6)

packet destruction in NS2
packet destruction in NS2packet destruction in NS2
packet destruction in NS2
 
A Short Course in Data Stream Mining
A Short Course in Data Stream MiningA Short Course in Data Stream Mining
A Short Course in Data Stream Mining
 
Ml mle_bayes
Ml  mle_bayesMl  mle_bayes
Ml mle_bayes
 
Internet of Things Data Science
Internet of Things Data ScienceInternet of Things Data Science
Internet of Things Data Science
 
Intro to threp
Intro to threpIntro to threp
Intro to threp
 
Distance Based Indexing
Distance Based IndexingDistance Based Indexing
Distance Based Indexing
 

More from University of New South Wales (11)

Declarative analysis of noisy information networks
Declarative analysis of noisy information networksDeclarative analysis of noisy information networks
Declarative analysis of noisy information networks
 
InfiniteGraph
InfiniteGraphInfiniteGraph
InfiniteGraph
 
Gremlin
Gremlin Gremlin
Gremlin
 
DHHT - Modeling beyond plain graphs
DHHT - Modeling beyond plain graphsDHHT - Modeling beyond plain graphs
DHHT - Modeling beyond plain graphs
 
Dex
DexDex
Dex
 
Ontological Conjunctive Query Answering over Large Knowledge Bases
Ontological Conjunctive Query Answering over Large Knowledge BasesOntological Conjunctive Query Answering over Large Knowledge Bases
Ontological Conjunctive Query Answering over Large Knowledge Bases
 
Key-Key-Value Stores for Efficiently Processing Graph Data in the Cloud
Key-Key-Value Stores for Efficiently Processing Graph Data in the CloudKey-Key-Value Stores for Efficiently Processing Graph Data in the Cloud
Key-Key-Value Stores for Efficiently Processing Graph Data in the Cloud
 
Allegograph
AllegographAllegograph
Allegograph
 
Neo4j
Neo4jNeo4j
Neo4j
 
GraphREL: A Relational Graph Query Processor
GraphREL: A Relational Graph Query ProcessorGraphREL: A Relational Graph Query Processor
GraphREL: A Relational Graph Query Processor
 
XML Compression Benchmark
XML Compression BenchmarkXML Compression Benchmark
XML Compression Benchmark
 

Recently uploaded

Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Celine George
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxCarlos105
 
Science 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptxScience 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptxMaryGraceBautista27
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Celine George
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Celine George
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfMr Bounab Samir
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
 
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)lakshayb543
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for BeginnersSabitha Banu
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxDr.Ibrahim Hassaan
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYKayeClaireEstoconing
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Mark Reed
 
Q4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxQ4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxnelietumpap1
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfSpandanaRallapalli
 

Recently uploaded (20)

Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
 
Science 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptxScience 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptx
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for Beginners
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptx
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
 
Q4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxQ4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptx
 
Raw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptxRaw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptx
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptxYOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
 

Dependable Cardinality Forecast for XQuery

  • 1. Dependable Cardinality Forecasts for XQuery Jens Teubner, ETH (formerly IBM Research) Torsten Grust, U T¨bingen (formerly TUM) u Sebastian Maneth, UNSW and NICTA Sherif Sakr, UNSW and NICTA c Systems Group — Department of Computer Science — ETH Z¨rich u August 26, 2008
  • 2. Cardinality Estimation for XQuery The feature-richness and semantics of the language make cardinality estimation for XQuery notoriously hard. for $d in doc ("forecast.xml")/descendant::day let $day := $d/@t let $ppcp := data ($d/descendant::ppcp) return if ($ppcp > 50) then ("rain likely on", $day, "chance of precipitation:", $ppcp) else ("no rain on", $day) for iteration sequence construction conditionals (if-then-else) ... existential quantification (XPath is not a focus of this work.) August 26, 2008 Systems Group — Department of Computer Science — ETH Z¨rich u 2
  • 3. Cardinality Estimation for XQuery The feature-richness and semantics of the language make cardinality estimation for XQuery notoriously hard. for $d in doc ("forecast.xml")/descendant::day card1 = ? let $day := $d/@t let $ppcp := data ($d/descendant::ppcp) return if ($ppcp > 50) card2 = ? then ("rain likely on", $day, card3 = ? "chance of precipitation:", $ppcp) else ("no rain on", $day) card4 = ? for iteration sequence construction conditionals (if-then-else) ... existential quantification (XPath is not a focus of this work.) → Goal: Compute subexpression-level cardinalities cardi . August 26, 2008 Systems Group — Department of Computer Science — ETH Z¨rich u 2
  • 4. Idea: Perform cardinality estimation on relational plan equivalents for XQuery. relational cardinality estimation (System R) + existing work on XPath estimation + histograms for value predicates = cardinality estimation for XQuery Build on Pathfinder’s XQuery-to-relational algebra compiler.1 → tuple count ≡ XQuery item count Cardinality information for each subexpression. 1 http://www.pathfinder-xquery.org/ August 26, 2008 Systems Group — Department of Computer Science — ETH Z¨rich u 3
  • 5. πiter:outer,pos:pos1,item pos1: ord,pos outer Example (Plan details not of interest today.) 2 iter=inner · ∪ 4 1 3 πiter,pos:pos1,item for $d in doc ("forecast.xml")/descendant::day πiter,pos:pos1,item pos1: ord,pos iter let $day := $d/@t pos1: ord,pos iter · ∪ πiter,pos:1,ord:1, let $ppcp := data ($d/descendant::ppcp) · ∪ item:"rain..." · ∪ return 2 πiter,pos:1,ord:1, πiter,pos,item,ord:2 ∪· if ($ppcp > 50) 3 item:"no rain..." πiter,pos:1,ord:3, item:"chance..." then ("rain likely on", $day, πiter,pos,item,ord:2 πiter,pos,item,ord:4 "chance of precipitation:", $ppcp) iter1=iter iter1=iter iter=iter1 else ("no rain on", $day) 4 δ πiter σres res:(item,item1) Pathfinder compiles XQuery iter1=iter πiter1:iter, πiter1:iter,pos,item with arbitrary nesting. πiter1:iter, pos1:1,item1:50 item pos,item pos: item iter Maintain correspondence to πiter pos: item iter item:attribute::t(item) item:descendant::ppcp(item) original query if back-end is πiter:inner,item πinner,outer:iter, ord:pos not relational. inner: iter,pos 1 pos: item iter Derive estimates based on an item:descendant::day(item) inference rule set. docitem πiter,item:"forecast.xml" iter 1
  • 6. Relational XQuery Cardinality Estimation Apply System R-style estimation to relational XQuery plans, e.g., Disjoint union: Cartesian product: · |q1 ∪ q2 | = |q1 | + |q2 | |q1 × q2 | = |q1 | · |q2 | Equi-join:    |q1 | · |q2 | if there are indexes on  max {|a| , |b| } both join columns,     idx idx  |q1 q2 | = |q1 | · |q2 | if there is only an index a=b |a|idx on column a,        |q1 | · |q2 | · 1/10 otherwise  |c|idx : Number of unique values in index on column c. August 26, 2008 Systems Group — Department of Computer Science — ETH Z¨rich u 5
  • 7. Relational XQuery Cardinality Estimation Apply System R-style estimation to relational XQuery plans, e.g., Disjoint union: Cartesian product: · |q1 ∪ q2 | = |q1 | + |q2 | |q1 × q2 | = |q1 | · |q2 | Equi-join:  |q1 | · |q2 | if there are indexes on    max {|a| , |b| }     idx idx both join columns, ?  |q1 q2 | = |q1 | · |q2 | if there is only an index a=b       |a|idx on column a, ?  |q1 | · |q2 | · 1/10 otherwise  Our joins typically operate over computed relations. |c|idx : Number of unique values in index on column c. August 26, 2008 Systems Group — Department of Computer Science — ETH Z¨rich u 5
  • 8. Abstract Domain Identifiers A simple form of data flow analysis provides the information needed. Introduce abstract domain identifiers α, β, . . . as placeholders for the active runtime domain for each column c. (Read c α as “column c contains values from domain α.”) Estimate the size α of each domain α, e.g.,2 dom a: b1 ,...,bn (q) ⊇ dom (q) ∪ aα ∧ α =! |q| . Identify inclusion relationships α β between domains, e.g., aα ∈ dom (q) ∧ aβ ∈ dom (σ··· (q)) ⇒ β α . 2 Operator a: b1 ,...,bn introduces a new key column (holding row numbers). August 26, 2008 Systems Group — Department of Computer Science — ETH Z¨rich u 6
  • 9. Abstract Domain Identifiers Use abstract domain information for cardinality estimation. E.g., “foreign key” join: aα ∈ dom (q1 ) bβ ∈ dom (q2 ) α β |q1 | · |q2 | |q1 q2 | = a=b β Domain inclusion guarantees that each tuple in q1 finds (at least one) join partner in q2 . Other examples: |q1 q2 | = |q1 | − |q2 | if q2 is a subset of q1 . |q1 q2 | = 0 if q1 is a subset of q2 . |q1 q2 | = |q1 | if q1 and q2 are disjoint. August 26, 2008 Systems Group — Department of Computer Science — ETH Z¨rich u 7
  • 10. Interfacing with XPath—Projection Paths Track XPath navigation by means of projection paths3 a a b a b c c:child::*(b) 1 γa γb 1 γa b c d 2 γb 1 γa γc 1 γa γd 3 γd e 3 γd γe q1 q2 b⇒p ∈ path (q1 ) ⇒ c⇒p/child::* ∈ path (q2 ) Step operator makes XPath navigation explicit in relational plans (compiles to join on SQL back-ends). 3 A. Marian and J. Sim´ on. Projecting XML Documents. VLDB 2003. e August 26, 2008 Systems Group — Department of Computer Science — ETH Z¨rich u 8
  • 11. Interfacing with XPath—Cardinality Inference bα bα a a b a b c c:child::*(b) 1 γa γb 1 γa b c d 2 γb 1 γa γc α α 1 γa γd 3 γd e 3 γd γe q1 q2 Cardinality: fn:count (p/child::*) |q2 | = |q1 | · fn:count (p) = |q1 | · Prchild::* (p) fanout (here: 4/3) Domain Sizes: fn:count (p [ child::* ]) α = α · fn:count (p) = α · Pr[child::*] (p) selectivity (here: 2/3) Any XPath estimator that provides Prp2 (p1 ) and Pr[p2 ] (p1 ) will do. Our prototype uses a simple Data Guide-based implementation. August 26, 2008 Systems Group — Department of Computer Science — ETH Z¨rich u 9
  • 12. πiter:outer,pos:pos1,item pos1: ord,pos outer Back to our Example Plan 2 iter=inner · ∪ 4 πiter,pos:pos1,item 3 πiter,pos:pos1,item pos1: ord,pos iter res:(item,item1) pos1: ord,pos iter · ∪ πiter,pos:1,ord:1, · ∪ · ∪ iter1=iter item:"rain..." πiter,pos,item,ord:2 ∪· πiter,pos:1,ord:1, πiter1:iter, item:"no rain..." πiter,pos:1,ord:3, item:"chance..." pos1:1,item1:50 item πiter,pos,item,ord:2 πiter,pos,item,ord:4 iter1=iter iter1=iter iter=iter1 πiter pos: item iter δ item:descendant::ppcp(item) πiter σres πiter:inner,item res:(item,item1) iter1=iter πiter1:iter, πiter1:iter,pos,item pos,item πiter1:iter, pos1:1,item1:50 item pos: item iter πiter pos: item iter item:attribute::t(item) item:descendant::ppcp(item) πiter:inner,item πinner,outer:iter, ord:pos inner: iter,pos 1 pos: item iter item:descendant::day(item) docitem a: Retrieve typed values for node identifiers πiter,item:"forecast.xml" in column a (atomization). iter 1
  • 13. πiter:outer,pos:pos1,item pos1: ord,pos outer Back to our Example Plan 2 iter=inner · ∪ 4 πiter,pos:pos1,item 3 πiter,pos:pos1,item pos1: ord,pos iter res:(item,item1) pos1: ord,pos iter · ∪ πiter,pos:1,ord:1, · ∪ · ∪ iter1=iter item:"rain..." πiter,pos,item,ord:2 ∪· πiter,pos:1,ord:1, πiter1:iter, item:"no rain..." πiter,pos:1,ord:3, item:"chance..." pos1:1,item1:50 item πiter,pos,item,ord:2 πiter,pos,item,ord:4 iter1=iter iter1=iter iter=iter1 πiter pos: item iter Projection path: δ item:descendant::ppcp(item) item⇒···/descendant::ppcp ∈ path πiter ... (q) σres πiter:inner,item Cardinality (uses fanout): res:(item,item1) item:ax::nt(item) (q) = |q|·Prax::nt (· · · ) πiter1:iter,pos,item iter1=iter πiter1:iter, pos,item πiter1:iter, pos1:1,item1:50 item pos: item iter πiter pos: item iter item:attribute::t(item) item:descendant::ppcp(item) πiter:inner,item πinner,outer:iter, ord:pos inner: iter,pos 1 pos: item iter item:descendant::day(item) docitem a: Retrieve typed values for node identifiers πiter,item:"forecast.xml" in column a (atomization). iter 1
  • 14. πiter:outer,pos:pos1,item pos1: ord,pos outer Back to our Example Plan 2 iter=inner · ∪ 4 πiter,pos:pos1,item 3 πiter,pos:pos1,item pos1: ord,pos iter res:(item,item1) pos1: ord,pos iter · ∪ πiter,pos:1,ord:1, · ∪ · ∪ iter1=iter item:"rain..." πiter,pos,item,ord:2 ∪· πiter,pos:1,ord:1, πiter1:iter, item:"no rain..." πiter,pos:1,ord:3, item:"chance..." pos1:1,item1:50 item πiter,pos,item,ord:2 πiter,pos,item,ord:4 iter1=iter iter1=iter iter=iter1 πiter pos: item iter iterα Projection path: δ item:descendant::ppcp(item) item⇒···/descendant::ppcp ∈ path πiter ... (q) σres πiter:inner,item iter α Cardinality (uses fanout): res:(item,item1) item:ax::nt(item) (q) = |q|·Prax::nt (· · · ) πiter1:iter,pos,item iter1=iter πiter1:iter, pos,item πiter1:iter, pos1:1,item1:50 item Domain iter πiter pos: item inclusion: pos: item iter iterα ∈ dom item:attribute::t(item) ... (q) ; α α item:descendant::ppcp(item) πiter:inner,item πinner,outer:iter, ord:pos Domain size (uses step selectivity): inner: iter,pos α = α 1· Pr[ax::nt] (· · · ) pos: item iter item:descendant::day(item) docitem a: Retrieve typed values for node identifiers πiter,item:"forecast.xml" in column a (atomization). iter 1
  • 15. πiter:outer,pos:pos1,item pos1: ord,pos outer Back to our Example Plan 2 iter=inner · ∪ 4 πiter,pos:pos1,item 3 πiter,pos:pos1,item pos1: ord,pos iter res:(item,item1) pos1: ord,pos iter · ∪ π iter1α iter1=iter “Foreign key” iter,pos:1,ord:1, ∪· join: item:"rain..." ∪· iterα q1 q2 rain..." α item:"no = πiter,pos,item,ord:2 πiter,pos:1,ord:1, |q1 |·|q2 | (since α ∪· α) πiter1:iter, πiter,pos:1,ord:3, iter1=iter item:"chance..." pos1:1,item1:50 item πiter,pos,item,ord:2 πiter,pos,item,ord:4 iter1=iter iter1=iter iter=iter1 πiter pos: item iter iterα Projection path: δ item:descendant::ppcp(item) item⇒···/descendant::ppcp ∈ path πiter ... (q) σres πiter:inner,item iter α Cardinality (uses fanout): res:(item,item1) item:ax::nt(item) (q) = |q|·Prax::nt (· · · ) πiter1:iter,pos,item iter1=iter πiter1:iter, pos,item πiter1:iter, pos1:1,item1:50 item Domain iter πiter pos: item inclusion: pos: item iter iterα ∈ dom item:attribute::t(item) ... (q) ; α α item:descendant::ppcp(item) πiter:inner,item πinner,outer:iter, ord:pos Domain size (uses step selectivity): inner: iter,pos α = α 1· Pr[ax::nt] (· · · ) pos: item iter item:descendant::day(item) docitem a: Retrieve typed values for node identifiers πiter,item:"forecast.xml" in column a (atomization). iter 1
  • 16. πiter:outer,pos:pos1,item card2 : 4104 pos1: ord,pos outer Back to our Example Plan 2 iter=inner card3 : 3540 · ∪ 4 card4 : 564 πiter,pos:pos1,item 3 πiter,pos:pos1,item pos1: ord,pos iter res:(item,item1) pos1: ord,pos iter · ∪ π iter1α iter1=iter “Foreign key” iter,pos:1,ord:1, ∪· join: item:"rain..." ∪· iterα q1 q2 rain..." α item:"no = πiter,pos,item,ord:2 πiter,pos:1,ord:1, |q1 |·|q2 | (since α ∪· α) πiter1:iter, πiter,pos:1,ord:3, iter1=iter item:"chance..." pos1:1,item1:50 item πiter,pos,item,ord:2 πiter,pos,item,ord:4 iter1=iter iter1=iter iter=iter1 πiter pos: item iter iterα Projection path: δ item:descendant::ppcp(item) item⇒···/descendant::ppcp ∈ path πiter ... (q) σres πiter:inner,item iter α Cardinality (uses fanout): res:(item,item1) item:ax::nt(item) (q) = |q|·Prax::nt (· · · ) πiter1:iter,pos,item iter1=iter πiter1:iter, pos,item πiter1:iter, pos1:1,item1:50 item Domain iter πiter pos: item inclusion: pos: item iter iterα ∈ dom item:attribute::t(item) ... (q) ; α α item:descendant::ppcp(item) πiter:inner,item πinner,outer:iter, card1 : 990 ord:pos Domain size (uses step selectivity): inner: iter,pos α = α 1· Pr[ax::nt] (· · · ) pos: item iter item:descendant::day(item) docitem a: Retrieve typed values for node identifiers πiter,item:"forecast.xml" in column a (atomization). iter 1
  • 17. Forecasting New Zealand’s Weather The obtained cardinalities can be mapped back to predict item counts for corresponding XQuery expressions:4 for $d in doc ("forecast.xml")/descendant::day card1 = 990/990 let $day := $d/@t let $ppcp := data ($d/descendant::ppcp) return if ($ppcp > 50) card2 = 4104/3402 then ("rain likely on", $day, card3 = 3540/2370 "chance of precipitation:", $ppcp) else ("no rain on", $day) card4 = 564/1032 Value statistics based on histograms ( paper) Inaccuracy is mainly due to correlations in the data. Rain in the morning likely means rain in the afternoon, too. 4 estimated/observed, based on data taken for New Zealand two weeks ago. August 26, 2008 Systems Group — Department of Computer Science — ETH Z¨rich u 11
  • 18. More Realistic Queries: W3C XQuery Use Cases 0.01 0.1 1 10 Prototype implementation XMP Q4 based on Pathfinder XMP Q5 XMP Q6 For each subexpression: XMP Q8 XMP Q9 estimated cardinality XMP Q10 XMP Q11 observed cardinality SGML Q3 SGML Q4 Diameter indicates data SGML Q8a point “stacking” SGML Q8b R Q2 Plan root: filled circle R Q3 R Q10 R Q11 Applicable to “real” queries R Q13 R Q14 Recovery from intermediate R Q15 mis-estimations R Q17 e.g., existential semantics 0.01 0.1 1 10 estimated cardinality/observed cardinality August 26, 2008 Systems Group — Department of Computer Science — ETH Z¨rich u 12
  • 19. Wrap-Up Cardinality estimation framework for XQuery Subexpression-level estimates for arbitrary XQuery expressions Based on Pathfinder’s XQuery-to-relational algebra compiler relational cardinality estimation (System R) + existing work on XPath estimation + histograms for value predicates = cardinality estimation for XQuery High-quality estimates for realistic XQuery workloads Robust with respect to intermediate errors Pluggable and extensible e.g., XPath estimation subsystem, positional predicates August 26, 2008 Systems Group — Department of Computer Science — ETH Z¨rich u 13