Update Statistics

                                Olivier Bourdin
                    olivier.bourdin@fr.ibm.com


                    Wednesday, 3 October 2012
                     User Group Informix France
Overview
 Brief Review and History
 What’s changed?
 – 11.10, 11.50
 – 11.70 – “Smart Statistics”
 11.70 FAQs
 – Do I need to do anything different?
 – Did the update statistics update any stats?
 – Update statistics and reoptimization


Why are statistics important?

 Choosing the right QUERY PATH determines how fast you get
your results.
 Choosing the Wrong Path can be like going around the world to
get to your neighbor’s.


           • Expensive to go around the world.
           • Takes too long.




Query Optimization Process

  Examine all tables (table A, table B, table C)
   – Examine selectivity of every filter (where clauses)
   – Determine if indexes can be used for filters, order by, group by
   – Find the best way to scan a table -- sequentially or by an index
  Identify Join Pairs (AB, AC, BA, BC, CA, CB)
   – Find best join method (nested loop, hash, or sort merge)
   – Decide which indexes are best for the join
   – Calculate the cost of the join
  Repeat for each additional table (ABC, ACB, BAC, ...)
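
As a rough illustration of the enumeration above, this Python sketch lists the ordered join pairs and the full join orders for three tables. The table names A, B, C come from the slide; the function names are hypothetical, and a real optimizer prunes this search space rather than listing everything.

```python
from itertools import permutations

def join_pairs(tables):
    """Ordered pairs (outer, inner): AB and BA are distinct candidates."""
    return ["".join(p) for p in permutations(tables, 2)]

def join_orders(tables):
    """Every ordering of all the tables, e.g. ABC, ACB, BAC, ..."""
    return ["".join(p) for p in permutations(tables)]

tables = ["A", "B", "C"]
print(join_pairs(tables))
print(join_orders(tables))
```

The number of orders grows factorially with the number of tables, which is why the cost estimates need to be cheap to compute.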

Estimating costs: need data!

   Find the cheapest/lowest cost path.
   – Cost = I/O cost + Weight * (CPU cost)
   – I/O -- disk access
   – CPU -- Rows processed
   Estimate costs
   – Filters -- Which indexes to use?
   – Joins -- Nested Loop, Hash, or Sort Merge?
   – Eliminate redundant pairs?
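
The cost formula can be tried out with a small Python sketch. The candidate paths, their I/O and CPU figures, and the weight of 0.03 are made-up values for illustration only; they are not taken from the server.

```python
def path_cost(io_cost, cpu_cost, weight):
    """Cost = I/O cost + Weight * (CPU cost), as on the slide."""
    return io_cost + weight * cpu_cost

def cheapest(paths, weight):
    """Return the name of the candidate path with the lowest cost."""
    return min(paths, key=lambda name: path_cost(*paths[name], weight))

# Hypothetical candidates: name -> (I/O cost, CPU cost)
candidates = {
    "sequential scan": (2048.0, 28672.0),
    "index path": (36.0, 1024.0),
}
print(cheapest(candidates, weight=0.03))
```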



Filter selectivity

   Selectivity is the fraction of rows selected as a result of
   a filter (number between 0 and 1)

     Expression                    Filter Selectivity
     indexed_col = literal value   F = 1/(number of distinct keys in index)
     indexed_col > literal value   F = (2nd max - literal value)/(2nd max - 2nd min)
     NOT expression                F = 1 - F(expression)
     expr1 AND expr2               F = F(expr1) x F(expr2)
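
A minimal Python sketch of these selectivity rules follows. For the range filter it takes the part of the key range that lies above the literal, (2nd max - literal)/(2nd max - 2nd min). The function names are hypothetical, and clamping the result to the range 0 to 1 is an assumption added here, not something stated on the slide.

```python
def sel_equal(distinct_keys):
    """indexed_col = literal: 1 / (number of distinct keys in index)."""
    return 1.0 / distinct_keys

def sel_greater(literal, second_min, second_max):
    """indexed_col > literal: fraction of the key range above the literal."""
    frac = (second_max - literal) / (second_max - second_min)
    return min(1.0, max(0.0, frac))  # assumption: clamp when literal is out of range

def sel_not(f):
    """NOT expression: 1 - F(expression)."""
    return 1.0 - f

def sel_and(f1, f2):
    """expr1 AND expr2: F(expr1) x F(expr2)."""
    return f1 * f2

# Equality on a column with 28 distinct keys, combined with a range filter.
f_eq = sel_equal(28)
f_gt = sel_greater(75, 0, 100)
print(round(f_eq, 7), f_gt, sel_and(f_eq, f_gt))
```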




How do we influence Query Optimization?
  OPTCOMPIND
  Optimizer directives, Optimization Goals
  Update Statistics
   – Collect information for the optimizer
   – Table nrows, npused; Index Statistics -- LOW
   – Data Distributions -- MEDIUM & HIGH
   – Compile Stored Procedures



Where are the stats stored?

  systables (Low)
   – nrows, npused
  sysindices (Low)
   – leaves, levels, nunique, clust
  syscolumns (Low)
   – colmin, colmax
  sysfragments (Low)
   – nrows, npused,
   – For index partitions: levels, clust
  sysdistrib (Medium or High)
   – Can view with dbschema -hd




View Query Path

  Set explain on
   – Can be set in session
  Explain Directive
   – Can be embedded in the query
     FOREACH SELECT {+EXPLAIN } order_num INTO p_num
          FROM orders
          WHERE customer_num = 104 ORDER BY order_num

  xtrace Debug
   – Support may ask you to turn this on


Debugging with xtrace

  To “see” the statistics information being used for query
  optimization
         Example:
         xtrace heavy -c XTF_OPTMZR -f XTF_DEBUG
         xtrace size 10000
         xtrace on


         Use “xtrace fview” or “xtrace view” to view traces.
         “xtrace fview” includes timestamps.
         Use “xtrace info” to display current xtrace settings.
         Use “xtrace --” for xtrace usage info.


Xtrace: example
 Before Update Statistics:
 f1 31310 16   get_distrib(): distrib not found for table c col zipcode
 f1 7401 16    selec1: op = 46(OP_EQ), defsel = 0.1 sel = 0.0434783 …
 …
 f2 1207 16    oprowspages(tab = c, nrows = 28, npages = 2)
 f2 13217 16   opmix_iscancost(numrows=1.21739,npages=2,pagesread=1.13988)
 f2 13225 16   opmix_iscancost(scancost=1.1764,indexcost=1.08, …, iscancost=2.2564)



 After Update Statistics:
 f1 31310 18 get_distrib(): distrib found for table c col zipcode
 f1 7401 18 selec1: op = 46(OP_EQ), defsel = 0.1 sel = 0.0357143 …
 …
 f2 1207 18 oprowspages(tab = c, nrows = 28672, npages = 2048)
 …
 f2 2237 18 dpages = 24576 lpages = 84 nlevels = 2
 f2 1871 18 dcost = 33.72 seek 0 keyonly = TRUE
 f2 1896 18 iscancost(c, zip_ix) cost = 35.72
 f2 13217 18 opmix_iscancost(numrows=1024,npages=2048,pagesread=805.977)
 f2 13225 18 opmix_iscancost(scancost=836.697,indexcost=35.72, …, iscancost=872.417)



Xtrace (after ... cont’d)

  …
  f2 1207 18    oprowspages(tab = c, nrows = 28672, npages = 2048)
  f2 1320 18    opscantabcost(c) npages = 2048, nrows = 28672, cost = 2909.16
  f2 1527 18    opcartcost(c) cost = 2909.16 initcost = 0
  f2 1988 18    index_info(): index 100_1 fullness 0.75 recs_per_node 128 keylen 4
  …
  f2 2237 18    dpages = 2048 lpages = 187 nlevels = 3
  f2 10863 18   idxtree_travcost s 3.48772e-05 nlevels 3 lpages .. dpages .. mempages 512
  f2 14448 18        seek_factor 6 clust 2048 clust_scale 0 seek 0
  …
  f2 1727 18    opidxcost(c, 100_1) = 0.745763
  f1 16094 18   index 100_1 considered, icost 0.745763, istart 0.0078125, fltragg 0
  f1 16324 18   indexp(): best index path: idx 100_1 icost = 0.745763 idx_flags 2
  f3 3462 18    idx cost = 0.745763 initcost = 0.0078125 totalcost = 17.1526
  f3 3465 18    outer size = 23 join size = 1
  f3 8468 18    build inner table, init cost is 13.5745, join cost is 4.24268
  f3 8568 18    build outer table, init cost is 4.24268, join cost is 13.5745



sqexplain.out (before)

  select c.city, c.state, o.ship_date from customer c, orders o
  where c.customer_num = o.customer_num and c.state = ? and
  c.zipcode = ?

  Estimated Cost: 3
  Estimated # of Rows Returned: 1

    1) informix.c: INDEX PATH
          Filters: informix.c.state = 'AZ'
      (1) Index Name: informix.zip_ix
          Index Keys: zipcode   (Serial, fragments: ALL)
          Lower Index Filter: informix.c.zipcode = '85016'

    2) informix.o: INDEX PATH
      (1) Index Name: informix. 102_4
          Index Keys: customer_num    (Serial, fragments: ALL)
          Lower Index Filter: informix.c.customer_num = informix.o.customer_num
  NESTED LOOP JOIN


sqexplain.out (after)

  select c.city, c.state, o.ship_date from customer c, orders o
  where c.customer_num = o.customer_num and c.state = ? and
  c.zipcode = ?

  Estimated Cost: 19
  Estimated # of Rows Returned: 1

  Note: Customer has 28672 rows. Orders has 23 rows.

    1) informix.o: SEQUENTIAL SCAN
    2) informix.c: INDEX PATH

          Filters: (informix.c.zipcode = '85016' AND informix.c.state = 'AZ' )

      (1) Index Name: informix. 100_1
          Index Keys: customer_num    (Serial, fragments: ALL)
          Lower Index Filter: informix.c.customer_num = informix.o.customer_num
  NESTED LOOP JOIN



Before 11.x

   Before 11.x
   – Update statistics low
   – Update statistics medium, high
        • Resolution, Confidence
   – Update statistics distributions only
   – Update statistics drop distributions
   – Update statistics for table, for procedure
   – Scripts, cron jobs
   –   Lots of guidelines
        • What to run update statistics on
        • Which update statistics to run
        • How to run update statistics


Guidelines

  Update statistics medium distributions only for all columns
  that do not have an index
  Update statistics high for columns that are the first key in an
  index
  Update statistics low for all columns in multicolumn indexes
  Run with PDQ for better performance (for table ONLY)
  Do not run with PDQ for update statistics for procedure
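
The first three guidelines can be turned into commands mechanically. The Python sketch below is one possible reading of them, not the script from any Informix tool: the function name, the sample table layout, and the choice to emit one HIGH command per leading column are all assumptions made for illustration.

```python
def stats_commands(table, columns, indexes):
    """Build UPDATE STATISTICS commands from the guidelines.

    columns: all column names of the table.
    indexes: one list of column names per index, leading column first.
    """
    indexed = {c for idx in indexes for c in idx}
    lead = []
    for idx in indexes:
        if idx[0] not in lead:
            lead.append(idx[0])
    no_index = [c for c in columns if c not in indexed]

    cmds = []
    if no_index:  # MEDIUM, distributions only, for columns without an index
        cmds.append(f"UPDATE STATISTICS MEDIUM FOR TABLE {table} "
                    f"({', '.join(no_index)}) DISTRIBUTIONS ONLY;")
    for col in lead:  # HIGH for the first key of each index
        cmds.append(f"UPDATE STATISTICS HIGH FOR TABLE {table} ({col});")
    for idx in indexes:  # LOW for all columns of multicolumn indexes
        if len(idx) > 1:
            cmds.append(f"UPDATE STATISTICS LOW FOR TABLE {table} "
                        f"({', '.join(idx)});")
    return cmds

# Hypothetical table layout, for illustration only.
for cmd in stats_commands(
        "orders",
        ["order_num", "customer_num", "ship_date", "po_num"],
        [["order_num"], ["customer_num", "ship_date"]]):
    print(cmd)
```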




Issues (before 11.x)

   Difficult to know when update statistics was run last
   Guidelines weren’t always well-understood
   People weren’t sure how to run update statistics
    – Accidentally over-wrote statistics by running HIGH first,
      then MEDIUM
    – Accidentally compiled stored procedures with PDQ
    – Ran Update Stats LOW twice (performance issue)

          Update statistics LOW for table tab1;
          Update statistics HIGH for table tab1 (col1, col2);

          What might be considered “missing” here?



11.10 Features

  11.10 Enhancements
   – Create index creates initial stats and distribution
     information for the leading column of the index
   – Enhance catalog information
      • What time was update statistics Low run?
      • What time were the distributions created?
      • How many rows were sampled for the distributions?
   – New “Sampling Size” option
   – Update statistics drop distributions ONLY
   – Auto Update Statistics Scheduler tasks



Help with Guidelines

  Use scheduler task “Auto Update Statistics
  Evaluation”
   – Scheduler task can be run “on-demand” using exectask()
     Execute function exectask('Auto Update Statistics Evaluation')

  Use script in Informix Technote (swg21137764)
   – UPDATE STATISTICS commands to allow the optimizer
     to work its best
        http://www-01.ibm.com/support/docview.wss?uid=swg21137764


  Use Art Kagel’s dostats (from IIUG)
AUS History

  First introduced in 11.10
   – Scheduler task “Auto Update Statistics Evaluation”
   – Scheduler task “Auto Update Statistics Refresh”
   – Uses the guidelines to determine the update statistics
     commands to run
  Enhancement to work with non-English Locales in
  11.50.xC6




AUS Scheduler Tasks
  Runs Update Statistics FOR TABLE commands

 UPDATE STATISTICS LOW FOR TABLE stores7:customer
 UPDATE STATISTICS HIGH FOR TABLE stores7:customer (customer_num, zipcode)
   RESOLUTION 0.500 DISTRIBUTIONS ONLY



  Runs with PDQ set to AUS_PDQ in sysadmin:ph_threshold

 > select * from ph_threshold where name = "AUS_PDQ";
 id           30
 name         AUS_PDQ
 task_name    Auto Update Statistics Refresh
 value        10
 value_type   NUMERIC
 description Update statistics executes with this PDQ priority.


AUS Parameters

  AUS_AGE            aus_evaluator
                     The statistics are rebuilt after specified days.
  AUS_CHANGE         aus_evaluator
                     The statistics are rebuilt after specified percentage
                     of data has changed.
  AUS_AUTO_RULES     aus_evaluator
                     1 or 0 – if “off”, only evaluates tables that already
                     have statistics.
  AUS_SMALL_TABLES   aus_evaluator
                     Tables containing fewer than this number of rows will
                     always have their statistics rebuilt.
  AUS_PDQ            aus_refresh_stats
                     Run Update Statistics with this PDQ setting.
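
One way to picture how the evaluator parameters interact is the Python sketch below. It is a simplified model under stated assumptions, not the logic of the aus_evaluator task: the threshold values are illustrative placeholders (the real ones live in sysadmin:ph_threshold), the rules are assumed to combine with OR, and a table without statistics is assumed to qualify whenever AUS_AUTO_RULES is on.

```python
from dataclasses import dataclass

@dataclass
class AusParams:
    # Illustrative values only; read the real ones from sysadmin:ph_threshold.
    age_days: int = 30            # AUS_AGE
    change_pct: float = 10.0      # AUS_CHANGE
    auto_rules: bool = True       # AUS_AUTO_RULES
    small_table_rows: int = 100   # AUS_SMALL_TABLES

def needs_refresh(p, nrows, stats_age_days, changed_pct, has_stats):
    """Decide whether a table's statistics should be rebuilt (assumed model)."""
    if not has_stats:
        return p.auto_rules       # rules off: only tables with statistics
    if nrows < p.small_table_rows:
        return True               # small tables are always rebuilt
    if stats_age_days >= p.age_days:
        return True               # statistics are too old
    return changed_pct >= p.change_pct

p = AusParams()
print(needs_refresh(p, nrows=10000, stats_age_days=45, changed_pct=0.0, has_stats=True))
```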


11.70 Features

  Smart Statistics
   – Default: AUTO_STAT_MODE 1
   – Default: STATCHANGE 10
   – Update Statistics command, when run, is not executed
     for index statistics and for table distribution if the
     STATCHANGE threshold has not been met

  Fragment-level Statistics
   – Not on by default
   – Not discussed in this presentation



11.70 Statistics Updated?

  Update Statistics info in database catalog tables
   – Look at ustlowts in systables
      • Updated when systables' nrows and npused are updated
        – this is done whenever update statistics command is run
        – STATCHANGE threshold is not looked at
   – Look at ustlowts in sysindices
      • Updated when index statistics are rebuilt/updated
   – Look at constr_time in sysdistrib
      • Updated when distribution statistics are rebuilt/updated


Example

  $ dbaccessdemo7 stores7 -nots

  select idxname, levels, leaves, nrows, nupdates, ndeletes, ninserts, ustlowts
  from sysindices where tabid = 100 and idxname = "zip_ix";

  idxname        zip_ix
  levels         1
  leaves         1.000000000000
  nrows          28.00000000000
  nupdates       0.00
  ndeletes       0.00
  ninserts       28.00000000000
  ustlowts       2012-04-03 22:54:56.00000

  zip_ix is the index on customer(zipcode). nupdates, ndeletes, and ninserts are
  the UDI counters for this index at the time of the update statistics low run.


  > select * from sysdistrib where tabid = 100;

  No rows found.

  dbaccessdemo7 did not create table distributions for the customer table.

Example (cont’d)
  > load from customer.unl insert into customer;

  199863 row(s) loaded.

  > select idxname, levels, leaves, nrows, nupdates, ndeletes, ninserts,
  > ustlowts from sysindices where tabid = 100 and idxname = "zip_ix";

  idxname       zip_ix
  levels        1
  leaves        1.000000000000
  nrows         28.00000000000
  nupdates      0.00
  ndeletes      0.00
  ninserts      28.00000000000
  ustlowts      2012-04-03 22:54:56.00000

  Index statistics for zip_ix are unchanged after 199,863 rows were inserted into
  the customer table. No update statistics command has been run.



Example (cont’d)

  > create index state_ix on customer(state);

  idxname   zip_ix                          idxname   state_ix
  levels    1                               levels    3
  leaves    1.000000000000                  leaves    556.0000000000
  nrows     28.00000000000                  nrows
  nupdates 0.00                             nupdates 0.00
  ndeletes 0.00                             ndeletes 0.00
  ninserts 28.00000000000                   ninserts 0.00
  ustlowts 2012-04-03 22:54:56.00000        ustlowts 2012-04-03 23:04:33.00000

      After inserting 199,863 rows into the customer table, create index
      state_ix on customer(state).
      -- No update statistics command has been run.
Example (cont’d)

     > select tabid, colno, mode, smplsize, rowssmpld, constr_time,
     > ustnrows, ustbuildduration, nupdates, ndeletes, ninserts
     > from sysdistrib where tabid = 100;

     tabid                    100
     colno                    8
     mode                     H
     smplsize                 199891.0000000
     rowssmpld                199891.0000000
     constr_time              2012-04-03 23:04:33.00000
     ustnrows                 199891.0000000
     ustbuildduration         0:00:00.00000
     nupdates                 0.00
     ndeletes                 0.00
     ninserts                 199891.0000000

     Distribution information for column state (colno 8) in the customer table.


Example (cont’d)

  > select partnum, nupdates, ndeletes, ninserts from sysmaster:sysptnhdr
  > where partnum in (select partn from sysfragments
  >                   where fragtype = "I" and indexname in ('state_ix', 'zip_ix'));

                 partnum          nupdates        ndeletes              ninserts
   zip_ix       1049092             0               0                   199891
   state_ix     1049100             0               0                        0

   > select partnum, nupdates, ndeletes, ninserts from sysmaster:sysptnhdr
   > where partnum = (select partnum from systables where tabid = 100);

                 partnum         nupdates         ndeletes            ninserts
   customer       1049069            0                  0               199891


            Actual partition page info, showing the UDI counters for the partition, since
            the partition was created – this is not the same as the UDI info in the catalogs,
            which are updated when statistics are updated.


OAT view of Statistics




OAT view (cont’d)




                    For customer table --
                    • Index zip_ix has exceeded STATCHANGE.
                    • Index state_ix has not.

Example (cont’d)

   > update statistics low for table customer;

   idxname   zip_ix BEFORE                   idxname   zip_ix AFTER
   levels    1                               levels    3
   leaves    1.000000000000                  leaves    505.0000000000
   nrows     28.00000000000                  nrows     199891.0000000
   nupdates 0.00                             nupdates 0.00
   ndeletes 0.00                             ndeletes 0.00
   ninserts 28.00000000000                   ninserts 199891.0000000
   ustlowts 2012-04-03 22:54:56.00000        ustlowts 2012-04-04 00:36:53.00000

   zip_ix index:
     • Index statistics updated.
     • Catalog UDI values updated.
     • sysindices ustlowts updated.
Example (cont’d)

  > update statistics low for table customer;
                               BEFORE                                      AFTER
  idxname   state_ix                            idxname   state_ix
  levels    3                                   levels    3
  leaves    556.0000000000                      leaves    556.0000000000
  nrows                                         nrows     199891.0000000
  nupdates 0.00                                 nupdates 0.00
  ndeletes 0.00                                 ndeletes 0.00
  ninserts 0.00                                 ninserts 0.00
  ustlowts 2012-04-03 23:04:33.00000            ustlowts 2012-04-03 23:04:33.00000

  state_ix index:
    • Index statistics unchanged.
    • Catalog UDI values unchanged.
    • sysindices ustlowts unchanged.

Example (cont’d)

      > select tabname, tabid, nrows, created, ustlowts
      > from systables where tabid = 100;


      tabname        customer
      tabid          100
      nrows          199891.0000000
      created        04/03/2012
      ustlowts       2012-04-04 00:36:53.00000


               The systables information is always updated when update
               statistics for table stats are run, regardless of
               STATCHANGE.



Example

    Update Statistics LOW for table tab1;
    Update Statistics HIGH for table tab1 (col1, col2);


    Before 11.70
     – You should put “Distributions Only” in the Update
       Statistics HIGH command to avoid collecting index
       statistics again
    After 11.70
     – Doesn’t matter since index statistics will only be
       updated if STATCHANGE has been met for the
       index


Sysmaster query for %change

  SELECT colname as name, 'Column' as type, constr_time::datetime year to second as build_date,
  rowssmpld::bigint as sample, d.ustnrows::bigint as nrows,
  case when d.mode = 'M' then 'Medium' when d.mode = 'H' then 'High' end as mode,
  resolution, confidence, ustbuildduration as build_duration,
  (table_counter.udi_counter - d.ninserts - d.nupdates - d.ndeletes) as udi_counter,
  CASE WHEN d.ustnrows=0 and
    (table_counter.udi_counter - d.ninserts - d.nupdates - d.ndeletes) = 0 THEN 0.00
         WHEN d.ustnrows=0 and
    (table_counter.udi_counter - d.ninserts - d.nupdates - d.ndeletes) != 0 THEN -1
  ELSE ROUND((table_counter.udi_counter - d.ninserts - d.nupdates -
                   d.ndeletes)/d.ustnrows * 100,2)
  END as change
  FROM sysdistrib d, syscolumns c,
  ( select SUM(nupdates + ndeletes + ninserts) as udi_counter from sysmaster:sysptnhdr
      where partnum in (select partn from sysfragments where tabid = 100 and fragtype='T'
      union select partnum as partn from systables where tabid = 100) ) as table_counter
  WHERE d.tabid=100 and c.tabid=100 and d.colno = c.colno and d.seqno = 1

    UNION


Sysmaster query for %change

  -- Continuing query started on previous slide
  SELECT idxname as name, MIN('Index') as type, MIN(ustlowts)::datetime year to second as
  build_date, MIN(0) as sample, SUM(f.nrows)::bigint as nrows, MIN('Low') as mode,
  MIN(0) as resolution, MIN(0) as confidence, SUM(i.ustbuildduration) as build_duration,
  SUM(NVL(p.ninserts,0) + NVL(p.nupdates,0) + NVL(p.ndeletes,0)) -
  SUM(NVL(f.ninserts,0) + NVL(f.nupdates,0) + NVL(f.ndeletes,0)) as udi_counter,
  CASE WHEN SUM(f.nrows)=0 and (SUM(NVL(p.ninserts,0) + NVL(p.nupdates,0)
   + NVL(p.ndeletes,0)) - SUM(NVL(f.ninserts,0) + NVL(f.nupdates,0) + NVL(f.ndeletes,0))) = 0
  THEN 0.00
        WHEN SUM(f.nrows)=0 and (SUM(NVL(p.ninserts,0) + NVL(p.nupdates,0)
   + NVL(p.ndeletes,0)) - SUM(NVL(f.ninserts,0) + NVL(f.nupdates,0) + NVL(f.ndeletes,0))) != 0
  THEN -1
  ELSE ROUND((SUM(NVL(p.ninserts,0) + NVL(p.nupdates,0) + NVL(p.ndeletes,0))
    - SUM(NVL(f.ninserts,0) + NVL(f.nupdates,0) + NVL(f.ndeletes,0)))/SUM(f.nrows) * 100,2)
  END as change
  FROM sysindices i, sysmaster:sysptnhdr p, sysfragments f
  WHERE i.idxname = f.indexname
               AND i.tabid = 100 AND i.tabid = f.tabid AND f.partn = p.partnum
  GROUP BY i.idxname ORDER BY change DESC
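
The CASE expression that produces the change column is easier to follow in isolation. This Python sketch mirrors it: the function and parameter names are hypothetical, udi_now stands for the current partition counters from sysmaster:sysptnhdr, and udi_at_build stands for the counters saved in the catalog when the statistics were last built. The zip_ix figures come from the earlier example slides.

```python
def pct_change(udi_now, udi_at_build, nrows_at_build):
    """Percent of rows changed since the statistics were built.

    Mirrors the CASE expression: 0.00 when an empty table has not changed,
    -1 when an empty table has since changed, otherwise a rounded percentage.
    """
    delta = udi_now - udi_at_build
    if nrows_at_build == 0:
        return 0.0 if delta == 0 else -1
    return round(delta / nrows_at_build * 100, 2)

# zip_ix: partition ninserts is 199891, catalog ninserts and nrows are both 28.
print(pct_change(199891, 28, 28))
# An index with no activity since its statistics were built reports no change.
print(pct_change(0, 0, 199891))
```

A value far above the STATCHANGE threshold, as for zip_ix here, is what marks the index as due for a refresh.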


Table STATCHANGE value

   Default STATCHANGE applies if not set for table
   Can be set at session level using set environment
    – Set environment statchange '5';
   Can set STATCHANGE when creating table
   Can alter table to set STATCHANGE
    – Alter table customer statchange 5;

   select tabname, NVL ( statchange, (select cf_effective from
   sysmaster:sysconfig where cf_name = 'STATCHANGE') ) as statchange
   from systables where tabname = "customer";


FORCE option

   Can add “FORCE” to any update statistics
   command to ignore STATCHANGE
   When you upgrade to 11.70
    – Existing partition pages will have UDI counters added
      (UDI values are 0)
    – Catalog tables sysfragments (for indexes) and
      sysdistrib (for table column data distributions) will
      have UDI counters added (values are 0)
    – What does this mean for Update Statistics?
       • FORCE: Execute even if NO change
       • STATCHANGE 0: Execute if any amount of change (non-zero)


FORCE option (cont’d)

     Add “FORCE” to end of update statistics
     command to get legacy behavior (ignore
     STATCHANGE)
     FORCE
     – Execute even if NO change
     – Sets sysdistrib nupdates, ndeletes, ninserts to 0 –
       same behavior isn’t seen with sysfragments
       nupdates, ndeletes, ninserts
     STATCHANGE 0
     – Execute if non-zero amount of change
     – Set environment STATCHANGE '0'
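
The difference between FORCE and STATCHANGE 0 can be summarized as a small decision function. This Python sketch is a model of the behavior described on these slides, not server code: the function name is hypothetical, and treating the threshold as met when the change is greater than or equal to STATCHANGE is an assumption.

```python
def will_refresh(change_pct, statchange, force=False):
    """Model of whether index or distribution statistics get rebuilt."""
    if force:
        return True        # FORCE: execute even if nothing changed
    if change_pct == 0:
        return False       # no change: skipped, even with STATCHANGE 0
    return change_pct >= statchange  # assumption: "met" means >= threshold

print(will_refresh(0, 0))              # STATCHANGE 0, no change
print(will_refresh(0.01, 0))           # STATCHANGE 0, any non-zero change
print(will_refresh(0, 10, force=True)) # FORCE
```

In this model, a freshly upgraded instance whose UDI counters are all 0 refreshes nothing unless FORCE is used.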


Stored Procedures

   Not affected by STATCHANGE -- Update
   statistics FOR PROCEDURE
   SQL statements in SPL are optimized
    – When SPL is created or on first execution
    – When dependent table or indexes are altered
    – When statistics of dependent tables change

       In 11.70, this applies every time update statistics is run for a
       table: systables' npused, nrows, and ustlowts are updated (even if
       index statistics or distribution statistics are not updated because
       STATCHANGE has not been met).




Update Statistics Low - Summary

  Update statistics low performance improvement feature takes effect
  when :
      • USTLOW_SAMPLE is set to 1
      • the index has 100,000 or more leaf pages
      • Detached index


  USTLOW_SAMPLE
      • New ONCONFIG parameter, documented in 11.70.xC4
      • Controls use of sampling (new feature) to collect index statistics during
        update statistics
      • 0 or 1 (on) / Default value is 0 (off)
      • Can be updated with onmode -wm/wf
      • Can be set at session-level using SET ENVIRONMENT
          – Set Environment USTLOW_SAMPLE '0' / '1' / 'on' / 'off'


Update Statistics Low – Why?

  Update Statistics LOW takes too long when gathering statistics for large
  indexes
       • Entire index is read in sequence
       • Each leaf page of an index must be read individually (separate I/O)
       • Some customers do not run the command because it does not fit in the
         maintenance window
       • On a single large table (billions of rows and many indexes), command
         can take over 3 days


  New Feature Solution: USTLOW_SAMPLE
       • Use sampling to reduce time required to gather index statistics
       • Many samples are taken, and index statistics are calculated based on
         statistics from the samples




Update Statistics Low - Details

   Update statistics low gathers the following index statistics
        •   number of index levels
        •   number of index leaf pages
        •   number of unique values for index lead key
        •   clustering factor
        •   2nd lowest and 2nd highest value for index lead key


   Index statistics saved in database catalog
        • Sysindices (levels, leaves, nunique, clust)
        • Syscolumns (colmin, colmax)
        • Sysfragments (levels, clust) for fragtype = “I”


   When Update Statistics Med or High is run, index statistics are also
   collected, unless “Distributions Only” is used


Update Statistics Low – Details (cont’d)

   Instead of reading the entire index in sequence, the new feature:

        • Uses sampling
        • Each sample will go from index root page to index leaf page,
          reading one or more index leaf pages
        • Sampling is “dynamic” -- number of samples is not predetermined
        • Number of samples is determined by the quality of the samples
            – Fewer samples needed if data is evenly distributed
            – More samples needed if data distribution is skewed
            – Standard deviation among the samples is used as the
              measure of “quality”


        • Time for update statistics is not predictable up-front
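
The idea of dynamic sampling can be sketched in Python as below. This is a generic illustration of "keep sampling until the standard deviation says the estimate is stable", not the algorithm Informix uses: the stopping rule (relative standard error under a tolerance), the tolerance value, and the two synthetic data sources are all assumptions.

```python
from itertools import cycle
from math import sqrt

def sample_until_stable(draw, tolerance=0.05, min_samples=10, max_samples=10_000):
    """Draw samples until the standard error is small relative to the mean.

    draw: zero-argument callable returning one measurement, for example the
    number of keys seen on one sampled leaf page.
    Returns (estimated mean, number of samples used).
    """
    n, mean, m2 = 0, 0.0, 0.0  # Welford running mean and squared deviations
    while n < max_samples:
        x = draw()
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        if n >= min_samples:
            stderr = sqrt(m2 / (n - 1)) / sqrt(n)
            if stderr <= tolerance * abs(mean):
                break
    return mean, n

# Synthetic inputs: evenly distributed pages versus one heavy page in ten.
even = cycle([99, 100, 101])
skewed = cycle([1] * 9 + [1000])
print(sample_until_stable(lambda: next(even)))
print(sample_until_stable(lambda: next(skewed)))
```

The even source stops at the minimum number of samples, while the skewed source needs far more, which is also why the run time cannot be predicted up front.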


Update Statistics Low - Example

  Example based on internal traces




Update Statistics Low - Notes
   Review of Update statistics feature
    – 11.70.xC1 “Smart Statistics” Feature Review
       • Default: AUTO_STAT_MODE 1
       • Default: STATCHANGE 10
       • Update Statistics command, when run, is not executed for index statistics and for
         table distribution if the STATCHANGE threshold has not been met
   – Update Statistics info in database catalog tables
       • Look at ustlowts in systables
            – Updated when systables' nrows and npused are updated – this is done
              whenever update statistics command is run – STATCHANGE threshold is
              not looked at
       • Look at ustlowts in sysindices
            – Updated when index statistics are rebuilt/updated
       • Look at constr_time in sysdistrib
            – Updated when distribution statistics are rebuilt/updated
   Remember, 11.10 Feature – Statistics are collected when Index is
   created
Catalog for smarter Statistics

  systables          sysfragments
  statchange         nupdates
  statlevel          ndeletes
  ustlowts           ninserts

  (Slide legend: columns new in 11.70 vs. existing columns.)



  sysindices         sysdistrib          sysfragdist
  nupdates            nupdates           nupdates
  ndeletes            ndeletes           ndeletes
  ninserts            ninserts           ninserts
  ustbuildduration    ustbuildduration   ustbuildduration
  ustlowts            constr_time        constr_time



Questions?




Thank you


