So you are young and single: you are standing in a crowded room, looking to connect with everyone who matches the criteria for your perfect partner. How do you identify and grab the details of those individuals as quickly as possible, without spending too much time talking to people you will never get on with? This presentation uses a situation we have all been familiar with to explain how the optimizer works, the steps in the optimisation process and some of the rules that help it score a match as efficiently as possible.
3. SAGE Computing Services
Customised Oracle Training Workshops and Consulting
Penny Cookson
Managing Director and Principal Consultant
Working with Oracle products since 1987
Oracle Magazine Educator of the Year 2004
www.sagecomputing.com.au
penny@sagecomputing.com.au
5. COL1 = :B1
AND COL2 >= :B2 AND COL2 < :B3
AND COL3 >= :B4
AND COL4 = :B5
…………………………………….
Col1 Col2 Col3 Col4 Col5 Col6
How many of these
are there likely to
be?
How many rows will be returned?
There are 23 million rows in this table
6. ATTRIBUTE1 = :B1
AND ATTRIBUTE2 >= :B2
AND ATTRIBUTE2 < :B3
AND ATTRIBUTE3 >= :B4
AND ATTRIBUTE4 = :B5
How many
of these are
there likely
to be?
Looking for your perfect match
There are 11 million males
in Australia
7. MARRIED = ‘N’
AND AGE >=25 AND AGE <30
AND HEIGHT >= 6ft 2 in
AND JOB=‘DBA’
How many
of these are
there likely
to be?
There are 11 million males
in Australia
8. The attribute Married
has two distinct
values Yes or No
50% 50%
We assume 50% of
each
How many people
satisfy the criteria
Married = ‘N’?
There are 11 million
males in Australia
Number of Unmarried males is
11,000,000/2 = 5,500,000
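With no histogram, the 50/50 guess above comes down to dividing the row count by the number of distinct values. The following is a minimal sketch of that arithmetic against the dictionary views. It assumes the MEN table used later in this deck and ignores nulls; the optimizer's real calculation also takes num_nulls and density into account.

```sql
-- Uniform-distribution estimate: rows per distinct value of MARRIED.
-- Assumes statistics were gathered on MEN without a histogram.
SELECT c.column_name,
       t.num_rows,
       c.num_distinct,
       ROUND(t.num_rows / NULLIF(c.num_distinct, 0)) AS est_rows_per_value
FROM   user_tables t
JOIN   user_tab_col_statistics c
       ON c.table_name = t.table_name
WHERE  t.table_name  = 'MEN'
AND    c.column_name = 'MARRIED';
```

With 11,000,000 rows and 2 distinct values, est_rows_per_value works out to 5,500,000, the same figure as on the slide.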
10. More Statistics
6 in every 10 males are
married
So - Number of Unmarried males
is 4,400,000
11. This is what we did originally
begin
dbms_stats.gather_schema_stats
(ownname=>'AUSOUG',
method_opt => 'for all columns size 1' );
end;
How can Oracle get better statistics?
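One possible answer, offered here as an assumption about the slides that follow, is to let Oracle build a histogram on the skewed column instead of forcing size 1 everywhere. The bucket count of 254 is an illustrative choice, not a value taken from the presentation.

```sql
begin
  -- Keep size 1 (no histogram) on every column except MARRIED,
  -- where a histogram records how often each value occurs.
  dbms_stats.gather_table_stats(
    ownname    => 'AUSOUG',
    tabname    => 'MEN',
    method_opt => 'for all columns size 1 for columns size 254 married');
end;
/

-- Check what kind of histogram, if any, was created
SELECT column_name, num_distinct, histogram, num_buckets
FROM   user_tab_col_statistics
WHERE  table_name  = 'MEN'
AND    column_name = 'MARRIED';
```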
15. SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*)
FROM men
WHERE married = 'N';
SELECT dbms_xplan.display_cursor
('5zyycq4y22j9g',format=>'ALLSTATS LAST')
FROM dual;
Now it gets it right
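The check on this slide compares the optimizer's estimate (E-Rows) with the rows actually returned (A-Rows). This sketch shows the same check without looking up the sql_id first: passing NULL asks display_cursor for the last statement executed in the session. In SQL*Plus this assumes SERVEROUTPUT is off, otherwise the "last statement" may be an internal call.

```sql
-- Run the statement with row-source statistics switched on
SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*)
FROM   men
WHERE  married = 'N';

-- Show estimated (E-Rows) against actual (A-Rows) for the last statement
SELECT *
FROM   TABLE(dbms_xplan.display_cursor(
               sql_id          => NULL,
               cursor_child_no => NULL,
               format          => 'ALLSTATS LAST'));
```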
16. MARRIED = ‘N’
AND AGE >=25 AND AGE <30
AND HEIGHT >= 6ft 2 in
AND JOB=‘DBA’
How many
of these are
there likely
to be?
There are 11 million males
in Australia
MARRIED = ‘N’ 40%
20. create or replace function raw_to_num(i_raw raw)
return number as m_n number;
begin
dbms_stats.convert_raw_value(i_raw,m_n);
return m_n;
end;
/
create or replace function raw_to_date(i_raw raw)
return date as m_n date;
begin
dbms_stats.convert_raw_value(i_raw,m_n);
return m_n;
end;
/
create or replace function raw_to_varchar2(i_raw raw)
return varchar2 as m_n varchar2(20);
begin
dbms_stats.convert_raw_value(i_raw,m_n);
return m_n;
end;
/
http://jonathanlewis.wordpress.com/2006/11/29/low_value-high_value/
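These helper functions are typically used to decode the RAW low_value and high_value columns that Oracle stores for each column. A short usage sketch, assuming AGE and HEIGHT are NUMBER columns on MEN as the later queries suggest:

```sql
-- Decode the stored minimum and maximum for two numeric columns
SELECT column_name,
       num_distinct,
       raw_to_num(low_value)  AS low_val,
       raw_to_num(high_value) AS high_val
FROM   user_tab_col_statistics
WHERE  table_name  = 'MEN'
AND    column_name IN ('AGE', 'HEIGHT');
```

Without a histogram, the optimizer estimates a range predicate such as age >= 25 AND age < 30 from roughly the width of the requested range divided by (high_val - low_val), so these two values drive the estimate.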
26. SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*)
FROM men
WHERE age >=25 and age < 30;
SELECT
dbms_xplan.display_cursor('6aub3fn962277',format=>'ALLSTATS LAST')
FROM dual
Not perfect but better
27. MARRIED = ‘N’
AND AGE >=25 AND AGE <30
AND HEIGHT >= 6ft 2 in (188cm)
AND JOB=‘DBA’
How many
of these are
there likely
to be?
There are 11 million males
in Australia
MARRIED = ‘N’ 40%
AND AGE >=25 AND AGE <30 8.2%
28. SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*)
FROM men
WHERE height >= 188;
SELECT dbms_xplan.display_cursor('bf4wbtgb9zptq',
format=>'ALLSTATS LAST')
FROM dual;
How many men are >= 188cm? (We have gathered a histogram.)
29. MARRIED = ‘N’
AND AGE >=25 AND AGE <30
AND HEIGHT >= 6ft 2 in (188cm)
AND JOB=‘DBA’
How many
of these are
there likely
to be?
There are 11 million males
in Australia
MARRIED = ‘N’ 40%
AND AGE >=25 AND AGE <30 8.2%
AND HEIGHT >= 6ft 2 in (188cm) 2.9%
30. 11000000*40/100 *8.2/100 * 2.9/100 =10,463
MARRIED = ‘N’ 40%
AND AGE >=25 AND AGE <30 8.2%
AND HEIGHT >= 6ft 2 in (188cm) 2.9%
SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*)
FROM men
WHERE married = 'N'
AND age >=25 AND age < 30
AND height >= 188
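The calculation on this slide multiplies the individual selectivities together, which treats the three predicates as independent. The worked arithmetic, step by step, using only the figures quoted above:

```sql
-- Independence assumption: combined selectivity = product of the parts
SELECT 11000000 * 0.40                        AS unmarried,           -- 4,400,000
       11000000 * 0.40 * 0.082                AS unmarried_25_to_29,  -- 360,800
       ROUND(11000000 * 0.40 * 0.082 * 0.029) AS and_188cm_or_taller  -- 10,463
FROM   dual;
```

If the columns are correlated (for example, age and height), the true count can be far from this product.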
33. 5.3% of males are >= 6ft 2inches
Are these
statistics out of
date?
- do select of
last_analyzed
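A quick way to do that check, sketched against the standard dictionary views for the MEN table. STALE_STATS is only populated once Oracle has flushed its table-monitoring information, so a NULL there is not proof that the statistics are fresh.

```sql
-- When were the table statistics last gathered, and does Oracle consider them stale?
SELECT table_name, num_rows, last_analyzed, stale_stats
FROM   user_tab_statistics
WHERE  table_name = 'MEN';

-- The same check per column, including whether a histogram exists
SELECT column_name, last_analyzed, histogram
FROM   user_tab_col_statistics
WHERE  table_name = 'MEN'
ORDER  BY column_name;
```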
34. 5.3% of males are >= 6ft 2inches
Note the
correlation
between
gender, age and
height
35. SELECT c.column_name, t.num_rows "Number of Rows",
c.num_distinct "Distinct Values",
c.histogram "Histogram"
FROM user_tables t, user_tab_col_statistics c
WHERE t.table_name = c.table_name
AND t.table_name = 'MEN'
39. SELECT c.column_name, t.num_rows "Number of Rows",
c.num_distinct "Distinct Values",
c.histogram "Histogram"
FROM user_tables t, user_tab_col_statistics c
WHERE t.table_name = c.table_name
AND t.table_name = 'MEN'
43. SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*)
FROM men
WHERE married = 'N'
AND age >=25 AND age < 30
AND height >= 188
When we combine range checks it gets it wrong
44. 10053 trace file – it's not looking at the extended stats
ALTER SESSION SET EVENTS '10053 trace name context forever';
EXPLAIN PLAN FOR …………;
ALTER SESSION SET EVENTS '10053 trace name context off';
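To read the 10053 output you first need to find the trace file. This sketch assumes 11g or later and SELECT access to v$diag_info. The identifier text is arbitrary, and it should be set before the event is switched on so that it appears in the file name.

```sql
-- Optional: tag the trace file name so it is easy to spot
ALTER SESSION SET tracefile_identifier = 'men_10053';

-- Full path of the current session's trace file
SELECT value
FROM   v$diag_info
WHERE  name = 'Default Trace File';
```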
50. begin
dbms_spd.flush_sql_plan_directive;
end;
Clear any existing SQL Plan Directives
SELECT d.directive_id, d.type, d.state, d.reason, d.created,
o.object_name, o.subobject_name, o.notes, o.owner
FROM dba_sql_plan_directives d, dba_sql_plan_dir_objects o
WHERE o.directive_id = d.directive_id
AND o.owner NOT IN ('XDB','SYS','SYSTEM')
ORDER BY directive_id desc;
begin
dbms_spd.drop_sql_plan_directive(3579438123315094543);
end;
51. ALTER SYSTEM FLUSH shared_pool;
ALTER SESSION SET statistics_level = ALL;
Clear any existing plans and gather plan
statistics
55. UNMARRIED = ‘Y’
AND AGE >=25 AND AGE <30
AND HEIGHT >= 6ft 2 in (188cm)
AND JOB=‘DBA’
How many
of these are
there likely
to be?
There are 11 million males
in Australia
UNMARRIED = ‘Y’ 40%
AND AGE >=25 AND AGE <30 8.2%
AND HEIGHT >= 6ft 2 in (188cm) 2.9%
We have no idea what percentage of males are DBAs
57. Now we know how many matches to expect
what is the best way to get to them
58. Two main types:
Pretty cruisy really – almost anyone will do
Really very picky – he must be just right
60. The Oracle approach to
Pretty cruisy really – almost anyone will do
SELECT COUNT(*)
FROM men
WHERE age >=20 AND age < 50
AND married = 'N'
AND job != 'LAWYER'
74. SELECT i.index_name, i.distinct_keys, i.num_rows,
i.clustering_factor, t.blocks
FROM user_indexes i, user_tables t
WHERE t.table_name = i.table_name
AND t.table_name = 'MEN';
77. CREATE TABLE men2
AS SELECT * FROM men
ORDER BY IQ;
CREATE INDEX MEN2_IQ_N4 ON men2(iq);
begin
dbms_stats.gather_table_stats(ownname=>'AUSOUG',
tabname=>'MEN2', estimate_percent => null,
cascade=>true,
method_opt =>
'for all columns size 1 for columns size 2000 iq ' );
end;
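Rebuilding the table in IQ order should change the clustering factor of the IQ index. The sketch below compares the two tables side by side. It assumes MEN also has an index whose column is IQ; the name of that index does not appear in these slides, so the query finds it by column instead.

```sql
-- Compare how well each IQ index is clustered against its table
SELECT i.table_name,
       i.index_name,
       i.clustering_factor,
       t.blocks,
       t.num_rows
FROM   user_indexes i
JOIN   user_ind_columns ic
       ON ic.index_name = i.index_name
JOIN   user_tables t
       ON t.table_name = i.table_name
WHERE  i.table_name IN ('MEN', 'MEN2')
AND    ic.column_name = 'IQ'
ORDER  BY i.table_name;
```

A clustering factor close to blocks means rows with neighbouring key values sit in the same blocks; a value close to num_rows means they are scattered and index access costs more.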
86. Sorting
With B*Tree indexes
SELECT id, surname, firstname
FROM men
WHERE married = 'N'
AND age_range = '25-29'
AND height = 188
AND iq = 120
AND job= 'DBA';
87. For each of the three tables:
access conditions
number of rows?
access method?
Identify each join path
How many rows am I likely to get?
What is the best join method?
All I have talked about so far is Access to one table
– how does it JOIN them together ?
88. For each of the three tables:
access conditions
number of rows?
access method?
1
2
3
89. For each of the three tables:
access conditions
number of rows?
access method?
1
2 3
90. For each of the three tables:
access conditions
number of rows?
access method?
1
2 3
for each join path
for each join – what is the best JOIN METHOD?
91. Join methods – Nested Loop with index
1 2
A.COL1 = B.COL2
A B
COL1 = 1
B.COL2
index
ACCESS
ROWS
WHERE
COL2 = 1
COL2 = 1
COL2 = 1
COL2 = 1
98. 0 rows
ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ = 5;
SELECT COUNT(*)
FROM men
WHERE married = 'N'
AND age < 12
AND height >= 188
This is even worse
99. SELECT /*+ INDEX_COMBINE(MEN MEN_AGE_N1,MEN_HEIGHT_N1) */ COUNT(*)
FROM men
WHERE married = 'N'
AND age < 12
AND height >= 188
0 rows
Better
100. SELECT COUNT(surname)
FROM men
WHERE height between 159.4 AND 160
26188 rows
SELECT /*+ FULL(MEN) */ COUNT(surname)
FROM men
WHERE height between 159.4 AND 160
101. 10,299,997 rows
SELECT COUNT(surname), COUNT(start_date)
FROM men m, events e
WHERE e.men_id (+) = m.id
AND m.iq = 156
SELECT /*+ USE_HASH(M,E) */ COUNT(surname), COUNT( start_date)
FROM men m, events e
WHERE e.men_id (+) = m.id
AND m.iq = 156
WE KNOW – EXPERIENCE/RESEARCH
ORACLE KNOWS?
DEV / DBA GAP
NUMBER OF OCCURRENCES
CUMULATIVE
15 BYTES -> PAD -> TREAT AS HEX -> CONV TO DEC -> ROUND -> ENDPOINT VALUE
Value:
Take the first 15 bytes of the string (after padding the string with nulls (for varchar2) or spaces (for char))
Treat the result as a hexadecimal number, and convert to decimal
Round to 15 significant digits and store as the endpoint_value
If duplicate rows appear, store the first 32 characters (increased to 64 for 12c) of each string as the endpoint_actual_value
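The steps above can be sketched in a single query. This is an approximation under stated assumptions: a single-byte character set, a VARCHAR2 value (so null padding), and a result of 36 decimal digits, so that rounding at -21 leaves 15 significant digits. The literal 'DBA' is just the example value from this deck.

```sql
-- Pad to 15 bytes, read the bytes as one hexadecimal number,
-- then round to (approximately) 15 significant digits.
SELECT ROUND(
         TO_NUMBER(
           RAWTOHEX(UTL_RAW.CAST_TO_RAW(RPAD('DBA', 15, CHR(0)))),
           'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'),  -- 30 hex digits = 15 bytes
         -21) AS approx_endpoint_value
FROM   dual;
```

Because of the rounding, strings that share roughly their first six characters can end up with the same endpoint_value, which is why endpoint_actual_value is stored as well.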
WHAT ORACLE THINKS – IS IT RIGHT?
REVERSE
to_char(endpoint_value,'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX') -> 'X' is the hexadecimal format element
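Going in the reverse direction, the hexadecimal form of endpoint_value can be turned back into characters. This is a sketch that assumes a histogram exists on a VARCHAR2 column called JOB in MEN and a single-byte character set. Only about the first six characters survive the rounding, so just the first 12 hex digits are decoded.

```sql
-- Recover the leading characters of each histogram endpoint
SELECT endpoint_number,
       endpoint_value,
       UTL_RAW.CAST_TO_VARCHAR2(
         HEXTORAW(
           SUBSTR(LTRIM(TO_CHAR(endpoint_value,
                                'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX')), 1, 12))
       ) AS leading_chars,
       endpoint_actual_value
FROM   user_tab_histograms
WHERE  table_name  = 'MEN'
AND    column_name = 'JOB'
ORDER  BY endpoint_number;
```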
GATHER PLAN STATISTICS
ALTER SESSION SET STATISTICS_LEVEL = ALL
HYBRID v 12
SQUEEZE BUCKETS
Hybrid – new in v12 (also 2048 buckets, not 254)
To handle “almost popular” values
Endpoint for each “popular value” is repeat count
SOME OF 1s WOULD NOT HAVE MADE IT IN HEIGHT-BALANCED
TALL 6 YEAR OLDS
- DON’T ASSUME OPTIMIZER KNOWS WHAT WE KNOW
EXTENDED STATS
11
HYBRID
TOO MANY FOR FREQUENCY
COMMON FOR EXTENDED - COMBINATIONS
MAJORITIES GOOD
MINORITIES NOT
10053 TRACE
DEFAULT DENSITY
MAJORITY
EXTENDED STATS NOT USED
Not used stats on 3 columns
Create virtual column – cannot use in extended statistics. Well, yes you can if you create an index first (from 11.2.0.3) or persist it, but it didn't seem to use it anyway. (For 3? How about 2?)
begin
dbms_stats.gather_table_stats(ownname=>'AUSOUG',
tabname=>'MEN',
method_opt =>
'for all columns size 1 for columns size auto married height, for columns size 2000 age age_range (age_range, married) ' );
end;
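The column group (age_range, married) named in the method_opt above can also be created explicitly and then checked in the dictionary. This is a minimal sketch using the same owner and table; if the extension already exists, the create call may raise an error rather than return the name.

```sql
-- Create the column group; the function returns the system-generated extension name
SELECT dbms_stats.create_extended_stats(
         ownname   => 'AUSOUG',
         tabname   => 'MEN',
         extension => '(AGE_RANGE, MARRIED)') AS extension_name
FROM   dual;

-- Confirm which extensions exist on the table
SELECT extension_name, extension
FROM   user_stat_extensions
WHERE  table_name = 'MEN';
```

Statistics for the new column group are only populated the next time statistics are gathered on the table.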
Since it doesn’t help I have removed the stats on this
WTF?????
11g
12 MORE + PERSISTENT
SECOND TIME
IS_REOPTIMIZABLE
NEW CHILD CURSOR
ORACLE FIX IT?
CLEAN UP
MORE CLEAN UP
SAME RESULTS
NO CLUE
ASK
SAMPLE
SUMMARY:
GOOD STATS
HISTOGRAMS
EXTENDED STATS
CARD FEEDBACK
DYNAMIC
BACK TO ANALOGY
LAWYER
READ THEM ALL
FILTER OUT THE ONES WE WANT
GO TO SOURCE WHICH HELPS LOCATE THEM
SELECTION CRITERIA
MOBILES
GO TO SOURCE THAT LOCATES ROWS THAT MATCH
MODIFIED COLUMNS _ LYING
12c BATCHED
/*+ NO_BATCH_TABLE_ACCESS_BY_ROWID(i)*/
alter session set "_optimizer_batch_table_access_by_rowid"=FALSE;
access rows in block order to improve the clustering and reduce the number of times that the database must access a block.
Cannot use if using index for sort
?whole presentation on this
?effect on clustering factor
HOW DECIDE
CUT OFF?
SERIOUSLY TALL DEVELOPER
DEVELOPER PLAYING FOR WILDCATS - STRETCHING
IQ AS EXAMPLE
10043 TRACE
BIT LESS - INDEX
SINGLE BLOCK READ
CREATE WELL CLUSTERED
NOT SUGGESTING REORDER ALL
SUMMARY – FULL / INDEXES
YOUR CONTROL
READ ONLY
EQUALITY
STAND UP
- DBA
- SIT NOT 25-29
- SIT HEIGHT < 6FT?
EQUALITIES
NOT RANGE
HOW PUTS TOGETHER
SEE IN 10053
NUMBER OF PATHS
N FACTORIAL
NOT ALL
HOW
SAME STORY – HOW MANY
TRANSACTIONAL
BUT NOT ONE TO ONE
HASH JOIN
MORE ROWS – WORTH FULL SCAN
WHERE?
SUMMARY
Data dependency issue
Minority values
Does not use histogram on events for join
Iq 156 = record 2 who has