When you execute a business task on a computer system, you create an experience. The duration of this experience is called response time. The richest and most easily obtained information about response time in the whole Oracle technology stack is available from the Oracle Database tier: Oracle's extended SQL trace data. But in almost 100% of first attempts at using trace data, people make a data collection mistake that complicates their analysis. This is the story of that mistake.
9. @CaryMillsap
A performance analyst looks at your time
consumptions to determine whether it is possible
to reduce the response time of the experience.
…And, if so, then by how much.
10. @CaryMillsap
The richest and easiest diagnostic information to
obtain in this whole technology stack is available
from the Oracle Database tier.
…Oracle’s extended SQL trace data.
11. @CaryMillsap
But in almost 100% of first attempts at using Oracle
extended SQL trace data, people make a data
collection mistake that complicates their analysis.
21. @CaryMillsap
Slow department?
Then analyze the stream of experiences.
Verdict: clerk too slow between experiences.
CALL-NAME                   DURATION      % CALLS      MEAN
--------------------------- -------- ------ ----- ---------
SQL*Net message from client      137  87.3%     7 19.571429
everything else                   20  12.7%   142  0.140845
--------------------------- -------- ------ ----- ---------
TOTAL (2)                        157 100.0%   149  1.053691
22. @CaryMillsap
THE ONE YOU’LL BE DOING
MOST OF THE TIME.
Slow application?
Then analyze each experience separately.
Verdict: app is too chatty.
CALL-NAME                   DURATION      % CALLS     MEAN
--------------------------- -------- ------ ----- --------
SQL*Net message from client        8  57.1%     4 2.000000
some of this, some of that         6  42.9%    28 0.214286
--------------------------- -------- ------ ----- --------
TOTAL (2)                         14 100.0%    32 0.437500
CALL-NAME                   DURATION      % CALLS     MEAN
--------------------------- -------- ------ ----- --------
SQL*Net message from client       11  52.4%     4 2.750000
some of this, some of that        10  47.6%   113 0.088496
--------------------------- -------- ------ ----- --------
TOTAL (2)                         21 100.0%   117 0.179487
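A profile like the ones above can be computed from a list of (call name, duration) pairs. The following is a minimal sketch in Python, not the implementation of any Method R tool: the `profile` helper is hypothetical, and the sample durations are made up to match the first table of this slide.

```python
from collections import defaultdict

def profile(calls):
    """Group (name, duration) pairs by name.

    Returns rows of (name, total, percent, call_count, mean),
    sorted by total duration, largest first.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for name, duration in calls:
        totals[name] += duration
        counts[name] += 1
    grand_total = sum(totals.values())
    rows = []
    for name in sorted(totals, key=totals.get, reverse=True):
        share = 100.0 * totals[name] / grand_total if grand_total else 0.0
        rows.append((name, totals[name], share, counts[name], totals[name] / counts[name]))
    return rows

# Hypothetical experience: 4 client round trips of 2 s each,
# plus 28 other calls that total 6 s.
calls = ([("SQL*Net message from client", 2.0)] * 4
         + [("some of this, some of that", 6.0 / 28)] * 28)
rows = profile(calls)
for name, total, share, count, mean in rows:
    print(f"{name:<27} {total:8.0f} {share:5.1f}% {count:5d} {mean:9.6f}")
```

Profiling one experience at a time, rather than the whole stream, is what makes the "too chatty" verdict visible.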
25. @CaryMillsap
Typical Oracle trace file for a connection pool
...
WAIT ... nam='SQL*Net message from client' ela= 1202689 ...
A sequence of trace lines explaining time consumption for Experience A
WAIT ... nam='SQL*Net message from client' ela= 4260917 ...
A sequence of trace lines explaining time consumption for Experience B
WAIT ... nam='SQL*Net message from client' ela= 5213365 ...
A sequence of trace lines explaining time consumption for Experience C
WAIT ... nam='SQL*Net message from client' ela= 2044420 ...
...
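To work with these waits programmatically, the first step is pulling the wait name and `ela` value out of each WAIT line. Here is a minimal sketch, assuming the line layout shown in the full trace excerpt later in this deck (`WAIT #n: nam='...' ela= ...`) and that `ela` is reported in microseconds, as that excerpt's arithmetic implies. The `parse_waits` name is hypothetical.

```python
import re

# Matches e.g.: WAIT #8: nam='SQL*Net message from client' ela= 20121507 ...
WAIT_RE = re.compile(r"^WAIT #(\d+): nam='([^']*)' ela=\s*(\d+)")

def parse_waits(lines):
    """Yield (cursor_id, wait_name, elapsed_seconds) for each WAIT line."""
    for line in lines:
        m = WAIT_RE.match(line)
        if m:
            yield int(m.group(1)), m.group(2), int(m.group(3)) / 1_000_000

sample = [
    "WAIT #8: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1313696181576631",
    "WAIT #8: nam='SQL*Net message from client' ela= 20121507 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1313696201698518",
    "CLOSE #8:c=0,e=41,dep=0,type=1,tim=1313696201698681",
]
waits = list(parse_waits(sample))
for cursor_id, name, seconds in waits:
    print(cursor_id, name, seconds)
```

Non-WAIT lines such as CLOSE are skipped; a fuller parser would handle database call lines as well.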
29. @CaryMillsap
These experiences like A, B, and C can have
SQL*Net message from client calls in them, too.
…That might dominate response times!
30. @CaryMillsap
WAIT ... nam='SQL*Net message from client' ela= 1202689 ...
stuff for Experience A
WAIT ... nam='SQL*Net message from client' ela= 342
more stuff for Experience A
WAIT ... nam='SQL*Net message from client' ela= 1492
yet more stuff for experience A
etc.
WAIT ... nam='SQL*Net message from client' ela= 4260917 ...
stuff for Experience B
WAIT ... nam='SQL*Net message from client' ela= 2928
more stuff for Experience B
etc.
WAIT ... nam='SQL*Net message from client' ela= 5213365 ...
stuff for Experience C
WAIT ... nam='SQL*Net message from client' ela= 855
more stuff for Experience C
etc.
WAIT ... nam='SQL*Net message from client' ela= 2044420 ...
31. @CaryMillsap
It’s actually a common pattern.
Behold the network-abusing, chatty app…
CALL-NAME                     DURATION      %   CALLS     MEAN      MIN      MAX
--------------------------- ---------- ------ ------- -------- -------- --------
SQL*Net message from client 200.939935  99.5% 142,520 0.001410 0.000937 0.202835
SQL*Net message to client     0.526257   0.3% 142,520 0.000004 0.000000 0.000130
FETCH                         0.439933   0.2% 142,518 0.000003 0.000000 0.001000
PARSE                         0.000000   0.0%       2 0.000000 0.000000 0.000000
EXEC                          0.000000   0.0%       2 0.000000 0.000000 0.000000
--------------------------- ---------- ------ ------- -------- -------- --------
TOTAL (5)                   201.906125 100.0% 427,562 0.000472 0.000000 0.202835
32. @CaryMillsap
CALL-NAME                     DURATION      %   CALLS     MEAN      MIN      MAX
--------------------------- ---------- ------ ------- -------- -------- --------
SQL*Net message from client   0.911041  36.5%      72 0.012653 0.000890 0.026857
SQL*Net more data to client   0.841897  33.7%   2,688 0.000313 0.000004 0.013287
FETCH                         0.744885  29.8%      70 0.010641 0.006999 0.012998
PARSE                         0.001000   0.0%       2 0.000500 0.000000 0.001000
SQL*Net message to client     0.000147   0.0%      72 0.000002 0.000001 0.000006
EXEC                          0.000000   0.0%       2 0.000000 0.000000 0.000000
--------------------------- ---------- ------ ------- -------- -------- --------
TOTAL (6)                     2.498970 100.0%   2,906 0.000860 0.000000 0.026857
…and the way it should behave.
37. @CaryMillsap
So can a trace file.
WAIT ... nam='SQL*Net message from client' ela= 1202689 ...
stuff for Experience A
WAIT ... nam='SQL*Net message from client' ela= 342
more stuff for Experience A
WAIT ... nam='SQL*Net message from client' ela= 1492
yet more stuff for experience A
etc.
WAIT ... nam='SQL*Net message from client' ela= 4260917 ...
stuff for Experience B
WAIT ... nam='SQL*Net message from client' ela= 2928
more stuff for Experience B
etc.
WAIT ... nam='SQL*Net message from client' ela= 5213365 ...
stuff for Experience C
WAIT ... nam='SQL*Net message from client' ela= 855
more stuff for Experience C
etc.
WAIT ... nam='SQL*Net message from client' ela= 2044420 ...
40. @CaryMillsap
Trace file with oceans.
Find the 2.3-sec
experience.
CALL-NAME                    DURATION      %  CALLS     MEAN      MIN       MAX
--------------------------- --------- ------ ------ -------- -------- ---------
SQL*Net message from client 31.018640  99.3% 10,003 0.003101 0.000023 20.121507
direct path read             0.110575   0.4% 10,000 0.000011 0.000004  0.020533
FETCH                        0.081993   0.3%  5,001 0.000016 0.000000  0.001000
SQL*Net message to client    0.008804   0.0% 10,003 0.000001 0.000000  0.000061
PARSE                        0.003999   0.0%      2 0.001999 0.000000  0.003999
EXEC                         0.001000   0.0%      2 0.000500 0.000000  0.001000
CLOSE                        0.000000   0.0%      2 0.000000 0.000000  0.000000
--------------------------- --------- ------ ------ -------- -------- ---------
TOTAL (7)                   31.225011 100.0% 35,013 0.000892 0.000000 20.121507
What percentage of this 2.3-sec experience is rivers?
41. @CaryMillsap
CALL-NAME                    DURATION      %  CALLS     MEAN      MIN       MAX
--------------------------- --------- ------ ------ -------- -------- ---------
direct path read             0.110575  53.6% 10,000 0.000011 0.000004  0.020533
FETCH                        0.081993  39.7%  5,001 0.000016 0.000000  0.001000
SQL*Net message to client    0.008804   4.3% 10,003 0.000001 0.000000  0.000061
PARSE                        0.003999   1.9%      2 0.001999 0.000000  0.003999
EXEC                         0.001000   0.5%      2 0.000500 0.000000  0.001000
CLOSE                        0.000000   0.0%      2 0.000000 0.000000  0.000000
--------------------------- --------- ------ ------ -------- -------- ---------
TOTAL (6)                    0.206371 100.0% 25,010 0.000008 0.000000  0.020533
Trace file with no
water at all.
Doesn’t explain the
2.3-sec experience.
42. @CaryMillsap
CALL-NAME                    DURATION      %  CALLS     MEAN      MIN       MAX
--------------------------- --------- ------ ------ -------- -------- ---------
SQL*Net message from client  2.072877  90.9% 10,001 0.000207 0.000023  0.016861
direct path read             0.110575   4.9% 10,000 0.000011 0.000004  0.020533
FETCH                        0.081993   3.6%  5,001 0.000016 0.000000  0.001000
SQL*Net message to client    0.008804   0.4% 10,003 0.000001 0.000000  0.000061
PARSE                        0.003999   0.2%      2 0.001999 0.000000  0.003999
EXEC                         0.001000   0.0%      2 0.000500 0.000000  0.001000
CLOSE                        0.000000   0.0%      2 0.000000 0.000000  0.000000
--------------------------- --------- ------ ------ -------- -------- ---------
TOTAL (7)                    2.279248 100.0% 35,011 0.000065 0.000000  0.020533
Trace file with rivers,
but no oceans.
Explains the 2.3-sec
experience exactly.
90.9% is rivers. Easy.
43. @CaryMillsap
To Oracle, it’s all just water.
It sees no difference between salt water and fresh water,
between response-time SNMFC and non-response-time SNMFC.
It’s all just SQL*Net message from client.
48. @CaryMillsap
SQL*Net message from client
if ela ≥ :b then ocean (not response time)
otherwise river (response time)
Sometimes you have to fine-tune the boundary value.
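The boundary rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the behavior of any particular tool: the `split_experiences` helper is hypothetical, the boundary of one second is an arbitrary choice, the SQL*Net message from client durations come from the trace excerpt shown earlier, and the "stuff" durations are invented.

```python
SNMFC = "SQL*Net message from client"

def split_experiences(events, boundary_us):
    """Split (name, ela_us) events into experiences at each ocean.

    An SNMFC wait with ela >= boundary_us is an ocean: it separates
    experiences and is excluded from response time. A shorter SNMFC
    wait is a river and stays inside the experience it belongs to.
    """
    experiences, current = [], []
    for name, ela_us in events:
        if name == SNMFC and ela_us >= boundary_us:
            if current:
                experiences.append(current)
            current = []
        else:
            current.append((name, ela_us))
    if current:
        experiences.append(current)
    return experiences

events = [
    (SNMFC, 1202689), ("stuff", 500), (SNMFC, 342), ("stuff", 700),
    (SNMFC, 1492), ("stuff", 900),
    (SNMFC, 4260917), ("stuff", 400), (SNMFC, 2928), ("stuff", 300),
    (SNMFC, 5213365), ("stuff", 200), (SNMFC, 855), ("stuff", 100),
    (SNMFC, 2044420),
]
experiences = split_experiences(events, boundary_us=1_000_000)
# River time (microseconds) inside each experience.
river_us = [sum(ela for name, ela in exp if name == SNMFC) for exp in experiences]
print(len(experiences), river_us)
```

With this data, any boundary between 2,928 and 1,202,689 microseconds gives the same split, which is why the boundary value sometimes needs tuning but often has a wide safe range.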
51. @CaryMillsap
For connection pooling apps, the
oceans-islands-rivers thing works pretty well.
But it’s not 100% reliable.
For example, what if you have a river that’s bigger than one of your oceans?
53. @CaryMillsap
If you can instrument your app,
it will automatically tell you
where the experience boundaries are.
55. @CaryMillsap
If you’re running code in an interactive
development environment, it’s easy:
1. activate trace;
1.1. There must be NO LATENCY here.
2. execute the code path for the experience;
2.1. There must be NO LATENCY here.
3. deactivate trace.
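One way to keep latency out of steps 1.1 and 2.1 is to issue all three steps from a single function, with nothing between them. The sketch below assumes a Python DB-API style cursor on the session being traced and assumes that session is permitted to trace itself with `dbms_session.session_trace_enable`. The `trace_experience` helper and the `RecordingCursor` stand-in are hypothetical, so the example runs without a database.

```python
ENABLE = "begin dbms_session.session_trace_enable(waits => true, binds => true); end;"
DISABLE = "begin dbms_session.session_trace_disable; end;"

def trace_experience(cursor, run_experience):
    """Trace exactly one experience on the session that owns `cursor`."""
    cursor.execute(ENABLE)          # 1. activate trace
    try:
        run_experience(cursor)      # 2. run the code path immediately
    finally:
        cursor.execute(DISABLE)     # 3. deactivate trace, with no pause

class RecordingCursor:
    """Stand-in for a real cursor; it only records what was executed."""
    def __init__(self):
        self.statements = []

    def execute(self, sql, *args):
        self.statements.append(sql)

cur = RecordingCursor()
trace_experience(cur, lambda c: c.execute("select count(*) from t"))
print(cur.statements)
```

The `try`/`finally` makes sure tracing is switched off even if the experience raises an error, so the trace file does not go on to absorb unrelated time.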
57. @CaryMillsap
This is the best thing you can do:
Instrument your application so that the
trace data explains exactly one user
response time experience.
58. @CaryMillsap
You can fix a trace file that accounts
for more time than you want.
…E.g., if you’re stuck activating trace with dbms_monitor.session_trace_enable(:sid,:serial,true,true).
67. @CaryMillsap
*** 2011-08-18 14:36:21.576
*** SESSION ID:(23.42) 2011-08-18 14:36:21.576
*** CLIENT ID:() 2011-08-18 14:36:21.576
*** SERVICE NAME:(SYS$USERS) 2011-08-18 14:36:21.576
*** MODULE NAME:(SQL*Plus) 2011-08-18 14:36:21.576
*** ACTION NAME:() 2011-08-18 14:36:21.576
WAIT #8: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1313696181576631
*** 2011-08-18 14:36:41.698
WAIT #8: nam='SQL*Net message from client' ela= 20121507 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1313696201698518
CLOSE #8:c=0,e=41,dep=0,type=1,tim=1313696201698681
=====================
PARSING IN CURSOR #7 len=352 dep=1 uid=84 oct=3 lid=84 tim=1313696201699956 hv=2904344320 ad='3e4f6d48' sqlid='f70vdzaqjtjs0'
SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS IGNORE_WHERE_CLAUSE NO_PARALLEL(SAMPLESUB) opt_param('parallel_execution_enabled', 'false') NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),:"SYS_B_0"), NVL(SUM(C2),:"SYS_B_1") FROM (SELECT /*+ NO_PARALLEL("T") FULL("T") NO_PARALLEL_INDEX("T") */ :"SYS_B_2" AS C1, :"SYS_B_3" AS C2 FROM "T" "T") SAMPLESUB
END OF STMT
PARSE #7:c=0,e=402,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1,plh=0,tim=1313696201699946
...
If you delete the SQL*Net message from client line (ela= 20121507),
then its 20.121507-second contribution to the 20.122009
seconds between calls will be unexplained.
(1,313,696,201.698681 – 0.000041) – 1,313,696,181.576631 = 20.122009
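The arithmetic is easier to check in integer microseconds, which is how the trace file records `tim`, `e`, and `ela` in this excerpt. A small worked sketch follows; the variable names are illustrative only.

```python
US = 1_000_000  # microseconds per second

tim_before = 1313696181576631  # tim on the 'SQL*Net message to client' WAIT line
tim_close = 1313696201698681   # tim on the CLOSE line (when that call ended)
e_close = 41                   # e on the CLOSE line (that call's elapsed time)
ela_snmfc = 20121507           # ela on the 'SQL*Net message from client' line

# Time between the end of the first WAIT and the start of the CLOSE call.
gap_us = (tim_close - e_close) - tim_before

# What remains of that gap once the SNMFC wait has explained its share.
unaccounted_us = gap_us - ela_snmfc

print(gap_us / US, unaccounted_us / US)
```

The gap is 20,122,009 microseconds, of which the wait explains all but 502. Delete the wait line without adjusting anything else, and the whole gap becomes unexplained time.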
68. @CaryMillsap
You can’t just delete the SQL*Net message from client line
(or set its ela value to 0). You must also subtract
20.121507 seconds from every *** line and
tim value from there to the end of the file.
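The `tim` half of that adjustment can be sketched as follows. This is a simplified illustration, not how mrcallrm or any other tool is implemented: the `remove_wait` helper is hypothetical, and it deliberately leaves `***` timestamp lines untouched, so a complete fix would still need to shift those as well.

```python
import re

TIM_RE = re.compile(r"\btim=(\d+)")
ELA_RE = re.compile(r"\bela=\s*(\d+)")

def remove_wait(lines, index):
    """Drop lines[index] (a WAIT line) and shift every later tim back by its ela.

    Limitation of this sketch: '***' timestamp lines are not adjusted.
    """
    ela = int(ELA_RE.search(lines[index]).group(1))
    out = list(lines[:index])
    for line in lines[index + 1:]:
        out.append(TIM_RE.sub(lambda m: f"tim={int(m.group(1)) - ela}", line))
    return out

lines = [
    "WAIT #8: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1313696181576631",
    "WAIT #8: nam='SQL*Net message from client' ela= 20121507 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1313696201698518",
    "CLOSE #8:c=0,e=41,dep=0,type=1,tim=1313696201698681",
]
fixed = remove_wait(lines, 1)
for line in fixed:
    print(line)
```

After the shift, the CLOSE call ends 543 microseconds after the first WAIT instead of more than 20 seconds after it, so the removed ocean no longer shows up as unexplained time.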
73. @CaryMillsap
References
http://www.slideshare.net/carymillsap/how-to-find-and-fix-your
A free online presentation about how to instrument your application so it will automatically tell you where the experience boundaries are.
http://method-r.com/blogs/company-blog/214-finding-connection-pool-response-times-with-method-r-tools
“Connection pool response times with Method R Tools (Oceans, Islands, and Rivers),” a blog post explaining the oceans-islands-rivers metaphor.
https://motdcr3.eventbrite.com
“Mastering Oracle Trace Data free online class reunion,” to be held 11:00a–12:30p CST Thursday, February 10, 2015.
http://amzn.to/173bpzg
“The Method R Guide to Mastering Oracle Trace Data,” a textbook for the 1- to 2-day course that covers Method R Corporation software and methods.
http://method-r.com/software/mrtrace
A Method R extension for Oracle SQL Developer. Method R Trace collects trace data and retrieves it for you, automatically.
http://method-r.com/software/mrtools
A set of software tools for mining and manipulating Oracle extended SQL trace data. I use mrskew to report on durations of individual experiences recorded in extended SQL trace files. I use mrcallrm to eliminate calls from my trace data. It automatically ripples the required tim and *** line changes throughout a trace file.
http://method-r.com/courses/mastering-oracle-trace-data
“Mastering Oracle Trace Data,” a 1- to 2-day course that covers Method R Corporation software and methods.