2. Who am I
● Rick van Ek, rick.v.ek@xs4all.nl
http://nl.linkedin.com/in/rickvek
● Working with Oracle products since 1992
● Self-employed since 1996, Van Ek IT Consultancy BV
● Oracle database
● Baan IV software
● WebLogic (2012)
● Married, two teenagers, a girl and a boy.
3. Agenda
● Definition of a latch.
● What is a latch?
● What does a latch look like?
● When is it used?
● Properties of a latch.
● What information is available about a latch?
● Latch contention
● Demo?
4. Definition of a latch by Oracle
Latches are simple, low-level serialization mechanisms that coordinate multiuser access to shared data structures,
objects, and files. Latches protect shared memory resources from corruption when accessed by multiple processes.
Specifically, latches protect data structures from the following situations:
● Concurrent modification by multiple sessions
● Being read by one session while being modified by another session
● Deallocation (aging out) of memory while being accessed
Typically, a single latch protects multiple objects in the SGA. For example, background processes such as DBWn and
LGWR allocate memory from the shared pool to create data structures. To allocate this memory, these processes use a
shared pool latch that serializes access to prevent two processes from trying to inspect or modify the shared pool
simultaneously. After the memory is allocated, other processes may need to access shared pool areas such as the library
cache, which is required for parsing. In this case, processes latch only the library cache, not the entire shared pool.
Unlike enqueue latches such as row locks, latches do not permit sessions to queue. When a latch becomes available, the
first session to request the latch obtains exclusive access to it. Latch spinning occurs when a process repeatedly requests
a latch in a loop, whereas latch sleeping occurs when a process releases the CPU before renewing the latch request.
Typically, an Oracle process acquires a latch for an extremely short time while manipulating or looking at a data
structure. For example, while processing a salary update of a single employee, the database may obtain and release
thousands of latches. The implementation of latches is operating system-dependent, especially in respect to whether and
how long a process waits for a latch.
An increase in latching means a decrease in concurrency. For example, excessive hard parse operations create
contention for the library cache latch. The V$LATCH view contains detailed latch usage statistics for each latch,
including the number of times each latch was requested and waited for.
5. What is a latch?
● A latch is a locking mechanism.
● It controls access to resources in the SGA, such as the library cache and the database buffers.
● It keeps the information in shared objects consistent.
● It is extremely fast and has no intelligence and no queuing (nanoseconds).
● A latch is roughly 100 to 200 bytes in size.
● For memory objects, readers can block writers and vice versa.
8. When is it used?
In principle, a latch is used whenever resources in the SGA are needed. There are
therefore many different kinds of latches (see v$latch).
Long operations use a two-phase approach, i.e.:
Get latch, pin buffer, release latch, make changes, get latch, unpin buffer, release latch.
9. Properties of a latch.
● Latches are held for the period in which a memory structure is being updated.
● They have an extremely short lifetime, on the order of nanoseconds.
● They are atomic, i.e. implemented with “test and set” or “compare and swap” CPU instructions.
● Because acquiring a latch is a single-instruction operation, the latch is guaranteed to belong to the process that obtained it.
10. Properties of a latch.
● A latch request does not start with a “sleep”; the process keeps trying to get the lock by “spinning” (a few thousand attempts).
● After spinning, the process sleeps for a while (the sleep time varies).
● While spinning, the process stays on the same CPU; a context switch would take too long.
● There is no intelligent behaviour; there is no time for it.
● There is no queue: when a latch is released, it goes to whoever grabs it first (a mob of waiters).
● If the “holder” no longer exists but the latch is still held, PMON cleans it up.
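The spin-then-sleep behaviour described on this slide can be sketched in a few lines of Python. This is a toy model, not Oracle's implementation: the `SpinLatch` class, its `spin_count` argument (a stand-in for `_SPIN_COUNT`), the sleep times and the `demo` helper are all assumptions made for illustration, and a `threading.Lock` stands in for the atomic test-and-set CPU instruction.

```python
import threading
import time


class SpinLatch:
    """Toy latch: spin on a test-and-set, then sleep and retry (no queue)."""

    def __init__(self, spin_count=2000):
        self._held = False                # True while some thread holds the latch
        self._guard = threading.Lock()    # stands in for the atomic CPU instruction
        self.spin_count = spin_count      # hypothetical stand-in for _SPIN_COUNT
        self.gets = 0
        self.misses = 0
        self.sleeps = 0

    def _test_and_set(self):
        # Atomically set the flag and report whether it was free before.
        with self._guard:
            was_held = self._held
            self._held = True
            return not was_held

    def get(self):
        missed = False
        sleeps = 0
        delay = 0.001                     # assumed first sleep of 1 ms
        while True:
            for _ in range(self.spin_count):
                if self._test_and_set():
                    with self._guard:     # record statistics for this get
                        self.gets += 1
                        self.misses += int(missed)
                        self.sleeps += sleeps
                    return
                missed = True
            time.sleep(delay)             # give up the CPU, then spin again
            sleeps += 1
            delay = min(delay * 2, 0.05)  # sleep a little longer each time

    def release(self):
        with self._guard:
            self._held = False


def demo(threads=4, rounds=200):
    latch = SpinLatch(spin_count=100)
    counter = [0]                         # the protected "memory structure"

    def worker():
        for _ in range(rounds):
            latch.get()
            counter[0] += 1
            latch.release()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return counter[0], latch.gets
```

Calling `demo()` runs four threads that each update the shared counter 200 times under the latch. Whoever wins the test-and-set first gets the latch, which mirrors the "mob of waiters" behaviour rather than a queue.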
11. Properties of a latch.
● A latch can be requested in two modes: “willing to wait” and “immediate”.
● An “immediate” request does not wait but goes looking for free child latches.
● Latches operate at instance level (relevant for RAC).
● The implementation of latches ensures that no deadlock can occur.
● Two flavours: exclusive and shared.
● A latch covers 32 or more hash buckets.
12. Latch information.
● V$LATCH shows aggregate latch statistics for both
parent and child latches.
select latch#
, level#
, name
, gets
, misses
, sleeps
, immediate_gets
, immediate_misses
, wait_time -- "Wait microsec"
from v$latch ;
● Information from X$KSLLTR
13. Latch information.
● V$LATCH_PARENT shows aggregate latch statistics for parent latches
● Information from X$KSLLTR_PARENT
● V$LATCH_CHILDREN shows aggregate latch statistics for child latches
● Information from X$KSLLTR_CHILDREN
14. Latch information.
● V$LATCHHOLDER This view contains
information about the current latch holders.
– PID NUMBER Identifier of the process holding
the latch
– SID NUMBER Identifier of the session that owns
the latch
– LADDR RAW(4 | 8) Latch address
– NAME VARCHAR2(64) Name of the latch being
held
– GETS NUMBER Number of times that the latch
was obtained in either wait mode or no-wait
mode
● Based on X$KSUPRLAT
15. Latch information.
● What does the latch hold (protect)? E.g. cache buffers chains
● v$latch_children gives you addr
● x$bh gives you hladdr, file#, dbablk, state, TCH
● v$latch_children.addr = x$bh.hladdr
● TCH = touch count (incremented at most once every 3 seconds)
16. Misleading information.
● There is a pitfall: a latch lives on a nanosecond time scale.
● The information (TCH) in x$bh is refreshed at most once every 3 seconds.
● E.g. a block accessed once every 3 seconds for 24 hours:
TCH = 28800 (86400/3 = 28800)
● A block accessed countless times for 2 seconds, every 10 seconds, for 24 hours (each burst raises TCH by 1):
TCH = 8640 (86400/10 = 8640)
● The TCH counter is used by the LRU process
17. What is contention?
● If a process requests a latch that is already held, it “spins” and tries again.
● The maximum number of spins is set by _SPIN_COUNT (depends on the CPU count).
● After that it waits for a few hundredths of a second and tries again.
● After every attempt the wait time increases a little.
● CPU utilization rises during this process.
● “Latch contention is a symptom, not a root cause!”
18. Specific latch contention.
● Redo copy/redo allocation latch
● Badly configured redo log files/buffers.
● Library cache latch
● Literals instead of binds
● Cache buffers chains latch
● Hot blocks.
● Shared pool latch.
● Overly large shared pool and/or no (or too small) reserved area.
19. How do you identify latch contention?
● Ratio-based identification.
● "willing-to-wait" hit ratio = (GETS - MISSES)/GETS
● "no wait" hit ratio = (IMMEDIATE_GETS - IMMEDIATE_MISSES)/IMMEDIATE_GETS
● See also AWR/Spotlight/Lab128 etc.
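As a small worked example of the two ratios on this slide, the sketch below applies them to one row of latch statistics. The function names and the sample numbers are made up for illustration; only the formulas come from the slide. A latch with no gets is reported as `None` instead of dividing by zero.

```python
def hit_ratio(gets, misses):
    """Return (gets - misses) / gets, or None when there were no gets."""
    if gets == 0:
        return None
    return (gets - misses) / gets


def latch_hit_ratios(row):
    """Compute both hit ratios for one V$LATCH-like row (a plain dict)."""
    return {
        "name": row["name"],
        "willing_to_wait": hit_ratio(row["gets"], row["misses"]),
        "no_wait": hit_ratio(row["immediate_gets"], row["immediate_misses"]),
    }


# Hypothetical sample row; the numbers are not taken from a real system.
sample = {
    "name": "cache buffers chains",
    "gets": 1000,
    "misses": 10,
    "immediate_gets": 0,
    "immediate_misses": 0,
}
```

For the sample row, `latch_hit_ratios(sample)` gives a willing-to-wait ratio of 0.99 and no value for the no-wait ratio, because that latch was never requested in immediate mode.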
20. How do you identify latch contention?
● Wait-interface-based techniques
● Measure the impact of latches on your overall performance.
● Look at how much time is spent waiting for a latch.
– v$system_event
– v$sysstat
– v$latch
21. Who holds this latch?
● Latch contention is a symptom, so:
● Follow/understand the process
● v$latchholder => sid, pid
● v$session => sql_address
● v$sqlarea => sql_text
● v$latch_children => addr
● X$BH => hladdr
● Use tools:
● detection
– AWR
– Spotlight
– lab128
● investigation
– Latchprof / latchprofx (session)
22. Latch investigation
● Latchprof
● Based on v$ views
● Latchprofx
● Based on x$ views
● More detail
● Requires access to the x$ views
● Parameter: _ultrafast_latch_statistics
23. Latch investigation
● Parameter 1 specifies which columns from V$LATCHHOLDER to report and
group by. In the case below I just want to report latch holds by latch name (and not
even break it down by SID for starters).
● Parameter 2 specifies which SIDs to monitor. In the case below, I am interested in
any SID which holds a latch (%).
● Parameter 3 specifies which latches to monitor. This can be set either to latch name
or latch address in memory. All latches (%).
● Parameter 4 specifies how many times to sample V$LATCHHOLDER. The
sampling speed depends on your server CPU/memory bus speed and the value of
processes parameter. You should start from lower number like 1000 and adjust it so
that LatchProf would complete its sampling in a couple of seconds, and that is
usually enough for diagnosing ongoing latch contention problems. You shouldn't
keep sampling for long periods since LatchProf runs constantly on the CPU.
24. Latch investigation
● Name - Latch name
● Held - During how many samples out of total samples (100000) the
particular latch was held by somebody
● Gets - How many latch gets against that latch were detected during
LatchProf sampling
● Held % - How much % of time was the latch held by somebody during the
sampling. This is the main column you want to be looking at in order to see
who/what holds the latch the most (the latchprof output is reverse-ordered by that
column)
● Held ms - How many milliseconds in total was the latch held during the
sampling
● Avg hold ms - Average latch hold time in milliseconds (normally latches are held
from a few to few hundred microseconds)
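To make the relationship between these columns concrete, here is a small arithmetic sketch. The formulas are an assumption derived only from the column descriptions above, not taken from the LatchProf script itself, and the function name, the `elapsed_ms` argument and the sample numbers are hypothetical.

```python
def latch_profile(held_samples, total_samples, gets, elapsed_ms):
    """Derive the percentage and timing columns from raw sample counts.

    Assumption: the share of samples in which the latch was seen held
    approximates the share of the sampling period during which it was held.
    """
    held_fraction = held_samples / total_samples
    held_pct = 100.0 * held_samples / total_samples
    held_ms = elapsed_ms * held_fraction
    avg_hold_ms = held_ms / gets if gets else None
    return {
        "held_pct": held_pct,
        "held_ms": held_ms,
        "avg_hold_ms": avg_hold_ms,
    }


# Hypothetical run: 100000 samples taken in 2 seconds, latch seen held in 2500.
example = latch_profile(held_samples=2500, total_samples=100000,
                        gets=500, elapsed_ms=2000.0)
```

With these made-up numbers the latch was held 2.5% of the time, 50 ms in total, and 0.1 ms (100 microseconds) per get, which sits inside the "few to few hundred microseconds" range mentioned above.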
25. Conclusion
● Latch contention and CPU utilization go together
● Can be caused by CPU starvation.
● The latch holder is the path to the source
● Can also be used to detect hot blocks.
● Shows the impact of using literals.
● Affects scalability
● A lot of information, but very scattered.
27. Mutex in brief
● Successor to the latch
● Even smaller and faster
● Stores even less information
● Introduced in Oracle 10g
● Every hash bucket has its own mutex
● Scales better
34. What does a latch look like?
Memory build-up:
- Arrays
No addressing
Fixed sizes
Segmented arrays - dynamic allocation
- hold address of next in list
- Pointers
Memory location
Hold the address of an interesting piece of memory
- Linked lists
List of associated data
Varying in shape/size
Frequently/heavily used
Doubly linked lists - forward address
- backward address
- Hash table
Hash value (bucket)
Different values can hash to the same bucket
Not many items hash to the same bucket
Hash algorithm spreads data evenly
An object always hashes to the same bucket
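The hash table and linked list ideas above can be tied together in a short sketch: a block address hashes to a bucket, each bucket holds a chain of buffers, and one latch covers a group of buckets. Everything here is illustrative. The bucket count, the hash function and the class name are assumptions, and the figure of 32 buckets per latch is simply the lower bound mentioned on the properties slide.

```python
from collections import defaultdict

N_BUCKETS = 1024          # hypothetical number of hash buckets
BUCKETS_PER_LATCH = 32    # assumed: one latch covers 32 buckets


def bucket_for(file_no, block_no):
    # Toy hash: the same block address always lands in the same bucket.
    return (file_no * 1_000_003 + block_no) % N_BUCKETS


def latch_for(bucket):
    # Consecutive groups of buckets share one latch.
    return bucket // BUCKETS_PER_LATCH


class ToyBufferCache:
    def __init__(self):
        # bucket number -> chain (list) of block addresses
        self.chains = defaultdict(list)

    def add(self, file_no, block_no):
        self.chains[bucket_for(file_no, block_no)].append((file_no, block_no))

    def blocks_under_latch(self, latch_no):
        # Every block on every chain that this latch protects.
        first = latch_no * BUCKETS_PER_LATCH
        blocks = []
        for bucket in range(first, first + BUCKETS_PER_LATCH):
            blocks.extend(self.chains.get(bucket, []))
        return blocks
```

Two different blocks can share a bucket, and many buckets share a latch, so contention on one latch does not by itself say which block is the hot one.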
36. When is it used?
Since “doing something” with the buffer content can take a relatively long time, Oracle
often adopts a two-step strategy to latching so that it doesn’t have to hold a latch
while working. There are some operations that can be completed while holding the
latch, but Oracle often uses the following strategy:
1. Get the latch.
2. Find and pin the buffer.
3. Drop the latch.
4. Do something with the buffer content.
5. Get the latch.
6. Unpin the buffer.
7. Drop the latch.
By Jonathan Lewis.
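The two-step strategy above can be sketched as follows. This is a simplified model under stated assumptions: `ToyBufferChain`, `ToyBuffer` and `change_block` are hypothetical names, a `threading.Lock` plays the role of the latch, and the pin is reduced to a counter (real pins also carry a mode, which this sketch ignores).

```python
import threading


class ToyBuffer:
    def __init__(self, data):
        self.data = data
        self.pin_count = 0


class ToyBufferChain:
    def __init__(self):
        self.latch = threading.Lock()   # stands in for the latch on this chain
        self.buffers = {}               # block id -> ToyBuffer
        self.trace = []                 # records the order of the steps

    def change_block(self, block_id, change):
        # Steps 1-3: get the latch, find and pin the buffer, drop the latch.
        with self.latch:
            buf = self.buffers[block_id]
            buf.pin_count += 1
            self.trace.append("pin")

        # Step 4: the slow part runs without holding the latch.
        buf.data = change(buf.data)
        self.trace.append("change")

        # Steps 5-7: get the latch, unpin the buffer, drop the latch.
        with self.latch:
            buf.pin_count -= 1
            self.trace.append("unpin")
        return buf.data
```

The point of the design is that the latch is held only for the two short bookkeeping steps, while the pin keeps the buffer from being removed during the longer change.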
38. Properties of a latch.
No intelligence is needed for multiple CPUs, because spinning keeps the process on
the same CPU.
The sleep between spins depends on the number of CPUs.
39. Properties of a latch.
An immediate request looks straight away for another path to obtain the lock after all.
Exclusive really means exclusive: only one user/waiter can hold the latch.
40. Latch information.
V$LATCH
Shows aggregate latch statistics for both parent and child latches, grouped by latch name. Individual
parent and child latch statistics are broken down in the views:
V$LATCH_PARENT
V$LATCH_CHILDREN
Key information in these views is:
GETS - Number of successful willing-to-wait requests for a latch.
MISSES - Number of times an initial willing-to-wait request was unsuccessful.
SLEEPS - Number of times a process waited for a requested latch after an initial willing-to-wait
request.
IMMEDIATE_GETS - Number of successful immediate requests for each latch.
IMMEDIATE_MISSES - Number of unsuccessful immediate requests for each latch.
V$LATCHNAME
Contains the decoded latch names for the latches shown in V$LATCH.
Oracle versions might differ in the latch# assigned to the existing latches. To obtain the information
for a specific version, query as follows:
column name format a40 heading 'LATCH NAME'
select latch#, name from v$latchname;
V$LATCHHOLDER
Contains information about the current latch holders.
(Metalink [ID 22908.1])
43. Latch information.
This latch has a memory address, identified by the ADDR column.
SELECT
addr,
sleeps
FROM
v$latch_children c,
v$latchname n
WHERE
n.name='cache buffers chains' and
c.latch#=n.latch# and
sleeps > 100
ORDER BY sleeps
/
Use the value in the ADDR column joined with the X$BH table to identify the blocks
protected by this latch. For example, given the address
(V$LATCH_CHILDREN.ADDR) of a heavily contended latch, this query returns the file and
block numbers:
SELECT file#, dbablk, class, state, TCH
FROM X$BH
WHERE HLADDR='address of latch';
X$BH.TCH is a touch count for the buffer. A high value for X$BH.TCH indicates a hot
block.
44. Misleading information.
But still, it would not be always reliable for another reason – touchcounts are incremented only after 3
seconds have passed since last increment! This factor has been coded in to avoid situation such a short
but crazy nested loop join hammering a single buffer hundreds of thousands of times in few seconds
and then finishing. The buffer wouldn’t be hot anymore but the touchcount would be hundreds of
thousands due a single SQL execution. So, unless 3 seconds (of SGA internal time) has passed since
last TCH update, the touchcounts would not be increased during buffer access.
This time is controlled by SGA variable kcbatt_ by the way:
SQL> oradebug dumpvar sga kcbatt_
ub4 kcbatt_ [3C440F4, 3C440F8) = 00000003
This 3-second delay leaves us in the following situation, let say there are 2 blocks protected by a CBC child
latch:
One block has been accessed once every 3 seconds for 24 hours in a row. A block accessed once per 3
seconds is definitely not a hot block, but its touchcount would be around 28800 (86400 seconds per 24
hours / 3 = 28800).
And there is another block which is accessed crazily for 2 seconds in a row and this happens every 10
seconds. 2 seconds of consecutive access would increase the touchcount by 1. If such access pattern
has been going on every 10 seconds over last 24 hours, then the touch count for that buffer would be
86400 / 10 = 8640.
In the first case we can have a very cold block with TCH = 28800 and in second case a very hot block with
TCH = 8640 only and this can mislead DBAs to fixing the wrong problem.
(Tanel Poder)
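The two access patterns in this example can be replayed with a small simulation. It is a sketch under assumptions: the touch count is modelled as "increment only if at least 3 seconds have passed since the last increment", times are in milliseconds, and the hot block is given 100 accesses per 2-second burst (an arbitrary figure chosen to keep the run short). All names are hypothetical.

```python
DAY_MS = 86_400_000   # 24 hours in milliseconds


def touch_count(access_times_ms, min_interval_ms=3000):
    """Count increments for accesses given in ascending time order."""
    tch = 0
    last_increment = None
    for t in access_times_ms:
        if last_increment is None or t - last_increment >= min_interval_ms:
            tch += 1
            last_increment = t
    return tch


def cold_block_accesses():
    # One access every 3 seconds for 24 hours.
    return list(range(0, DAY_MS, 3000))


def hot_block_accesses(per_burst=100):
    # Every 10 seconds: a 2-second burst of accesses, 20 ms apart.
    times = []
    for start in range(0, DAY_MS, 10_000):
        times.extend(start + k * 20 for k in range(per_burst))
    return times
```

In this model the cold block is accessed 28800 times and ends with a touch count of 28800, while the hot block is accessed 864000 times and ends with a touch count of only 8640, which is the misleading picture described above.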
45. What is contention?
What causes latch contention?
If a required latch is busy, the process requesting it spins, tries again and if still unavailable, spins again.
The loop is repeated up to a maximum number of times determined by the hidden initialization parameter
_SPIN_COUNT. The default value of the parameter is automatically adjusted when the machine's CPU
count changes provided that the default was used. If the parameter was explicitly set then there is no
change. It is not usually recommended to change the default value for this parameter.
If after this entire loop, the latch is still not available, the process must yield the CPU and go to sleep. Initially
it sleeps for one centisecond. This time is doubled in every subsequent sleep. This causes a slowdown to
occur and results in additional CPU usage,until a latch is available. The CPU usage is a consequence of the
"spinning" of the process. "Spinning" means that the process continues to look for the availability of the latch
after certain intervals of time, during which it sleeps.
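The sleep pattern described here (one centisecond first, doubled on every subsequent sleep) is easy to tabulate. The helper below is only arithmetic on that description; the function name and the optional `cap_cs` ceiling are assumptions added for illustration, since the text does not say whether the doubling is capped.

```python
def sleep_schedule(n_sleeps, first_cs=1, cap_cs=None):
    """Return the successive sleep times in centiseconds."""
    delays = []
    delay = first_cs
    for _ in range(n_sleeps):
        # cap_cs is a hypothetical ceiling; None means pure doubling.
        delays.append(delay if cap_cs is None else min(delay, cap_cs))
        delay *= 2
    return delays


def total_wait_cs(n_sleeps, first_cs=1, cap_cs=None):
    """Total time spent sleeping after n_sleeps unsuccessful rounds."""
    return sum(sleep_schedule(n_sleeps, first_cs, cap_cs))
```

Five unsuccessful rounds give sleeps of 1, 2, 4, 8 and 16 centiseconds, or 31 centiseconds in total, which shows how quickly a contended latch turns into visible wait time.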
46. Specific latch contention.
CAUSES OF CONTENTION FOR SPECIFIC LATCHES
The latches that most frequently affect performance are those protecting the buffer
cache, areas of the shared pool and the redo buffer.
• Library cache latches: These latches protect the library cache in which sharable
SQL is stored. In a well defined application there should be little or no
contention for these latches, but in an application that uses literals instead of
bind variables (for instance “WHERE surname=’HARRISON’” rather than
“WHERE surname=:surname”), library cache contention is common.
• Redo copy/redo allocation latches: These latches protect the redo log buffer,
which buffers entries made to the redo log. Recent improvements
(from Oracle 7.3 onwards) have reduced the frequency and severity of
contention for these latches.
• Shared pool latches: These latches are held when allocations or de-allocations
of memory occur in the shared pool. Prior to Oracle 8.1.7, the most common
cause of shared pool latch contention was an overly large shared pool and/or
failure to make use of the reserved area of the shared pool.
• Cache buffers chain latches: These latches are held when sessions read or
write to buffers in the buffer cache. In Oracle8i, there are typically a very large
number of these latches each of which protects only a handful of blocks.
Contention on these latches is typically caused by concurrent access to a very
“hot” block and the most common type of such a hot block is an index root or
branch block (since any index based query must access the root block).
47. How do you identify latch contention?
select name
, gets
, misses
, (gets - misses)/gets ratio
from v$latch
where gets > 0 ;
select name
, immediate_gets
, immediate_misses
, (immediate_gets - immediate_misses)/immediate_gets ratio
from v$latch
where immediate_gets > 0 ;
48. How do you identify latch contention?
A better approach to estimating the impact of latch contention is to consider the relative
amount of time being spent waiting for latches. The following query gives us some
indication of this:
SELECT event
, time_waited
, round(time_waited*100/ SUM (time_waited) OVER(),2) wait_pct
FROM (SELECT event, time_waited
FROM v$system_event
WHERE event NOT IN ('Null event',
'client message',
'rdbms ipc reply',
'smon timer',
'rdbms ipc message',
'PX Idle Wait',
'PL/SQL lock timer',
'file open',
'pmon timer',
'WMON goes to sleep',
'virtual circuit status',
'dispatcher timer',
'SQL*Net message from client',
'parallel query dequeue wait',
'pipe get')
UNION
(SELECT NAME, VALUE
FROM v$sysstat
WHERE NAME LIKE 'CPU used when call started'))
ORDER BY 2 DESC ;
select name, gets, sleeps,
sleeps*100/sum(sleeps) over() sleep_pct, sleeps*100/gets
sleep_rate
from v$latch where gets>0
order by sleeps desc;
49. Who holds this latch?
Latch contention is not always bad. It simply means that many resources are needed.
It does, however, affect the scalability of an application.
50. Latch investigation
MUTEXES, PART 1
A brief comment about mutexes is necessary at this point because a mutex is very
similar to a latch in the way it is implemented and used. Mutexes were introduced in
the library cache processing in Oracle 10.2 as a step toward eliminating the use of
pins (which I will discuss in conjunction with library cache locking toward the end of
the following section). Essentially a mutex is a “private mini-latch” that is part of the
library cache object. This means that instead of a small number of latches covering a
large number of objects—with the associated risk of competition for latches—we now
have individual mutexes for every single library cache hash bucket, and two mutexes
(one to replace the KGL pin, the other related in some way to handling dependencies)
on every parent and child cursor, which should improve the scalability of frequently
executed statements. The downside to this change is that we have less information if
problems arise. The support code for latching contains a lot of information about who,
what, where, when, why, how often, and how much contention appeared. The code
path for operating mutexes is shorter, and captures less of this information.
Nevertheless, once you’ve seen how (and why) Oracle operates locking and pinning
in the library cache, you will recognize the performance benefits of mutexes.
Read the full story : Oracle Core: Essential Internals for DBAs and Developers
Jonathan Lewis.