The document discusses direct access to Oracle's shared memory (SGA) using C code. It describes the main regions of the SGA, how information is used automatically and by queries, and reasons for direct access such as reading hidden information or during database hangs. It outlines examining the SGA contents through externalized X$ tables, with most structures in the SGA not directly visible. The document provides a procedure to summarize waits from the V$SESSION_WAIT view by mapping it to the underlying X$KSUSECST table fields and offsets.
1. Oaktable
Jonathan Lewis and ORACLE_TRACE
Oracle_Trace crashes my Database
I start the SGA attach by searching every offset
Anjo Kolk says James Morle wrote a program using x$ksmmem
I show James my first draft using x$ksmmem
James is baffled by why I'm hard coding offsets
James says the offsets are in some X$ table
I search, turn up a mail by Jonathan Lewis on x$kqfco
Goldmine – all the offsets
Thanks Mogens Nørgaard!
Thanks to Tom Kyte's Decimal to Hex
4. Direct Oracle SGA Memory Access
Reading data directly from Oracle’s shared memory segment using C code
Wednesday, February 20, 2013
5. SGA on UNIX
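[Diagram: on UNIX, the background processes SMON, PMON, CKPT, DBWR, LGWR, ARCH, Snnn, Dnnn and Pnnn, plus the oracle server process serving sqlplus, are separate processes attached to the SGA (Shared Pool, Database Buffer Cache, Redo Log Buffer) in machine memory]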
6. SGA on NT
[Diagram: on NT, SMON, PMON, CKPT, DBWR, LGWR, ARCH, Snnn, Dnnn and Pnnn are drawn together with the SGA (Shared Pool, Database Buffer Cache, Redo Log Buffer) inside the oracle process space in machine memory, with sqlplus alongside]
7. What is the SGA
Memory Cache
Often Used Data
Rapid Access
Shareable
Concurrent Access
8. SGA 4 main regions
Fixed information
– Users info
– Database statistics
– X$dual
– etc
Data block cache
SQL cache ( library cache/shared pool)
Redo log buffer
9. How is the SGA info Used?
Automatically
– data blocks cached
– Log buffer
– Sql cache
– Updates of system and user statistics
User Queries
– User info v$session
– System info v$parameter
– Performance statistics v$sysstat, v$latch, v$system_event
– Buffer cache headers, x$bh
10. Why Direct Access with C?
Reading Hidden Information
– Sort info on version 7
– OPS locking info version 8
– Contents of data blocks (only the headers are visible in X$)
Access while Database is Hung
High Speed Access
– Sampling User Waits, catch ephemeral data
– Scan LRU chain in X$bh
– Statistically approximate statistics
SQL statistics per user
Low overhead
11. Database Slow or Hung
Often happens at the largest sites when cutting
edge support is expected.
Shared Pool errors ORA-04031
Archiver or Log file Switch Hangs
Hang Bugs
Library Cache Latch contention
ORA-00379: no free buffers available in buffer
pool DEFAULT
12. Statistical Sampling
By Rapidly Sampling SQL statistics
and the users who have the statistics
open, one can see how much work a
particular user does with a particular
SQL statement
13. Low Overhead
Marketing Appeal
Clients are sensitive about their production
databases
Heisenberg uncertainty effect – the less overhead, the less effect monitoring has on the performance we are monitoring
14. SGA made visible through x$tables
Most of the SGA is not visible
X$KSMMEM Exception, Raw Dump of SGA
Information Externalized through X$ tables
Useful or Necessary information is Externalized
Externalized publicly through V$ Tables
21. Externalization of C structs: X$ tables
If structure foo were externalized in an X$ table:
SQL> describe x$foo
Column Name Type
------------------------------ --------
ADDR RAW(8)
INDX NUMBER
ID NUMBER
B NUMBER
22. SGA is One Large C Struct
struct foo
{
int id;
int A;
int B;
int C;
};
struct foo foo[N];
23. Struct C code
#include <stdio.h>
#include <fcntl.h>
#include <string.h> /* memset */
#define N 20
/* structure definition: */
struct foo
{
int id;
int a;
int b;
int c;
};
/* end structure definition */
24. Struct Record
int main(void){
struct foo foo[N];
int fptr;
/* zero out memory of struct */
memset(foo,0,sizeof(foo));
foo[0].id=1; /* row 0 */
foo[0].a=12;
foo[0].b=13;
foo[0].c=13;
30. Struct File Contents
Address is in Hex
Column 2 is the ID
Column 3 is field A
Column 4 is field B
Column 5 is field C
31. X$ tables ?
Ok, x$foo =~ foo[20]
How do I get a list of x$ tables?
Where is each X$ located?
V$Fixed_Table
32. V$Fixed_Table – list of X$ tables
SQL> desc v$fixed_table;
Name Null? Type
----------------------------------------- -------- -----------------
NAME VARCHAR2(30)
OBJECT_ID NUMBER
TYPE VARCHAR2(5)
TABLE_NUM NUMBER
34. V$Fixed_Table
spool addr.sql
select
'select addr, '||''''||name||''''||' from ' || name ||' where rownum < 2;'
from
v$fixed_table
where
name like 'X%'
/
spool off
@addr.sql
35. Example: finding the address
select
addr ,
'X$KSUSE'
from
X$KSUSE
where
rownum < 2 ;
37. What's in these X$ views
V$ views are documented
V$ views are often based on X$ tables
The map from v$ to X$ is described in :
V$Fixed_View_Definition
39. Definition of V$Session_Wait
SQL> select
VIEW_DEFINITION
from
V$FIXED_VIEW_DEFINITION
where
view_name='GV$SESSION_WAIT';
VIEW_DEFINITION
-----------------------------------------------------------------------
select s.inst_id,s.indx,s.ksussseq,e.kslednam, e.ksledp1,s.ksussp1,s.ksussp1r,e.
ksledp2, s.ksussp2,s.ksussp2r,e.ksledp3,s.ksussp3,s.ksussp3r, decode(s.ksusstim,
0,0,-1,-1,-2,-2, decode(round(s.ksusstim/10000),0,-1,round(s.ksusstim/10000)))
, s.ksusewtm, decode(s.ksusstim, 0, 'WAITING', -2, 'WAITED UNKNOWN TIME', -1, '
WAITED SHORT TIME', 'WAITED KNOWN TIME') from x$ksusecst s, x$ksled e where bit
and(s.ksspaflg,1)!=0 and bitand(s.ksuseflg,1)!=0 and s.ksussseq!=0 and s.ksussop
c=e.indx
40. The Fields in X$ tables
OK, I've picked an X$
I've got the starting address
Now, how do I get the fields?
41. X$KQFTA
Kernel Query Fixed_view Table
INDX used to find column information
KQFTANAM X$ table names
42. X$KQFCO
Kernel Query Fixed_view Column
KQFCOTAB Join with X$KQFTA.INDX
KQFCONAM Column name
KQFCOOFF Offset from beginning of the row
KQFCOSIZ Column size in bytes
44. SGA Contents in Summary
In summary:
Oracle takes the C structure defining the SGA and maps it onto a shared memory segment
[Diagram: memory addresses increasing from 0x80000000; regions labeled Fixed SGA, Buffer Cache, Redo Buffer, Library Cache]
Oracle provides access to some of the SGA contents via X$ tables
45. **** Procedure *****
1. Choose a V$ view
2. Find base X$ Tables for v$ view
3. Map X$ fields to V$ fields
4. Get address of X$ table in SGA
5. Get the size of each record in X$ table
6. Get the number of records in X$ table
7. Get offsets for each desired field in X$ table
8. Get the base address of SGA
46. 1) V$SESSION_WAIT Example
List of all users waiting
Detailed information on the waits
Data is ephemeral
Useful in Bottleneck diagnostics
High sampling rate candidate
Event 10046 captures this info
Good table for SGA sampling
48. V$SESSION_WAIT Short
SQL> desc v$session_wait
Name Type
---------------------------- -------------
SID NUMBER
SEQ# NUMBER
EVENT VARCHAR2(64)
P1 NUMBER
P2 NUMBER
P3 NUMBER
52. 2) V$SESSION_WAIT Based on X$KSUSECST
VIEW_DEFINITION
----------------------------------------------------
select
indx,
ksussseq,
ksussopc,
ksussp1,
ksussp2,
ksussp3
from
x$ksusecst
53. Equivalent SQL Statements
select              select
  indx,               sid,
  ksussseq,           seq#,
  ksussopc,           event,
  ksussp1,            p1,
  ksussp2,            p2,
  ksussp3             p3
from                from
  x$ksusecst          v$session_wait
Note: x$ksusecst.ksussopc is the event #
x$ksled.kslednam holds the event names, where
x$ksled.indx = x$ksusecst.ksussopc
55. 4) Get base SGA address for X$ table
Find the location of X$KSUSECST in the SGA
SQL> select addr from x$ksusecst where rownum < 2
ADDR
--------
85251EF4
56. 5) Find the Size of Each Record
SQL> select
((to_dec(e.addr)-to_dec(s.addr))) row_size
from
(select addr from x$ksusecst where rownum < 2) s,
(select max(addr) addr from x$ksusecst where rownum < 3) e ;
ROW_SIZE
----------------
2328
57. 6) Find the Number of Records in the structure
SQL> select count(*) from x$ksusecst ;
COUNT(*)
--------------
170
58. Get Offsets for Each Desired Field in X$ table
SQL> select c.kqfconam field_name,
c.kqfcooff offset,
c.kqfcosiz sz
from
x$kqfco c,
x$kqfta t
where
t.indx = c.kqfcotab and
t.kqftanam='X$KSUSECST'
order by
offset
;
59. X$KQFTA - X$ Tables Names
List of X$ tables
INDX used to find column information
KQFTANAM X$ table names
To get Column information join with X$KQFCO
X$KQFTA.INDX = X$KQFCO.KQFCOTAB
60. X$KQFCO – X$ Table Columns
List of all the columns in X$ Tables
KQFCOTAB Join with X$KQFTA.INDX
KQFCONAM Column name
KQFCOOFF Offset from beginning of the row
KQFCOSIZ Column size in bytes
62. What are all the fields at OFFSET 0?
These are all calculated values and not stored
explicitly in the SGA.
ADDR memory address
INDX record number, like rownum
INST_ID database instance ID
KSUSEWTM calculated field
63. Unexposed Fields
What happens between OFFSET 1 and 1276?
• Unexposed Fields
• Sometimes exposed elsewhere, in our case
• V$SESSION
• V$SESSTAT
64. Fields at Same Address
Why do some fields start at the same address?
KSUSSP1
KSUSSP1R
Are at the same address
Equivalent of
V$SESSION_WAIT.P1
V$SESSION_WAIT.P1RAW
These are the same data, just exposed as
Decimal (P1)
Hex (P1RAW)
73. Attaching to the SGA
UNIX System Call “shmat”
To attach to shared memory, UNIX provides a system call:
void *shmat( int shmid,
const void *shmaddr,
int shmflg );
74. ID and Address arguments to “shmat”
The arguments are:
shmid – shared memory identifier specified
shmaddr – starting address of the shared memory
shmflg - flags
The argument shmflg can be set to SHM_RDONLY . To
avoid any possible data corruption the SGA should only
be attached read only.
The arguments shmid and shmaddr need to be set to
Oracle’s SGA id and address.
75. Finding Oracle SGA’s ID and Address
Use ORADEBUG to find the SGA id
SQL> oradebug setmypid
Statement processed.
SQL> oradebug ipc
Information written to trace file.
76. Finding Trace File
SQL> show parameters user_dump
NAME VALUE
----------------------- --------------------------------
user_dump_dest /u02/app/oracle/admin/V901/udump
SQL> exit
$ cd /u02/app/oracle/admin/V901/udump
$ ls -ltr | tail -1
-rw-r----- usupport dba Aug 24 18:01 v901_ora_23179.trc
77. Finding SHMID in Trace File
$ vi v901_ora_23179.trc
…
Total size 004456c Minimum Subarea size 00000000
Area Subarea Shmid Stable Addr Actual Addr
0 0 34401 0080000000 0080000000
…
78. Attaching to the SGA
shmid 34401
shmaddr 0x80000000
shmflg SHM_RDONLY
The SGA attach call in C would be:
shmat(34401, (void *)0x80000000, SHM_RDONLY);
This call needs to be executed as a UNIX user who has
read permission to the Oracle SGA
79. C Code Headers
#include <stdio.h>
#include <stdlib.h> /* atoi, exit */
#include <unistd.h> /* sleep */
#include <sys/ipc.h>
#include <sys/shm.h>
#include <errno.h>
#include "event.h"
event.h is for translating the event #s into event names
81. Define Base Addresses and Sizes
/* SGA BASE ADDRESS */
#define SGA_BASE 0x80000000
/* START ADDR of KSUSECST(V$SESSION_WAIT) */
#define KSUSECST_ADDR 0x85251EF4
/* NUMBER of ROWS/RECORDS in KSUSECST */
#define SESSIONS 170
/* SIZE in BYTES of a ROW in KSUSECST */
#define RECORD_SZ 2328
83. Set Up Variables
int main(int argc, char **argv)
{
void *addr;
int shmid;
unsigned long shmaddr; /* a signed int cannot hold 0x80000000 */
void *current_addr;
long p1r, p2r, p3r;
unsigned int i, seq, tim, flg, evn;
84. Attach to SGA
/* ATTACH TO SGA */
shmid=atoi(argv[1]);
shmaddr=SGA_BASE;
if (
(void *)shmat(
shmid,
(void *)shmaddr,
SHM_RDONLY)
== (void *)-1 ) {
printf("shmat: error attaching to SGA\n");
exit(1);
}
85. Set Up Sampling Loop
/* LOOP OVER ALL SESSIONS until CANCEL */
while (1) {
/* set current address to beginning of Table */
current_addr=(void *)KSUSECST_ADDR;
sleep(1);
printf("\033[H \033[J"); /* ANSI escapes: cursor home, clear screen */
/* print page heading */
printf("%4s %8s %-20.20s %10s %10s %10s \n",
"sid", "seq", "wait","p1","p2","p3");
86. Loop over all Sessions
for ( i=0; i < SESSIONS ; i++ ) {
/* byte-wise pointer arithmetic: cast to char *, not int, so 64-bit pointers are not truncated */
seq=*(unsigned short *)((char *)current_addr+KSUSSSEQ);
evn=*(short *) ((char *)current_addr+KSUSSOPC);
p1r=*(long *) ((char *)current_addr+KSUSSP1R);
p2r=*(long *) ((char *)current_addr+KSUSSP2R);
p3r=*(long *) ((char *)current_addr+KSUSSP3R);
if ( evn != 0 ) {
printf("%4u %8u %-20.20s %10lX %10lX %10lX \n",
i, seq, event[evn] ,(unsigned long)p1r, (unsigned long)p2r, (unsigned long)p3r
);
}
current_addr=(void *)((char *)current_addr+RECORD_SZ);
}
}
}
88. Pitfalls
Byte Swapping
32 bit vs 64 bit
Multiple Shared Memory Segments
Segmented Memory
Addresses are "unsigned int"
Misaligned Access
89. Little Endian vs Big Endian
Are the low-order bytes stored first or the high-order bytes?
a byte is 8 bits
– 00000000-11111111 bits, 0 – 255 dec, 0x0 - 0xFF hex
Big Endian is "normal", highest byte first
In ascii, the word "byte" is stored as
– b = 62, y = 79, t = 74, e = 65
echo 'byte' | od -x
– b y t e
– 62 79 74 65
Little Endian, ie byte swapped (Linux, OSF, Sequent, ? )
– y b e t
– 79 62 65 74
90. Byte Swap Example
Short = 2 bytes ie 16 bits
Goal, get the flag in the "second" byte
#ifdef __linux
uflg=*(short *)((int)sga_address)>>8;
#else
uflg=*(short *)((int)sga_address);
#endif
91. Byte Swap
Big Endian:
00 00 00 00 00 00 00 01
Little Endian
00 00 00 01 00 00 00 00
Solution, push the value over 8 places, to the
right,
ie >>8
92. 64 bit vs 32 bit
SQL> desc x$ksmmem
Name Type
------------------------------------- ---------
ADDR RAW(4)
INDX NUMBER
INST_ID NUMBER
KSMMMVAL RAW(4)
RAW(4) -> 32 bit
RAW(8) -> 64 bit
93. Segmented Memory
x$ksuse – can be non-contiguous
Work around:
select 'int users[]={' from dual;
select '0x'||addr||',' from x$ksuse;
select '0x0};' from dual;
94. Misaligned Access
Some platforms seg fault on misaligned access; depending on the platform, reads must start on even byte boundaries or in units of 4 bytes
1 2 3 4 5 6 7 8
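97. x$ksuse Record Contains x$ksusecst
One Record in X$Ksusecst
[Diagram: a single 2328-byte x$ksuse record laid out left to right as v$session, v$sesstat (x$ksusesta), v$session_wait (x$ksusecst), v$session, with offsets 236 and 1276 marked at the region boundaries]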
98. Getting v$sesstat addresses
select '#define '||
upper(translate(s.name,' :-()/*''','________'))||' '||
to_char(c.kqfcooff + STATISTIC# * 4 )
from
x$kqfco c,
x$kqfta t,
v$statname s
where
t.indx = c.kqfcotab
and ( t.kqftanam='X$KSUSESTA' ) and c.kqfconam='KSUSESTV'
and kqfcooff > 0
order by
c.kqfcooff
/
99. User Drilldown Query: 4 joins
select
w.sid sid,
w.seq# seq,
w.event event,
w.p1raw p1,
w.p2raw p2,
w.p3raw p3,
w.SECONDS_IN_WAIT ctime,
s.sql_hash_value sqlhash,
s.prev_hash_value psqlhash,
st.value cpu
from
v$session s,
v$sesstat st,
v$statname sn,
v$session_wait w
where
w.sid = s.sid and
st.sid = s.sid and
st.statistic# = sn.statistic# and
sn.name = 'CPU used when call started' and
w.event != 'SQL*Net message from client'
order by w.sid;
100. Other Fun Stuff
The next example is output from an SGA
program that follows the LRU of the Buffer
Cache
The program demonstrates the
• insertion point of LRU
• cold end of LRU
• hot end of the LRU
• Full Table Scan Insertion Point
Notice that columns A and C are missing from x$foo. Not all values in a structure used in the SGA are made visible via SQL.
If we mapped the structure onto memory or dumped it to a file, we could find the different elements.
Oracle doesn’t always expose all the fields in a structure. If the gaps between offsets are bigger than the field sizes, the underlying structure holds other information that isn’t exposed in that X$ table. (In this case those addresses are exposed, but in different X$ tables.)