© Tanel Poder 2020
blog.tanelpoder.com Advanced Oracle Troubleshooting – Hacking Session
Oracle State Objects and
System State Dumps
Hacking Session with Tanel Põder
https://blog.tanelpoder.com
@tanelpoder
About
• This is a free hacking session, not a formal training session
• Only a few slides
• Not much rehearsed or planned
• Lots of live hacking fun (hopefully!)
• Training info at blog.tanelpoder.com/seminar
• Feb 2020: Advanced Oracle Troubleshooting
• May or June 2020: Advanced Oracle SQL Tuning
• All attendees get downloadable videos (upfront if needed)
• Latest scripts in GitHub
• https://github.com/tanelpoder/tpt-oracle
• https://github.com/tanelpoder/tpt-oracle/blob/master/tools/unix/ssexplorer.sh
Please star my TPT repo if you use it :-)
There's a @help.sql script now (1st step towards having actual documentation)
Why do state objects exist?
• Let's explain the why first, before going to the what and how
State object trees
• Every operation done on a shared object needs to leave a trace in shared memory
• Can be used in case of "rollback" of that operation
• Especially important for cleaning up after dead processes
[Diagram: a tree of state objects. Node labels: process SO (repeated, as slots in an array), session SO, call SO, transaction SO, library object lock SO, enqueue SO, library cache object handle, and PMON, with an arrow labelled "pointer to first child SO".]
Views named in the diagram: x$ksupr (v$process), x$ksuse (v$session), x$ktcxb (v$transaction), x$ksqrs (v$resource), x$kglob (v$sql)
PMON checks if the SPID stored in a process SO still exists.
Number of state objects (slots in the v$process array) is controlled by the processes parameter.
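To make the tree idea a bit more concrete, here is a toy Python model. It is only a sketch and not how Oracle implements state objects: the class name, the fields, the way the nodes are arranged under each other, and the children-first release order are all assumptions made for illustration. It shows why keeping owner/child links lets a cleanup routine find everything that hangs under a dead process.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StateObject:
    """Toy stand-in for a state object; field names are illustrative only."""
    so_type: str
    # Back-pointer to the owner; excluded from repr/compare to avoid cycles.
    owner: Optional["StateObject"] = field(default=None, repr=False, compare=False)
    children: List["StateObject"] = field(default_factory=list)

    def add_child(self, so_type: str) -> "StateObject":
        child = StateObject(so_type, owner=self)
        self.children.append(child)
        return child


def cleanup(so: StateObject) -> List[str]:
    """Release a subtree children-first and return the release order."""
    released: List[str] = []
    for child in so.children:
        released.extend(cleanup(child))
    released.append(so.so_type)
    return released


# Arbitrary arrangement of the node names from the diagram (an assumption).
process = StateObject("process")
session = process.add_child("session")
session.add_child("library object lock")
session.add_child("enqueue")
call = process.add_child("call")
call.add_child("transaction")

order = cleanup(process)
print(order)
```

Walking the tree from the process SO downwards is enough to reach every child, which is the property the cleanup of dead processes relies on.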
State object structure
• Demo
Textual dumping of state object trees
• Processstate dump, systemstate dump
• Processstate dump dumps all state objects under a process
• Systemstate dump runs processstate dump for all processes (heavy operation, large trace file)
• Can be used for determining leaks and hangs in extreme circumstances
• alter session set events 'immediate trace name processstate level 266'
• oradebug -g all dump systemstate 258
• alter session set events 'immediate trace name systemstate level 266'
• alter session set events '60 trace name systemstate level 10'
Level 266 (10 + 256) gives a processstate dump at level 10 plus short_stack stack traces dumped for all processes.
The event 60 setting makes a session automatically take a systemstate dump when it hits the "ORA-60: deadlock detected" error.
In RAC you probably want to use level 258 (256 + 2) to avoid dumping lots of lock elements.
More info about state object dump levels: https://grepora.com/2017/01/04/systemstate-dump/
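The level arithmetic above (266 = 10 + 256, 258 = 2 + 256) can be checked with a few lines of Python. This is a sketch that covers only what the slide states: it treats 256 as a separate "add short stacks" flag and everything else as the base level. Treating 256 as a bit flag, and the function and key names, are my assumptions; other level components are not handled.

```python
SHORT_STACK_FLAG = 256  # per the slide: adding 256 includes short_stack traces


def split_dump_level(level: int) -> dict:
    """Split a dump level into the 256 flag and the remaining base level."""
    return {
        "short_stacks": bool(level & SHORT_STACK_FLAG),
        "base_level": level & ~SHORT_STACK_FLAG,
    }


for lvl in (266, 258, 10):
    print(lvl, split_dump_level(lvl))
```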
===================================================
PROCESS STATE
-------------
Process global information:
process: 28A4D1EC, call: 28B5FBFC, xact: 00000000, curses: 28B36EAC, usrses: 28B36EAC
----------------------------------------
SO: 28A4D1EC, type: 2, owner: 00000000, flag: INIT/-/-/0x00 <-- SO address
(process) Oracle pid=17, calls cur/top: 28B5FBFC/28B5FBFC, flag: (0) <-- SO type (process)
int error: 0, call error: 0, sess error: 0, txn error 0
(post info) last post received: 0 0 0
last post received-location: No post
last process to post me: none
last post sent: 0 0 0
last post sent-location: No post
last process posted by me: none
(latch info) wait_event=0 bits=0
Process Group: DEFAULT, pseudo proc: 28A7F368
O/S info: user: SYSTEM, term: PORGAND, ospid: 3740
OSD pid info: Windows thread id: 3740, image: ORACLE.EXE (SHAD)
Dump of memory from 0x28A3A368 to 0x28A3A4EC
28A3A360 00000005 27CF0130 [....0..']
28A3A370 00000010 0003139D 28B5FBFC 00000003 [...........(....]
28A3A380 0003139D 280AEC60 0000000B 0003139D [....`..(........]
Repeat 21 times
28A3A4E0 00000000 00000000 00000000 [............]
----------------------------------------
SO: 28B36EAC, type: 4, owner: 28A4D1EC, flag: INIT/-/-/0x00 <-- Child SO (indented)
(session) sid: 152 trans: 00000000, creator: 28A4D1EC, flag: (41) USR/- BSY/-/-/-/-/-
DID: 0001-0011-00000162, short-term DID: 0000-0000-00000000
Manual reading of a process/system state dump – structure and organization
It's easy to search for held resources by searching for your problem object's address in the dump.
For example, if you see that your problem sessions are hung waiting to get a lock on a library cache object at address X, it makes sense to search for that address X to see who else is holding a lock on that object.
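The address search described above can be sketched in Python. This is a minimal example that assumes the "SO: ..., type: ..., owner: ..." header format shown in the dump excerpt on this slide; real dumps vary between versions, so the regular expressions and the helper names (`build_owner_map`, `lines_mentioning`) are illustrative assumptions, not part of any Oracle tool. Addresses are normalized because some dumps print them with a 0x prefix and some without.

```python
import re
from collections import defaultdict

# Matches header lines such as:
#   SO: 28B36EAC, type: 4, owner: 28A4D1EC, flag: INIT/-/-/0x00
SO_HEADER = re.compile(
    r"SO:\s*(?:0x)?([0-9A-Fa-f]+),\s*type:\s*(\d+),\s*owner:\s*(?:0x)?([0-9A-Fa-f]+)"
)


def norm(addr: str) -> str:
    """Normalize an address: lowercase, no 0x prefix, no leading zeros."""
    a = addr.lower()
    if a.startswith("0x"):
        a = a[2:]
    return a.lstrip("0") or "0"


def build_owner_map(lines):
    """Return {owner_address: [(child_address, type), ...]} from SO headers."""
    children = defaultdict(list)
    for line in lines:
        m = SO_HEADER.search(line)
        if m:
            so, so_type, owner = m.groups()
            children[norm(owner)].append((norm(so), int(so_type)))
    return dict(children)


def lines_mentioning(lines, address):
    """Return (line_number, text) for lines containing the address as a token."""
    target = norm(address)
    hits = []
    for n, line in enumerate(lines, start=1):
        tokens = re.findall(r"(?:0x)?[0-9A-Fa-f]{6,}", line)
        if any(norm(t) == target for t in tokens):
            hits.append((n, line.strip()))
    return hits


# Four lines taken from the dump excerpt on this slide.
sample = """\
SO: 28A4D1EC, type: 2, owner: 00000000, flag: INIT/-/-/0x00
(process) Oracle pid=17, calls cur/top: 28B5FBFC/28B5FBFC, flag: (0)
SO: 28B36EAC, type: 4, owner: 28A4D1EC, flag: INIT/-/-/0x00
(session) sid: 152 trans: 00000000, creator: 28A4D1EC, flag: (41) USR/- BSY/-/-/-/-/-
""".splitlines()

owner_map = build_owner_map(sample)
hits = lines_mentioning(sample, "28A4D1EC")
print(owner_map)
print(hits)
```

For a real trace file you would read the lines from the file instead of `sample` and pass in the handle or lock address your hung sessions are waiting for.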
Tools for analyzing system state dumps
• Oracle: ass.awk
• Was in MOS
• Part of LTOM
• Ask your friendly support guy
• Or search in Google:
• https://www.cnblogs.com/lYng/p/9436244.html
• Use (scripts downloaded from forums) at your own risk!
• Tanel: ssexplorer.sh
• HTML-izes system state dumps
Low(er) level research – recursive sessions
...
----------------------------------------
SO: 0x30f3c504, type: 3, owner: 0x30e23050, flag: INIT/-/-/0x00
(call) sess: cur 30f13638, rec 30f0bf28, usr 30f13638; depth: 0
----------------------------------------
SO: 0x30f0bf28, type: 4, owner: 0x30f3c504, flag: INIT/-/-/0x00
(session) sid: 144 trans: (nil), creator: (nil), flag: (2) -/REC -/-/-/-/-/-
DID: 0000-0000-00000000, short-term DID: 0000-0000-00000000
txn branch: (nil)
oct: 0, prv: 0, sql: (nil), psql: (nil), user: 0/SYS
temporary object counter: 0
----------------------------------------
SO: 0x2eaa6de8, type: 53, owner: 0x30f0bf28, flag: INIT/-/-/0x00
LIBRARY OBJECT LOCK: lock=2eaa6de8 handle=2d7e4a70 mode=N
call pin=0x2ea5b1d0 session pin=(nil) hpc=0000 hlc=0000
htl=0x2eaa6e34[0x2ea44f30,0x2ea92690] htb=0x2ea92690 ssga=0x2ea91fb4
user=30f13638 session=30f0bf28 count=1 flags=[0000] savepoint=0xdb8
A separate, recursive session under the user session's call SO.
Recursive sessions are used for data dictionary queries (SELECT and DML) executed as SYS.
Recursive sessions are different from regular recursive calls (separation provided via call state object typically).
Read my old blog entry to learn about recursive sessions:
http://tech.e2sn.com/oracle/oracle-internals-and-architecture/recursive-sessions-and-ora-00018-maximum-number-of-sessions-exceeded
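Spotting such sessions in a large dump by eye gets tedious, so here is a small Python sketch that lists the sids of session state objects whose flag text contains REC. It assumes the "(session) sid: ... flag: (...) ..." line format from the excerpts in this deck and assumes that the REC marker identifies a recursive session, as the dump on this slide suggests. The function name is made up for the example.

```python
import re

# Matches lines such as:
#   (session) sid: 144 trans: (nil), creator: (nil), flag: (2) -/REC -/-/-/-/-/-
SESSION_LINE = re.compile(r"\(session\)\s+sid:\s*(\d+).*?flag:\s*\((\w+)\)\s*(.*)")


def recursive_sids(lines):
    """Return sids of session SOs whose flag text contains the REC marker."""
    sids = []
    for line in lines:
        m = SESSION_LINE.search(line)
        if m:
            flag_tokens = re.split(r"[/\s]+", m.group(3))
            if "REC" in flag_tokens:
                sids.append(int(m.group(1)))
    return sids


# One regular user session line and one recursive session line from the slides.
sample = [
    "(session) sid: 152 trans: 00000000, creator: 28A4D1EC, flag: (41) USR/- BSY/-/-/-/-/-",
    "(session) sid: 144 trans: (nil), creator: (nil), flag: (2) -/REC -/-/-/-/-/-",
]

rec = recursive_sids(sample)
print(rec)
```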
Low(er) level research – nested multi-level wait events
Current Wait Stack:
1: waiting for 'KSV master wait'
=0, =0, =0
wait_id=410 seq_num=602 snap_id=2
wait times: snap=26 min 47 sec, exc=36 min 48 sec, total=36 min 49 sec
wait times: max=infinite
wait counts: calls=739 os=739
in_wait=1 iflags=0x15a0
0: waiting for 'ASM file metadata operation'
msgop=11, locn=0, =0
wait_id=408 seq_num=599 snap_id=2
wait times: snap=0.000000 sec, exc=0.000019 sec, total=36 min 52 sec
wait times: max=infinite
wait counts: calls=0 os=0
in_wait=1 iflags=0x1520
Current Wait Stack:
1: waiting for 'CSS operation: data query'
function_id=0x4, =0x0, =0x0
wait_id=766629 seq_num=39925 snap_id=1
wait times: snap=0.000315 sec, exc=0.000315 sec, total=0.000315 sec
wait times: max=infinite, heur=0.155743 sec
wait counts: calls=0 os=0
in_wait=1 iflags=0x520
0: waiting for 'ASM file metadata operation'
msgop=0x0, locn=0xb, =0x0
wait_id=186922 seq_num=39924 snap_id=55419
wait times: snap=0.000000 sec, exc=8.818839 sec, total=380 min 53 sec
wait times: max=infinite, heur=380 min 53 sec
wait counts: calls=0 os=0
in_wait=1 iflags=0x15a0
One higher level wait temporarily switches into another operation (at lower level) and temporarily waits for that.
SQL Trace traces only the top level wait event!
See my Oracle Wait Event internals background process communication hacking session:
https://www.youtube.com/watch?v=mkmvZv58W6w
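To pull the nested waits out of a dump programmatically, something like the following Python sketch could work. It assumes the "Current Wait Stack:" section layout shown in the excerpts above and only extracts the stack position and event name of each entry; the function name and the returned structure are my own choices, and the sample is abbreviated from the slide.

```python
import re

# Matches lines such as:  1: waiting for 'KSV master wait'
WAIT_LINE = re.compile(r"^\s*(\d+):\s+waiting for '([^']+)'")


def parse_wait_stacks(lines):
    """Return a list of wait stacks; each is a list of (position, event) tuples."""
    stacks = []
    for line in lines:
        if line.strip().startswith("Current Wait Stack:"):
            stacks.append([])
            continue
        m = WAIT_LINE.match(line)
        if m and stacks:
            stacks[-1].append((int(m.group(1)), m.group(2)))
    return stacks


sample = """\
Current Wait Stack:
 1: waiting for 'KSV master wait'
    =0, =0, =0
    wait_id=410 seq_num=602 snap_id=2
 0: waiting for 'ASM file metadata operation'
    msgop=11, locn=0, =0
Current Wait Stack:
 1: waiting for 'CSS operation: data query'
    function_id=0x4, =0x0, =0x0
 0: waiting for 'ASM file metadata operation'
    msgop=0x0, locn=0xb, =0x0
""".splitlines()

stacks = parse_wait_stacks(sample)
for stack in stacks:
    print(stack)
```

Any stack with more than one entry is a nested wait of the kind discussed on this slide.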
Latchless dumping
----------------------------------------
SO: 3a8729f90, type: 3, owner: 3ab9396b8, flag: INIT/-/-/0x00
(call) sess: cur 3aaddd538, rec 0, usr 3aaddd538; depth: 0
----------------------------------------
SO: 3aab76570, type: 24, owner: 3a8729f90, flag: INIT/-/-/0x00
Aborting this subtree dump because of state inconsistency
----------------------------------------
SO: 3a7268bb8, type: 16, owner: 3ab9396b8, flag: INIT/-/-/0x00
(osp req holder)
REDO: 0x0 SINGLE / -- / --
itl: 2, sno: 131, row size 28
insert key: (24):
06 68 69 6e 74 6f 6e 09 63 6c 65 76 65 6c 61 6e 64 06 00 4e f8 c5 00 07
------------------------------------------------------
------------------------------------------------------
IMU Undo change vector list (latched dump)
------------------------------------------------------
umap: 0xccc5b1d8 uba: 0x01013e30.1ad4.42
undobh 0x3fbef4c70 cv 0xccc5b060 rcvi 0 Not applied
------------------------------------------------------
ktudb redo: siz: 112 spc: 1028 flg: 0x0012 seq: 0x1ad4 rec: 0x42
xid: 0x0017.011.00011560
ktubl redo: slt: 17 rci: 0 opc: 11.1 [objn: 75287 objd: 75287 tsn: 7]
Undo type: Regular undo Begin trans Last buffer split: No
Temp Object: No
Systemstate dumps & hanganalyze attempt to read state objects without taking latches.
I have hit bugs in the past where the systemstate dump itself was hung trying to get a held latch.
Apparently some "latched dumps" are still used (but I hope it just tries once with an immediate (nowait) latch get and moves on).
Other details (if have time)
10065, 00000, "limit library cache dump information for state object dump"
// *Document: NO
// *Cause:
// *Action: level 1 - minimal (only the address of state objects)
// level 2 - little more (no object details)
// level 3 - normal
10809, 00000, "Trace state object allocate / free history"
// *Document: NO
// *Cause:
// *Action: Set this event only under the supervision of Oracle development
// *Comment: This event will trace the history of KSS allocations / deletions.
// level: 0 = disabled, 1 = cleanup only, 2 = always
From Julian Dyke's OracleDiagnostics.ppt (event 10065 levels):
Level 1: Address of library object only
Level 2: As 1 plus library object lock details
Level 3: As 2 plus library object handle and library object
In my testing on 18.3, event 10809 only traced state object deletions/releases and no allocations/gets. Could trace kss.* function calls instead.
Additional Reading
• Tanel’s stuff
• http://tech.e2sn.com/oracle/oracle-internals-and-architecture/recursive-sessions-and-ora-00018-maximum-number-of-sessions-exceeded
• MOS Notes
• Reading and Understanding Systemstate Dumps (Doc ID 423153.1)
• Bug 11800959 - A SYSTEMSTATE dump with level >= 10 in RAC dumps huge BUSY GLOBAL CACHE ELEMENTS - can hang/crash instances (Doc ID 11800959.8)
• Julian Dyke’s internals diagrams (SGA data structures etc)
• http://www.juliandyke.com/Presentations/Presentations.php
• Frits Hoogland’s Oracle function name collection
• http://orafun.info
• http://orafun.info/stack
• https://gitlab.com/FritsHoogland/ora_functions
• Next hacking session in the end of Jan 2020 (TBA)
• Check http://blog.tanelpoder.com
• Follow https://twitter.com/TanelPoder
Thanks!

Oracle State Objects and System State Dumps

  • 1. 1 © Tanel Poder 2020 blog.tanelpoder.com Advanced Oracle Troubleshooting – Hacking Session Oracle State Objects and System State Dumps Hacking Session with Tanel Põder https://blog.tanelpoder.com @tanelpoder
  • 2. 2 © Tanel Poder 2020 blog.tanelpoder.com Advanced Oracle Troubleshooting – Hacking Session • This is a free hacking session, not a formal training session • Only a few slides • Not much rehearsed or planned • Lots of live hacking fun (hopefully!) • Training info at blog.tanelpoder.com/seminar • Feb 2020: Advanced Oracle Troubleshooting • May or June 2020: Advanced Oracle SQL Tuning • All attendees get downloadable videos (upfront if needed) • Latest scripts in GitHub • https://github.com/tanelpoder/tpt-oracle • https://github.com/tanelpoder/tpt-oracle/blob/master/tools/unix/ssexplorer.sh About Please star my TPT repo if you use it :-) There's a @help.sql script now (1st step towards having actual documentation )
  • 3. 3 © Tanel Poder 2020 blog.tanelpoder.com Advanced Oracle Troubleshooting – Hacking Session • Let's explain the why first, before going to the what and how Why do state objects exist?
  • 4. 4 © Tanel Poder 2020 blog.tanelpoder.com Advanced Oracle Troubleshooting – Hacking Session • Every operation done on a shared object needs to leave a trace in shared memory • Can be used in case of "rollback" of that operation • Especially important for cleaning up after dead processes State object trees process SO session SO call SO transaction SO library object lock SO enqueue SO PMON process SOprocess SOprocess SO process SO library cache object handle x$kglob (v$sql) x$ksqrs (v$resource) x$ktcxb (v$transaction) PMON checks if the SPID stored in this process SO still exists x$ksuse (v$session) x$ksupr (v$process) Number of state objects (slots in v$process array) is controlled by processes parameter pointer to first child SO
  • 5. 5 © Tanel Poder 2020 blog.tanelpoder.com Advanced Oracle Troubleshooting – Hacking Session • Demo State object structure
  • 6. 6 © Tanel Poder 2020 blog.tanelpoder.com Advanced Oracle Troubleshooting – Hacking Session • Processstate dump, systemstate dump • Processstate dump dumps all state objects under a process • Systemstate dump runs processstate dump for all processes (heavy operation, large trace file) • Can be used for determining leaks and hangs in extreme circumstances • alter session set events 'immediate trace name processstate level 266' • oradebug dump –g all dump systemstate 258 • alter session set events 'immediate trace name systemstate level 266' • alter session set events '60 trace name systemstate level 10' Textual dumping of state object trees Level 266 (10 + 256) to get processstate dump at level 10 + short_stack stack traces dumped for all processes This would make a session automatically take a systemstate dump when it hits "ORA-60: deadlock detected" error In RAC you probably want to use level 258 (256 + 2) to avoid dumping lots of lock elements More info about state object dump levels: https://grepora.com/2017/01/04/systemstate-dump/
  • 7. 7 © Tanel Poder 2020 blog.tanelpoder.com Advanced Oracle Troubleshooting – Hacking Session =================================================== PROCESS STATE ------------- Process global information: process: 28A4D1EC, call: 28B5FBFC, xact: 00000000, curses: 28B36EAC, usrses: 28B36EAC ---------------------------------------- SO: 28A4D1EC, type: 2, owner: 00000000, flag: INIT/-/-/0x00 <-- SO address (process) Oracle pid=17, calls cur/top: 28B5FBFC/28B5FBFC, flag: (0) <-- SO type (process) int error: 0, call error: 0, sess error: 0, txn error 0 (post info) last post received: 0 0 0 last post received-location: No post last process to post me: none last post sent: 0 0 0 last post sent-location: No post last process posted by me: none (latch info) wait_event=0 bits=0 Process Group: DEFAULT, pseudo proc: 28A7F368 O/S info: user: SYSTEM, term: PORGAND, ospid: 3740 OSD pid info: Windows thread id: 3740, image: ORACLE.EXE (SHAD) Dump of memory from 0x28A3A368 to 0x28A3A4EC 28A3A360 00000005 27CF0130 [....0..'] 28A3A370 00000010 0003139D 28B5FBFC 00000003 [...........(....] 28A3A380 0003139D 280AEC60 0000000B 0003139D [....`..(........] Repeat 21 times 28A3A4E0 00000000 00000000 00000000 [............] ---------------------------------------- SO: 28B36EAC, type: 4, owner: 28A4D1EC, flag: INIT/-/-/0x00 <-- Child SO (indented) (session) sid: 152 trans: 00000000, creator: 28A4D1EC, flag: (41) USR/- BSY/-/-/-/-/- DID: 0001-0011-00000162, short-term DID: 0000-0000-00000000 Manual reading of a process/system state dump – structure and organization Its easy to search for held resources by searching for your problem objects address in the dump. For example if you see that your problem sessions are hung waiting to get a lock on library cache object at address X, it makes sense to search for that address X to see who else is holding a lock on that object
  • 8. 8 © Tanel Poder 2020 blog.tanelpoder.com Advanced Oracle Troubleshooting – Hacking Session • Oracle: ass.awk • Was in MOS • Part of LTOM • As from your friendly support guy • Or search in Google: • https://www.cnblogs.com/lYng/p/9436244.html • Use (scripts downloaded from forums) at your own risk! • Tanel: ssexplorer.sh • HTML-izes system state dumps Tools for analyzing system state dumps
  • 9. 9 © Tanel Poder 2020 blog.tanelpoder.com Advanced Oracle Troubleshooting – Hacking Session Low(er) level research – recursive sessions ... ---------------------------------------- SO: 0x30f3c504, type: 3, owner: 0x30e23050, flag: INIT/-/-/0x00 (call) sess: cur 30f13638, rec 30f0bf28, usr 30f13638; depth: 0 ---------------------------------------- SO: 0x30f0bf28, type: 4, owner: 0x30f3c504, flag: INIT/-/-/0x00 (session) sid: 144 trans: (nil), creator: (nil), flag: (2) -/REC -/-/-/-/-/- DID: 0000-0000-00000000, short-term DID: 0000-0000-00000000 txn branch: (nil) oct: 0, prv: 0, sql: (nil), psql: (nil), user: 0/SYS temporary object counter: 0 ---------------------------------------- SO: 0x2eaa6de8, type: 53, owner: 0x30f0bf28, flag: INIT/-/-/0x00 LIBRARY OBJECT LOCK: lock=2eaa6de8 handle=2d7e4a70 mode=N call pin=0x2ea5b1d0 session pin=(nil) hpc=0000 hlc=0000 htl=0x2eaa6e34[0x2ea44f30,0x2ea92690] htb=0x2ea92690 ssga=0x2ea91fb4 user=30f13638 session=30f0bf28 count=1 flags=[0000] savepoint=0xdb8 A separate, recursive session under the user session's call SO Recursive sessions are used for data dictionary queries (SELECT and DML) executed as SYS Read my old blog entry to learn about recursive sessions: http://tech.e2sn.com/oracle/oracle-internals-and-architecture/recursive- sessions-and-ora-00018-maximum-number-of-sessions-exceeded Recursive sessions are different from regular recursive calls (separation provided via call state object typically)
  • 10. Low(er) level research – nested multi-level wait events
    Current Wait Stack:
     1: waiting for 'KSV master wait'
        =0, =0, =0
        wait_id=410 seq_num=602 snap_id=2
        wait times: snap=26 min 47 sec, exc=36 min 48 sec, total=36 min 49 sec
        wait times: max=infinite
        wait counts: calls=739 os=739
        in_wait=1 iflags=0x15a0
     0: waiting for 'ASM file metadata operation'
        msgop=11, locn=0, =0
        wait_id=408 seq_num=599 snap_id=2
        wait times: snap=0.000000 sec, exc=0.000019 sec, total=36 min 52 sec
        wait times: max=infinite
        wait counts: calls=0 os=0
        in_wait=1 iflags=0x1520
    Current Wait Stack:
     1: waiting for 'CSS operation: data query'
        function_id=0x4, =0x0, =0x0
        wait_id=766629 seq_num=39925 snap_id=1
        wait times: snap=0.000315 sec, exc=0.000315 sec, total=0.000315 sec
        wait times: max=infinite, heur=0.155743 sec
        wait counts: calls=0 os=0
        in_wait=1 iflags=0x520
     0: waiting for 'ASM file metadata operation'
        msgop=0x0, locn=0xb, =0x0
        wait_id=186922 seq_num=39924 snap_id=55419
        wait times: snap=0.000000 sec, exc=8.818839 sec, total=380 min 53 sec
        wait times: max=infinite, heur=380 min 53 sec
        wait counts: calls=0 os=0
        in_wait=1 iflags=0x15a0
    One higher level wait temporarily switches into another operation (at lower level) and temporarily waits for that
    SQL Trace traces only the top level wait event!
    See my Oracle Wait Event internals background process communication hacking session: https://www.youtube.com/watch?v=mkmvZv58W6w
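If you need to pull nested wait stacks out of many process dumps, a few lines of scripting help. This is a rough sketch assuming the "Current Wait Stack" layout shown on this slide, with one field group per line; the function name `parse_wait_stack` and the choice to keep only the level, event name and total time are my own simplifications, not an Oracle-documented format.

```python
import re

# "<level>: waiting for '<event name>'" starts each wait stack entry.
ENTRY = re.compile(r"^\s*(\d+): waiting for '([^']+)'")
# The first "wait times:" line of an entry ends with total=<duration>.
TOTAL = re.compile(r"total=(.+)$")

def parse_wait_stack(text):
    """Return the wait stack entries in dump order (highest level first)."""
    stack = []
    for line in text.splitlines():
        m = ENTRY.match(line)
        if m:
            stack.append({"level": int(m.group(1)), "event": m.group(2), "total": None})
            continue
        t = TOTAL.search(line)
        if t and stack:
            stack[-1]["total"] = t.group(1).strip()
    return stack

# First wait stack from the slide, abbreviated to the lines this sketch reads.
SAMPLE = """\
Current Wait Stack:
 1: waiting for 'KSV master wait'
    wait_id=410 seq_num=602 snap_id=2
    wait times: snap=26 min 47 sec, exc=36 min 48 sec, total=36 min 49 sec
    wait times: max=infinite
 0: waiting for 'ASM file metadata operation'
    wait_id=408 seq_num=599 snap_id=2
    wait times: snap=0.000000 sec, exc=0.000019 sec, total=36 min 52 sec
    wait times: max=infinite
"""

stack = parse_wait_stack(SAMPLE)
for entry in stack:
    print(entry["level"], entry["event"], entry["total"])
```

In the slide's example, entry 0 has the longer total time, which suggests it is the outer wait and that entry 1 is nested inside it. That reading is inferred from the timings in this one dump, so verify it against your own traces.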
  • 11. Latchless dumping
    ----------------------------------------
    SO: 3a8729f90, type: 3, owner: 3ab9396b8, flag: INIT/-/-/0x00
    (call) sess: cur 3aaddd538, rec 0, usr 3aaddd538; depth: 0
      ----------------------------------------
      SO: 3aab76570, type: 24, owner: 3a8729f90, flag: INIT/-/-/0x00
      Aborting this subtree dump because of state inconsistency
    ----------------------------------------
    SO: 3a7268bb8, type: 16, owner: 3ab9396b8, flag: INIT/-/-/0x00
    (osp req holder)

    REDO: 0x0 SINGLE / -- / --
    itl: 2, sno: 131, row size 28
    insert key: (24): 06 68 69 6e 74 6f 6e 09 63 6c 65 76 65 6c 61 6e 64 06 00 4e f8 c5 00 07
    ------------------------------------------------------
    ------------------------------------------------------
    IMU Undo change vector list (latched dump)
    ------------------------------------------------------
    umap: 0xccc5b1d8 uba: 0x01013e30.1ad4.42 undobh 0x3fbef4c70
    cv 0xccc5b060 rcvi 0 Not applied
    ------------------------------------------------------
    ktudb redo: siz: 112 spc: 1028 flg: 0x0012 seq: 0x1ad4 rec: 0x42
                xid: 0x0017.011.00011560
    ktubl redo: slt: 17 rci: 0 opc: 11.1 [objn: 75287 objd: 75287 tsn: 7]
    Undo type: Regular undo Begin trans Last buffer split: No
    Temp Object: No
    Systemstate dumps & hanganalyze attempt to read state objects without taking latches
    I have hit bugs in the past where the systemstate dump itself was hung trying to get a held latch
    Apparently some "latched dumps" are still used (but I hope it just tries once with an immediate (nowait) latch get and moves on)
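To make the "try once with an immediate (nowait) get and move on" idea concrete, here is a small analogy in Python. It is only an illustration of the general pattern using `threading.Lock`; it says nothing about how Oracle's dump code is actually implemented, and the names `dump_subtree`, `free_latch` and `held_latch` are invented for this sketch.

```python
import threading

def dump_subtree(name, latch, reader):
    """Try the latch once without blocking; skip the subtree if it is held."""
    if not latch.acquire(blocking=False):   # immediate (nowait) get
        return f"{name}: skipped (latch busy)"
    try:
        return f"{name}: {reader()}"
    finally:
        latch.release()

free_latch = threading.Lock()
held_latch = threading.Lock()
held_latch.acquire()  # simulate some other process holding this latch

results = [
    dump_subtree("subtree A", free_latch, lambda: "dumped"),
    dump_subtree("subtree B", held_latch, lambda: "dumped"),
]
for line in results:
    print(line)
```

A blocking `acquire()` in `dump_subtree` would hang forever on `held_latch`, which is the failure mode described above where the diagnostic dump itself gets stuck behind the latch it is trying to report on.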
  • 12. Other details (if have time)
    10065, 00000, "limit library cache dump information for state object dump"
    // *Document: NO
    // *Cause:
    // *Action: level 1 - minimal (only the address of state objects)
    //          level 2 - little more (no object details)
    //          level 3 - normal
    10809, 00000, "Trace state object allocate / free history"
    // *Document: NO
    // *Cause:
    // *Action: Set this event only under the supervision of Oracle development
    // *Comment: This event will trace the history of KSS allocations / deletions.
    //           level: 0 = disabled, 1 = cleanup only, 2 = always
    From Julian Dyke's OracleDiagnostics.ppt:
      Level 1: Address of library object only
      Level 2: As 1 plus library object lock details
      Level 3: As 2 plus library object handle and library object
    In my testing on 18.3 this only traced state object deletions/releases and no allocations/gets. Could trace kss.* function calls instead
  • 13. Additional Reading
    • Tanel’s stuff
      • http://tech.e2sn.com/oracle/oracle-internals-and-architecture/recursive-sessions-and-ora-00018-maximum-number-of-sessions-exceeded
    • MOS Notes
      • Reading and Understanding Systemstate Dumps (Doc ID 423153.1)
      • Bug 11800959 - A SYSTEMSTATE dump with level >= 10 in RAC dumps huge BUSY GLOBAL CACHE ELEMENTS - can hang/crash instances (Doc ID 11800959.8)
    • Julian Dyke’s internals diagrams (SGA data structures etc)
      • http://www.juliandyke.com/Presentations/Presentations.php
    • Frits Hoogland’s Oracle function name collection
      • http://orafun.info
      • http://orafun.info/stack
      • https://gitlab.com/FritsHoogland/ora_functions
  • 14. Thanks!
    • Next hacking session at the end of Jan 2020 (TBA)
    • Check http://blog.tanelpoder.com
    • Follow https://twitter.com/TanelPoder