Solaris Kernel Debugging
Mdb and DTrace
Oliver Yang
Software Engineer
Sun Microsystems, Inc.

Agenda
•   Kernel Debug Overview
•   Modular Debugger - Mdb
•   Dynamic Tracing - DTrace
•   References




Skill Sets of Kernel Debugging
• Key elements for kernel debugging
  > Kernel source code
     – http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/
  > Kernel debugging tools
  > System Architecture
     – x86/x64/SPARC
  > Programming skills
     – C/Assembly/D/Shell/Awk/Sed/Perl



Kernel Debugging Tools
• Debug In Code
   > cmn_err(9F) - Kernel version of printf(3C)
   > ASSERT - Only effective in debug kernel
• In-situ kernel debuggers
   > Kmdb, SPARC OBP
• Run time tracing
   > DTrace, lockstat, kmem allocator debugging, etc.
• Post-mortem debuggers
   > Mdb, ACT, SCAT


Difficulties of Kernel Debugging...
• The problems you may encounter
  > System Panic
  > System hang
  > Memory leaks & corruption
  > Performance issues
  > Any other functionality issues
• Constraints when bugs are found at customer sites...
  > Cannot switch to a non-production (debug) kernel
  > Cannot debug on mission-critical machines
  > The bug may not be deterministically reproducible
  > Crash dumps may be the only data available
Agenda
•   Kernel Debug Overview
•   Modular Debugger - Mdb
•   Dynamic Tracing - DTrace
•   References




Mdb - The Modular Debugger
• Mdb targets
 >   User processes
 >   User process core files
 >   Live kernel, read-only, via /dev/kmem and /dev/ksyms
 >   Live kernel with execution control, via kmdb
 >   System crash dumps
 >   User process images inside system crash dumps
 >   ELF object files
 >   Raw data files


Live Kernel Debug – Read Only
• How to run it?
 > mdb -k
• What can you do?
 > Inspect kernel data structures and kernel pages
 > /dev/kmem
    Access kernel virtual address space excluding memory
    that is associated with an I/O device
 > /dev/ksyms
    Access kernel symbols as kernel ELF definitions



Live Kernel Debug - Execution Control
• How to run it?
  > mdb -K
  > Boot system with kmdb loaded
     – x86: add the "-k" option in the GRUB menu
     – SPARC: pass "-k" or "kmdb" to the boot command in OBP
• What can you do?
  > Instruction-level control of kernel threads
    executing on each CPU
  > Set breakpoints, single-step the kernel,
    and inspect data structures in real time


Live Kernel Debug - Execution Control
• dcmds
  >   [addr]:b
  >   [addr]:d
  >   ::events or $b
  >   :z
  >   :c
  >   :e
  >   :s
  >   [syscall]::sysbp
  >   addr [,len]::wp

Post-mortem Debug - Crash Dumps
• How to use it
 > mdb unix.<n> vmcore.<n>
• What can you do?
 > Access kernel memory pages and user process
   images inside a system crash dump
 > Inspect kernel/user process data structures
   and kernel/user process pages




Post-mortem Debug - Crash Dumps
• You can get a crash dump by...
 >   A real panic
 >   Rebooting with reboot -d
 >   Entering kmdb and running $<systemdump
 >   Deadman timer
     – Set snooping to 1 in /etc/system, then reboot
     – Set deadman_enabled to 1 via mdb -kw
• savecore(1M) & dumpadm(1M)
     Dump content: kernel pages
     Dump device: /dev/dsk/c0d0s1 (swap)
     Savecore directory: /var/crash/<hostname>
     Savecore enabled: yes
Modular Debugger Basics
• General dcmds
 >   ::help
 >   ::dcmds
 >   ::formats
 >   ::dmods -l [module...]
 >   ::log -e file
 >   ::quit or $q




Modular Debugger Basics
• Inspect memory and data structures
 >   addr[,b]::dump [-g sz] [-e]
 >   addr::dis
 >   addr::print type field
 >   ::sizeof type
 >   ::offsetof type field
 >   ::enum enumname
 >   addr::array [type count] [var]
 >   addr::list type field [var]


Crash Dumps Analysis - Panic
• Panic procedure
 > Panic messages
    – Panic thread
    – Trap number
    – Pointer to the trap frame
    – CPU registers
    – Backtrace
 > Dump memory to dump device
 > Dump CPU registers to dump device
 > Reboot
 > Savecore (from dump device to file system)

Crash Dumps Analysis - Panic
• dcmds
 >   ::status
 >   ::showrev
 >   ::prtconf
 >   ::modinfo
 >   ::msgbuf
 >   [addr]$c/::stack/::stackregs
 >   [addr]::dis
 >   ::regs
 >   [rp]::print struct regs
• Know the ABIs of x86/x64/SPARC
Crash Dumps Analysis – Hang
• What conditions cause hangs?
 > Deadlock
 > Resource exhaustion
 > Hardware problems
• Debugging system hangs
 > Live debugging with kmdb
 > Forcing a crash dump and analyzing it with mdb




Crash Dumps Analysis – Hang
• Dispatcher and kernel threads
 >   [id]::cpuinfo
 >   ::cycinfo
 >   [addr]::threadlist
 >   [addr]::thread
 >   [addr]::findstack
 >   [addr]::mutex
 >   [addr]::rwlock
 >   [addr]::wchaninfo
 >   [addr]::whatthread or ::kgrep

Crash Dumps Analysis – Hang
• Kernel Memory
 >   ::memstat
 >   ::findleaks
 >   ::kmastat/::kmem_cache/::walk <cache name>
 >   ::kmausers
 >   ::vmem/::walk vmem_seg/::vmem_seg
 >   [addr]::whatis
 >   [addr]::bufctl
 >   [addr]::allocdby/[addr]::freedby
• Some of these dcmds need kmem allocator debugging enabled
 > Set kmem_flags=0xf in /etc/system, then reboot
Agenda
•   Kernel Debug Overview
•   Modular Debugger - Mdb
•   Dynamic Tracing - DTrace
•   References




Dynamic Tracing Framework
• DTrace framework includes...
 > Consumer programs running in user land
    – dtrace(1M)/intrstat(1M)/lockstat(1M)...
 > Kernel modules that provide probes to gather
   tracing data
    – dtrace(7D) and providers: syscall/fbt/sdt/vminfo...
 > A library interface (libdtrace) that consumer programs use
   to access the DTrace facility through the dtrace(7D) driver




DTrace Big Picture
[Diagram not preserved in this transcript]

Provider
• How a provider works
 > A provider represents a methodology for
   instrumenting the system
 > A provider covers a certain aspect of the system
 > A provider makes probes available to the DTrace
   framework
 > DTrace informs the provider when a probe is to be
   enabled; the provider transfers control to DTrace
   when an enabled probe fires
• Different ways to use providers
 > Watch code paths
   – fbt/sdt/syscall/pid/fsinfo/io/vminfo/proc/sched, etc.
 > Get statistical data
   – mib/lockstat/profile/sysinfo, etc.
Providers
Provider   Description
lockstat   Lock contention statistics; helps in understanding locking behavior
profile    Time-based interrupt firing at a fixed, specified interval
fbt        Entry to and return from most functions in the Solaris kernel
syscall    Entry to and return from every system call in the system
sdt        Locations that a programmer has formally designated
sysinfo    Kernel statistics classified under the name sys
vminfo     The vm kernel statistics
proc       Process creation and termination; sending and handling signals
sched      CPU scheduling
io         Disk input and output
mib        Counters in the management information bases (MIBs)
pid        Entry to and return from any function in a user process
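
As a hedged illustration of how two of the providers in the table can be combined, the sketch below uses the proc provider to report each successful exec and the profile provider's tick probe to stop after a fixed time. The 10-second limit and the output format are arbitrary choices made for this example, not something the slides specify.

```d
#!/usr/sbin/dtrace -s
#pragma D option quiet

/* proc provider: fires when a process successfully execs a new image. */
proc:::exec-success
{
        /* curpsinfo->pr_psargs holds the start of the argument string. */
        printf("%d %s\n", pid, curpsinfo->pr_psargs);
}

/* profile provider: the tick probe fires on one CPU at the given interval. */
profile:::tick-10sec
{
        /* Stop tracing after 10 seconds (an arbitrary limit for the example). */
        exit(0);
}
```

Running the script as root prints one line per new program started during the sampling window.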


Running DTrace
• D scripts
  > Run *.d scripts
     #!/usr/sbin/dtrace -s
     probe
     /predicate/
     {
          actions
     }
• Command line
  > Run dtrace command, see dtrace(1M)
     dtrace -n 'probe /predicate/ { actions }'
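
A minimal sketch of the probe/predicate/action skeleton filled in with concrete values. The process name "bash" is an assumption chosen only for illustration; any execname would do.

```d
#!/usr/sbin/dtrace -s

/*
 * probe:     entry to any system call
 * predicate: only when the calling process is named "bash"
 *            (an arbitrary name chosen for this example)
 * action:    print the process name and the system call name
 */
syscall:::entry
/execname == "bash"/
{
        printf("%s called %s\n", execname, probefunc);
}
```

The same clause can be passed on the command line with dtrace -n by quoting the whole clause as a single argument. Both forms need sufficient privileges, typically root.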

Probe
• provider:module:function:name
 > Provider
    – The instrumentation method to be used. For example,
      the syscall provider is used to monitor system calls,
      while the io provider is used to monitor disk I/O.
 > Module
    – The kernel module you want to observe
 > Function
    – The kernel function you want to observe
 > Name
    – Represents the location in the function. For example,
      use entry for name to instrument when you enter the
      function.

Probe
• A probe...
  > is defined as a 4-tuple: provider:module:function:name
  > can be listed with dtrace -l [-P|-m|-f|-n]
  > supports wildcard matching
Probe Description     Explanation
fbt::bge_intr:entry   entry into the bge_intr function
fbt::bge_*:entry      entry into any kernel function whose name starts with bge_
fbt:bge::entry        entry into any function in the bge driver
fbt:::entry           entry into any kernel function
fbt:::                all probes published by the fbt provider
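
The wildcard form from the table can be used directly in a clause. The sketch below assumes a system where the bge driver is loaded (otherwise no probes match) and counts calls per function with an aggregation; the aggregation name @calls is arbitrary.

```d
#!/usr/sbin/dtrace -s

/* Matches the entry probe of every kernel function whose name starts with bge_. */
fbt::bge_*:entry
{
        /* probefunc is the function field of the probe that fired. */
        @calls[probefunc] = count();
}
```

Press Ctrl-C to stop tracing; dtrace then prints the per-function counts.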

Predicate
• A predicate...
  > can be any D expression; the result is treated as a boolean
  > must be true for the actions to be executed

Predicate                Explanation
cpu == 0                 true if the probe fires on CPU 0
pid == 1029              true if the pid of the process that caused the probe to fire is 1029
execname != "sched"      true if the process is not the scheduler
ppid != 0 && arg0 == 0   true if the parent process ID is not 0 and the first argument is 0
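
A short sketch of a predicate in context, combining two of the conditions from the table. The choice of the read system call is an assumption made for illustration; for that probe, arg0 is the file descriptor.

```d
#!/usr/sbin/dtrace -s

/* The clause body runs only when the predicate evaluates to non-zero. */
syscall::read:entry
/cpu == 0 && execname != "sched"/
{
        printf("%s (pid %d) read from fd %d\n", execname, pid, arg0);
}
```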



Action
• An Action...
  > is executed when a probe fires
  > has two categories
     – Data Recording Action/Destructive Action
 Action         Explanation
 trace()        trace the result of a D expression
 printf()       print formatted output, C-style
 printa()       print aggregations
 ustack()       print the user stack trace
 stack()        print the kernel stack trace
 tracemem()     copy data from an address in memory to a buffer
 breakpoint()   induce a kernel breakpoint, dropping the system into kmdb (destructive)
 panic()        cause a kernel panic (destructive)
 chill()        spin for the specified number of nanoseconds (destructive)
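
A hedged example that uses only data-recording actions. The process name "sshd" is an arbitrary choice for illustration. Destructive actions such as breakpoint(), panic(), and chill() are deliberately left out; dtrace refuses them unless it is started with the -w option.

```d
#!/usr/sbin/dtrace -s

/* Record who is writing and from where; nothing here changes system state. */
syscall::write:entry
/execname == "sshd"/
{
        /* For write(2), arg2 is the requested byte count. */
        printf("%s (pid %d) writing %d bytes\n", execname, pid, arg2);

        /* User-level stack of the thread that issued the call. */
        ustack();
}
```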
Aggregation
• Aggregation syntax
 > @name[ keys ] = aggfunc( args );

 Function      Explanation
 count()       number of times the count function is called
 sum()         total value of the specified expressions
 avg()         arithmetic average of the specified expressions
 min()         smallest value among the specified expressions
 max()         largest value among the specified expressions
 lquantize()   linear frequency distribution of the values of the specified expressions, sized by the specified range
 quantize()    power-of-two frequency distribution of the values of the specified expressions
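
The aggregation syntax above can be sketched as follows. The example keys two aggregations by process name and feeds them the return value of read(2); the aggregation names @sizes and @total are arbitrary.

```d
#!/usr/sbin/dtrace -s

/* In a syscall return probe, arg0 is the return value: bytes read here. */
syscall::read:return
/arg0 > 0/
{
        /* Power-of-two distribution of read sizes, per process name. */
        @sizes[execname] = quantize(arg0);

        /* Running total of bytes read, per process name. */
        @total[execname] = sum(arg0);
}
```

Both aggregations are printed automatically when tracing stops.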

Variables
 > Scalar Variables
    – Represent individual fixed-size data objects
 > Associative Arrays
    – name [ key ] = expression ;
 > Thread-Local Variables
    – self->[variable name]
 > Clause-Local Variables
    – this->[variable name]
 > Built-in Variables
    – pre-defined scalar global variables
 > External Variables
    – the backquote (`) is a scoping operator for accessing
      variables that are defined in the OS, e.g. `kmem_flags
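
A sketch showing three of the variable kinds above in one script: an external kernel variable, a thread-local variable, and a clause-local variable. Timing read(2) is an assumption chosen for illustration, and the names ts, elapsed, and @lat are arbitrary.

```d
#!/usr/sbin/dtrace -s

BEGIN
{
        /* External variable: the backquote reaches the kernel's kmem_flags. */
        printf("kmem_flags = 0x%x\n", `kmem_flags);
}

syscall::read:entry
{
        /* Thread-local: each thread gets its own copy of self->ts. */
        self->ts = timestamp;
}

syscall::read:return
/self->ts/
{
        /* Clause-local: this->elapsed exists only while this clause runs. */
        this->elapsed = timestamp - self->ts;
        @lat[execname] = quantize(this->elapsed);

        /* Assigning zero releases the thread-local storage. */
        self->ts = 0;
}
```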
Built-in Variables
Type and Name          Explanation
int64_t arg0...arg9    The first ten input arguments to the probe
cpuinfo_t *curcpu      The CPU information for the current CPU
processorid_t cpu      The CPU identifier for the current CPU
kthread_t *curthread   Address of the kthread_t for the current kernel thread
pid_t pid              The process ID of the current process
pid_t ppid             The parent process ID of the current process
uint_t ipl             The interrupt priority level (IPL) on the current CPU at probe firing time
int errno              The error value returned by the last system call
string execname        The name passed to exec(2) to execute the current process
uint64_t timestamp     A nanosecond timestamp counter; it increments from an arbitrary point in the past and should only be used for relative computations
uint64_t vtimestamp    A nanosecond counter of the time the current thread has been running on a CPU, minus the time spent in DTrace predicates and actions
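
Several of these built-in variables can be combined as aggregation keys. The sketch below is one possible use, counting failed system calls by process, call name, and error number; the aggregation name @fails is arbitrary.

```d
#!/usr/sbin/dtrace -s

/* errno is non-zero in a return probe when the system call failed. */
syscall:::return
/errno != 0/
{
        @fails[execname, pid, probefunc, errno] = count();
}
```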
Agenda
•   Kernel Debug Overview
•   Modular Debugger - Mdb
•   Dynamic Tracing - DTrace
•   References




Documentation & Links - Mdb
• Solaris Internals Second Edition
 > www.solarisinternals.com
• Solaris Modular Debugger Guide
 > docs.sun.com/app/docs/doc/817-2543
• OpenSolaris mdb community
 > opensolaris.org/os/community/mdb
• Crash Dump analysis
 > opensolaris.org/os/community/documentation/files/book.pdf

Documentation & Links - DTrace
• Solaris Internals Second Edition
 > www.solarisinternals.com/wiki/index.php/DTrace_Topics
• Solaris Dynamic Tracing Guide
 > docs.sun.com/app/docs/doc/819-3620
• OpenSolaris DTrace community
 > opensolaris.org/os/community/dtrace
• DTrace Tools
 > www.brendangregg.com/dtrace.html


Q&A
Oliver Yang
Oliver.Yang@Sun.COM


                      36

Más contenido relacionado

La actualidad más candente

Android booting sequece and setup and debugging
Android booting sequece and setup and debuggingAndroid booting sequece and setup and debugging
Android booting sequece and setup and debugging
Utkarsh Mankad
 

La actualidad más candente (20)

eBPF Perf Tools 2019
eBPF Perf Tools 2019eBPF Perf Tools 2019
eBPF Perf Tools 2019
 
Ansible
AnsibleAnsible
Ansible
 
Docker containers : introduction
Docker containers : introductionDocker containers : introduction
Docker containers : introduction
 
UM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of SoftwareUM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of Software
 
Effective service and resource management with systemd
Effective service and resource management with systemdEffective service and resource management with systemd
Effective service and resource management with systemd
 
Linux Native, HTTP Aware Network Security
Linux Native, HTTP Aware Network SecurityLinux Native, HTTP Aware Network Security
Linux Native, HTTP Aware Network Security
 
Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!
 
Performance Analysis Tools for Linux Kernel
Performance Analysis Tools for Linux KernelPerformance Analysis Tools for Linux Kernel
Performance Analysis Tools for Linux Kernel
 
I/O仮想化最前線〜ネットワークI/Oを中心に〜
I/O仮想化最前線〜ネットワークI/Oを中心に〜I/O仮想化最前線〜ネットワークI/Oを中心に〜
I/O仮想化最前線〜ネットワークI/Oを中心に〜
 
Linux Network Stack
Linux Network StackLinux Network Stack
Linux Network Stack
 
Kernel Recipes 2019 - ftrace: Where modifying a running kernel all started
Kernel Recipes 2019 - ftrace: Where modifying a running kernel all startedKernel Recipes 2019 - ftrace: Where modifying a running kernel all started
Kernel Recipes 2019 - ftrace: Where modifying a running kernel all started
 
Introduction to yocto
Introduction to yoctoIntroduction to yocto
Introduction to yocto
 
Arm device tree and linux device drivers
Arm device tree and linux device driversArm device tree and linux device drivers
Arm device tree and linux device drivers
 
Fun with Network Interfaces
Fun with Network InterfacesFun with Network Interfaces
Fun with Network Interfaces
 
Embedded linux network device driver development
Embedded linux network device driver developmentEmbedded linux network device driver development
Embedded linux network device driver development
 
eBPF - Rethinking the Linux Kernel
eBPF - Rethinking the Linux KerneleBPF - Rethinking the Linux Kernel
eBPF - Rethinking the Linux Kernel
 
Understanding kube proxy in ipvs mode
Understanding kube proxy in ipvs modeUnderstanding kube proxy in ipvs mode
Understanding kube proxy in ipvs mode
 
Docker, LinuX Container
Docker, LinuX ContainerDocker, LinuX Container
Docker, LinuX Container
 
The basic concept of Linux FIleSystem
The basic concept of Linux FIleSystemThe basic concept of Linux FIleSystem
The basic concept of Linux FIleSystem
 
Android booting sequece and setup and debugging
Android booting sequece and setup and debuggingAndroid booting sequece and setup and debugging
Android booting sequece and setup and debugging
 

Destacado (7)

Solaris DTrace, An Introduction
Solaris DTrace, An IntroductionSolaris DTrace, An Introduction
Solaris DTrace, An Introduction
 
A brief history of DTrace
A brief history of DTraceA brief history of DTrace
A brief history of DTrace
 
SSD based storage tuning for databases
SSD based storage tuning for databasesSSD based storage tuning for databases
SSD based storage tuning for databases
 
DTrace talk at Oracle Open World
DTrace talk at Oracle Open WorldDTrace talk at Oracle Open World
DTrace talk at Oracle Open World
 
DTrace - Miracle Scotland Database Forum
DTrace - Miracle Scotland Database ForumDTrace - Miracle Scotland Database Forum
DTrace - Miracle Scotland Database Forum
 
#lspe Building a Monitoring Framework using DTrace and MongoDB
#lspe Building a Monitoring Framework using DTrace and MongoDB#lspe Building a Monitoring Framework using DTrace and MongoDB
#lspe Building a Monitoring Framework using DTrace and MongoDB
 
Christo kutrovsky oracle rac solving common scalability problems
Christo kutrovsky   oracle rac solving common scalability problemsChristo kutrovsky   oracle rac solving common scalability problems
Christo kutrovsky oracle rac solving common scalability problems
 

Similar a Solaris Kernel Debugging V1.0

Android Boot Time Optimization
Android Boot Time OptimizationAndroid Boot Time Optimization
Android Boot Time Optimization
Kan-Ru Chen
 
Andy Davis' Black Hat USA Presentation Revealing embedded fingerprints
Andy Davis' Black Hat USA Presentation Revealing embedded fingerprintsAndy Davis' Black Hat USA Presentation Revealing embedded fingerprints
Andy Davis' Black Hat USA Presentation Revealing embedded fingerprints
NCC Group
 
Summit demystifying systemd1
Summit demystifying systemd1Summit demystifying systemd1
Summit demystifying systemd1
Susant Sahani
 
Auditing the Opensource Kernels
Auditing the Opensource KernelsAuditing the Opensource Kernels
Auditing the Opensource Kernels
Silvio Cesare
 
HKG18-TR14 - Postmortem Debugging with Coresight
HKG18-TR14 - Postmortem Debugging with CoresightHKG18-TR14 - Postmortem Debugging with Coresight
HKG18-TR14 - Postmortem Debugging with Coresight
Linaro
 
Tuning parallelcodeonsolaris005
Tuning parallelcodeonsolaris005Tuning parallelcodeonsolaris005
Tuning parallelcodeonsolaris005
dflexer
 

Similar a Solaris Kernel Debugging V1.0 (20)

A22 Introduction to DTrace by Kyle Hailey
A22 Introduction to DTrace by Kyle HaileyA22 Introduction to DTrace by Kyle Hailey
A22 Introduction to DTrace by Kyle Hailey
 
Lecture 6 Kernel Debugging + Ports Development
Lecture 6 Kernel Debugging + Ports DevelopmentLecture 6 Kernel Debugging + Ports Development
Lecture 6 Kernel Debugging + Ports Development
 
Linux kernel debugging
Linux kernel debuggingLinux kernel debugging
Linux kernel debugging
 
Android Boot Time Optimization
Android Boot Time OptimizationAndroid Boot Time Optimization
Android Boot Time Optimization
 
Andy Davis' Black Hat USA Presentation Revealing embedded fingerprints
Andy Davis' Black Hat USA Presentation Revealing embedded fingerprintsAndy Davis' Black Hat USA Presentation Revealing embedded fingerprints
Andy Davis' Black Hat USA Presentation Revealing embedded fingerprints
 
Summit demystifying systemd1
Summit demystifying systemd1Summit demystifying systemd1
Summit demystifying systemd1
 
Auditing the Opensource Kernels
Auditing the Opensource KernelsAuditing the Opensource Kernels
Auditing the Opensource Kernels
 
Kernel Recipes 2015 - Kernel dump analysis
Kernel Recipes 2015 - Kernel dump analysisKernel Recipes 2015 - Kernel dump analysis
Kernel Recipes 2015 - Kernel dump analysis
 
Basics_of_Kernel_Panic_Hang_and_ Kdump.pdf
Basics_of_Kernel_Panic_Hang_and_ Kdump.pdfBasics_of_Kernel_Panic_Hang_and_ Kdump.pdf
Basics_of_Kernel_Panic_Hang_and_ Kdump.pdf
 
HKG18-TR14 - Postmortem Debugging with Coresight
HKG18-TR14 - Postmortem Debugging with CoresightHKG18-TR14 - Postmortem Debugging with Coresight
HKG18-TR14 - Postmortem Debugging with Coresight
 
Debug generic process
Debug generic processDebug generic process
Debug generic process
 
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak   CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
 
Introduction Linux Device Drivers
Introduction Linux Device DriversIntroduction Linux Device Drivers
Introduction Linux Device Drivers
 
Containers with systemd-nspawn
Containers with systemd-nspawnContainers with systemd-nspawn
Containers with systemd-nspawn
 
CNIT 126: 10: Kernel Debugging with WinDbg
CNIT 126: 10: Kernel Debugging with WinDbgCNIT 126: 10: Kernel Debugging with WinDbg
CNIT 126: 10: Kernel Debugging with WinDbg
 
Practical Malware Analysis: Ch 10: Kernel Debugging with WinDbg
Practical Malware Analysis: Ch 10: Kernel Debugging with WinDbgPractical Malware Analysis: Ch 10: Kernel Debugging with WinDbg
Practical Malware Analysis: Ch 10: Kernel Debugging with WinDbg
 
Tuning parallelcodeonsolaris005
Tuning parallelcodeonsolaris005Tuning parallelcodeonsolaris005
Tuning parallelcodeonsolaris005
 
Découvrir dtrace en ligne de commande.
Découvrir dtrace en ligne de commande.Découvrir dtrace en ligne de commande.
Découvrir dtrace en ligne de commande.
 
SMP implementation for OpenBSD/sgi
SMP implementation for OpenBSD/sgiSMP implementation for OpenBSD/sgi
SMP implementation for OpenBSD/sgi
 
Linux kernel debugging
Linux kernel debuggingLinux kernel debugging
Linux kernel debugging
 

Último

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Último (20)

What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 

Solaris Kernel Debugging V1.0

  • 1. Solaris Kernel Debugging Mdb and DTrace Oliver Yang Software Engineer Sun Mircosystem, Inc. 1
  • 2. Agenda • Kernel Debug Overview • Modular Debugger - Mdb • Dynamic Tracing - DTrace • References 2
  • 3. Skill Sets of Kernel Debugging • Key elements for kernel debugging > Kernel source code – http://src.opensolaris.org/source/xref/onnv/o nnv-gate/usr/src/ > Kernel debugging tools > System Architecture – x32/x64/SPARC > Programing skills – C/Assembly/D/Shell/Awk/Sed/Perl 3
  • 4. Kernel Debugging Tools • Debug In Code > cmn_err(9F) - Kernel version of printf(3C) > ASSERT - Only effective in debug kernel • In-situ kernel debuggers > Kmdb, SPARC OBP • Run time tracing > DTrace, Lockstat, Kmem allocator...etc. • Post-mortem debuggers > Mdb, ACT, SCAT 4
  • 5. Difficulties of Kernel Debugging... • The problems you may encounter > System Panic > System hang > Memory leaks & corruption > Performance issues > Any other functionality issues • Some of hot bugs found on customer sites... > Can not debug on the non-production kernel > Can not debug on mission-critical machines > May not be deterministically reproduced > May only have the crash dumps 5
  • 6. Agenda • Kernel Debug Overview • Modular Debugger - Mdb • Dynamic Tracing - DTrace • References 6
  • 7. Mdb - The Modular Debugger • Mdb targets > User processes > User process core files > Live kernel read only by /dev/kmem&/dev/ksyms > Live Kernel with execution control by kmdb > System crash dumps > User process images inside system crash dumps > ELF object files > Raw data files 7
  • 8. Live Kernel Debug – Read Only • How to run it? > mdb -k • What you can do? > Inspect kernel data structures and kernel pages > /dev/kmem Access kernel virtual address space excluding memory that is associated with an I/O device > /dev/ksyms Access kernel symbols as kernel ELF definitions 8
  • 9. Live Kernel Debug - Execution Control • How to run it? > mdb -K > Boot system with kmdb loaded – x86 “-k”option in grub menu – SPARC “-k or kmdb” option in OBP • What you can do? > Instruction-level control of kernel threads executing on each CPU > Setting breakpoint and single-step the kernel and inspect data structures in real time 9
  • 10. Live Kernel Debug - Execution Control • dcmds > [addr]:b > [addr]:d > ::events or $b > :z > :c > :e > :s > [syscall]::sysbp > addr [,len]::wp 10
  • 11. Post-mortem Debug - Crash Dumps • How to use it > mdb unix.<n> vmcore.<n> • What you can do? > Access kernel memory pages and user process images inside a system crash dump > Inspect kernel/user process data structures and kernel/user process pages 11
  • 12. Post-mortem Debug - Crash Dumps • You can get a crash dump by... > A real panic > Reboot with -d > Enter kmdb, run $<systemdump > Deadman timer – Setting snooping to 1 in /etc/system, reboot – Setting deadman_enabled to 1 via mdb -kw • savecore(1M) & dumpadm(1M) Dump content: kernel pages Dump device: /dev/dsk/c0d0s1 (swap) Savecore directory: /var/crash/<hostname> Savecore enabled: yes 12
  • 13. Modular Debugger Basic • General Dcmds > ::help > ::dcmds > ::formats > ::dmods -l [module...] > ::log -e file > ::quit or $q 13
  • 14. Modular Debugger Basic • Inspect memory and data structures > addr[,b]::dump [-g sz] [-e] > addr::dis > addr::print type field > ::sizeof type > ::offsetof type field > ::enum enumname > addr::array [type count] [var] > addr::list type field [var] 14
  • 15. Crash Dumps Analysis - Panic • Panic procedures > Panic messages – Panic thread – Trap number – Pointer of trap frame – CPU registers – back trace > Dump memory to dump device > Dump CPU registers to dump device > Reboot > Savecore (from dump device to file system) 15
  • 16. Crash Dumps Analysis - Panic • dcmds > ::satus > ::showrev > ::prtconf > ::modinfo > ::msgbuf > [addr]$c/::stack/::stackregs > [addr]::dis > ::regs > [rp]::print struct regs • Know the ABIs of x32/x64/SPARC 16
  • 17. Crash Dumps Analysis – Hang • What conditions cause hangs? > Deadlock > Resources exhaustion > Hardware problems • Debugging system hangs > Live debugging with kmdb > Forcing a crash dump and analysis with mdb 17
  • 18. Crash Dumps Analysis – Hang • Dispatcher and kernel threads > [id]::cpuinfo > ::cycinfo > [addr]::threadlist > [addr]::thread > [addr]::findstack > [addr]::mutex > [addr]::rwlock > [addr]::wchaninfo > [addr]::whatthread or ::kgrep 18
  • 19. Crash Dumps Analysis – Hang
    • Kernel Memory
      > ::memstat
      > ::findleaks
      > ::kmastat/::kmem_cache/::walk <cache name>
      > ::kmausers
      > ::vmem/::walk vmem_seg/::vmem_seg
      > [addr]::whatis
      > [addr]::bufctl
      > [addr]::allocdby/[addr]::freedby
    • Some dcmds need kmem allocator tracing
      > Set kmem_flags = 0xf in /etc/system, then reboot
  • 21. Dynamic Tracing Framework
    • DTrace framework includes...
      > Consumer programs running in user land
        – dtrace(1M)/intrstat(1M)/lockstat(1M)...
      > Kernel modules that provide probes to gather tracing data
        – dtrace(7D) and providers: syscall/fbt/sdt/vminfo...
      > A library interface that consumer programs use to access the DTrace facility through the dtrace driver
  • 23. Provider
    • How a provider works
      > Provider represents a methodology for instrumenting the system
      > Provider covers a certain aspect of the system
      > Provider makes probes available to the DTrace framework
      > DTrace informs providers when a probe is to be enabled
      > Provider transfers control to DTrace when an enabled probe is hit
    • Using providers in different ways
      > Watch code path
        – fbt/sdt/syscall/pid/fsinfo/io/vminfo/proc/sched, etc.
      > Get statistical data
        – mib/lockstat/profile/sysinfo, etc.
  • 24. Providers
      Provider   Description
      lockstat   lock contention statistics, or understanding locking behavior
      profile    a time-based interrupt firing at every fixed, specified interval
      fbt        entry to and return from most functions in the Solaris kernel
      syscall    entry to and return from every system call in the system
      sdt        locations that a programmer has formally designated
      sysinfo    correspond to kernel statistics classified by the name sys
      vminfo     correspond to the vm kernel statistics
      proc       process creation and termination, sending and handling signals
      sched      related to CPU scheduling
      io         related to disk input and output
      mib        related to counters in MIBs (management information bases)
      pid        entry and return of any function in a user process
  • 25. Running DTrace
    • D scripts
      > Run *.d scripts
          #!/usr/sbin/dtrace -s
          probe
          /predicate/
          {
                  actions
          }
    • Command line
      > Run dtrace command, see dtrace(1M)
          dtrace -n probe'/predicate/{actions}'
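As a concrete instance of the probe /predicate/ { actions } layout, here is a minimal sketch of a D script. The choice of write(2) and the execname filter are illustrative assumptions, not something taken from the slides.

```d
#!/usr/sbin/dtrace -s

/* Suppress the default per-probe output so only printf() lines appear. */
#pragma D option quiet

/*
 * Probe:     entry into the write(2) system call (chosen for illustration).
 * Predicate: skip writes issued by the dtrace consumer itself.
 * Action:    print the caller; arg0 is the file descriptor and arg2 the
 *            requested byte count.
 */
syscall::write:entry
/execname != "dtrace"/
{
	printf("%s (pid %d) write(fd=%d, %d bytes)\n",
	    execname, pid, arg0, arg2);
}
```

Saved under a name of your choice and made executable, the script runs directly as root. The same clause can also be passed as a single quoted string to dtrace -n, with -q in place of the quiet pragma.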
  • 26. Probe
    • provider:module:function:name
      > Provider
        – The instrumentation method to be used. For example, the syscall provider is used to monitor system calls, while the io provider is used to monitor disk I/O.
      > Module
        – The kernel module you want to observe
      > Function
        – The kernel function you want to observe
      > Name
        – Represents the location in the function. For example, use entry as the name to instrument the point where the function is entered.
  • 27. Probe
    • A probe...
      > is defined as a 4-attribute tuple
      > can be listed by dtrace -l [-f|-i|-m|-n|-P]
      > supports wildcard matching
      Probe Description      Explanation
      fbt::bge_intr:entry    entry into the bge_intr function
      fbt::bge_*:entry       entry into any kernel function whose name starts with bge_
      fbt:bge::entry         entry into any bge driver function
      fbt:::entry            entry into any kernel function
      fbt:::                 all probes published by the fbt provider
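The wildcard form can be tried with a small sketch like the following. It assumes the bge driver from the slide's examples is loaded; on other hardware a different prefix would need to be substituted.

```d
#!/usr/sbin/dtrace -s

/*
 * Wildcard probe description: every fbt entry probe whose function
 * name starts with bge_. The module field is left empty, so it
 * matches any module.
 */
fbt::bge_*:entry
{
	/* probemod and probefunc name the module and function that fired. */
	@calls[probemod, probefunc] = count();
}
```

Pressing Ctrl-C ends tracing and prints the per-function counts. To see which probes a description matches without enabling them, the same string can be given to dtrace -l -n.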
  • 28. Predicate
    • A predicate...
      > can be any D expression; the result is treated as a boolean
      > must be true for the actions to be executed
      Predicate                 Explanation
      cpu == 0                  true if the probe executes on cpu0
      pid == 1029               true if the pid of the process that caused the probe to fire is 1029
      execname != "sched"       true if the process is not the scheduler
      ppid != 0 && arg0 == 0    true if the parent process id is not 0 and the first argument is 0
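A short sketch of predicates in use follows. The $target macro variable (the pid given with -p or started with -c) and the choice of read(2) are assumptions made for illustration.

```d
#!/usr/sbin/dtrace -s

/* Simple predicate: count system calls made by the traced process only. */
syscall:::entry
/pid == $target/
{
	@calls[probefunc] = count();
}

/*
 * Compound predicate: for read(2), arg0 is the file descriptor, so this
 * clause fires only for reads from descriptor 0 in the traced process.
 */
syscall::read:entry
/pid == $target && arg0 == 0/
{
	@stdin_reads = count();
}
```

One way to run it is dtrace -s with the script file and -c followed by a command; the aggregations are printed when that command exits.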
  • 29. Action
    • An action...
      > is executed when a probe fires
      > has two categories
        – Data Recording Action/Destructive Action
      Action          Explanation
      trace()         trace the D expression results
      printf()        print something using C-style printf()
      printa()        print the aggregations
      ustack()        print the user stack trace
      stack()         print the kernel stack trace
      tracemem()      copy data from an address in memory to a buffer
      breakpoint()    a kernel breakpoint, causes the system to drop into kmdb
      panic()         cause a kernel panic
      chill()         spin for the specified number of nanoseconds
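The data-recording actions can be combined in one clause, as in this minimal sketch. The ioctl(2) probe and the $target filter are illustrative assumptions.

```d
#!/usr/sbin/dtrace -s

syscall::ioctl:entry
/pid == $target/
{
	/* Data-recording actions: they only add data to the trace buffer. */
	trace(arg1);	/* second ioctl argument: the request code */
	ustack(5);	/* top five user-level stack frames */
}
```

Destructive actions such as breakpoint(), panic() and chill() are rejected unless dtrace is started with -w, so a script like the one above cannot alter system state by accident.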
  • 30. Aggregation
    • Aggregation syntax
      > @name[ keys ] = aggfunc( args );
      Functions      Explanation
      count()        number of times the count function is called
      sum()          total value of the specified expressions
      avg()          arithmetic average of the specified expressions
      min()          smallest value among the specified expressions
      max()          largest value among the specified expressions
      lquantize()    a linear frequency distribution of the values of the specified expressions, sized by the specified range
      quantize()     a power-of-2 frequency distribution of the values of the specified expressions
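The aggregating functions can be compared side by side with a sketch like this one. The read(2) probe and the lquantize() bounds (0 to 4096 in steps of 512) are arbitrary choices for illustration.

```d
#!/usr/sbin/dtrace -s

/*
 * On a syscall return probe, arg0 holds the return value, which for
 * read(2) is the number of bytes read (or -1 on error).
 */
syscall::read:return
/arg0 > 0/
{
	@calls[execname] = count();		/* successful reads */
	@bytes[execname] = sum(arg0);		/* total bytes read */
	@largest[execname] = max(arg0);		/* biggest single read */
	@pow2[execname] = quantize(arg0);	/* power-of-2 histogram */
	/* linear histogram: lower bound 0, upper bound 4096, step 512 */
	@linear[execname] = lquantize(arg0, 0, 4096, 512);
}
```

Each aggregation is keyed by execname and printed when tracing stops, so the output has one row or histogram per program name.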
  • 31. Variables
      > Scalar Variables
        – Represent individual fixed-size data objects
      > Associative Arrays
        – name [ key ] = expression ;
      > Thread-Local Variables
        – self->[variable name]
      > Clause-Local Variables
        – this->[variable name]
      > Built-in Variables
        – pre-defined scalar global variables
      > External Variables
        – the ` character is a scoping operator for accessing variables that are defined in the OS, e.g. `kmem_flags
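A sketch that uses three of these variable kinds together is shown below: an external variable, a thread-local variable and a clause-local variable. Timing read(2) is an assumption made for illustration; the names ts and elapsed are hypothetical.

```d
#!/usr/sbin/dtrace -s
#pragma D option quiet

BEGIN
{
	/* External variable: the kernel's kmem_flags, reached through `. */
	printf("kmem_flags = 0x%x\n", `kmem_flags);
}

syscall::read:entry
{
	/* Thread-local: each thread keeps its own start time. */
	self->ts = timestamp;
}

syscall::read:return
/self->ts/
{
	/* Clause-local: scratch value that lives only within this clause. */
	this->elapsed = timestamp - self->ts;
	@avg_ns[execname] = avg(this->elapsed);

	/* Assigning zero releases the thread-local storage. */
	self->ts = 0;
}
```

The predicate on the return clause skips reads that were already in progress when tracing started, since those threads never set self->ts.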
  • 32. Built-in Variables
      Type and Name            Explanation
      int64_t arg0...arg9      The first 10 input arguments
      cpuinfo_t *curcpu        The CPU information for the current CPU
      processorid_t cpu        The CPU identifier for the current CPU
      kthread_t *curthread     kthread_t address for the current kernel thread
      pid_t pid                The process ID of the current process
      pid_t ppid               The parent process ID of the current process
      uint_t ipl               IPL on the current CPU at probe firing time
      int errno                Error value returned by the last system call
      string execname          Name passed to exec(2) to execute the process
      uint64_t timestamp       A nanosecond timestamp counter; it increments from an arbitrary point in the past and should only be used for relative computations
      uint64_t vtimestamp      A nanosecond timestamp counter of the amount of time the current thread has been running on a CPU, minus the time spent in predicates and actions
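Several of these built-in variables can be seen working together in a small sketch that counts failing system calls. Which variables to key on is an illustrative choice.

```d
#!/usr/sbin/dtrace -s

/*
 * errno is non-zero when the system call that just returned failed.
 * probefunc names the system call; execname and pid identify the caller.
 */
syscall:::return
/errno != 0/
{
	@failures[execname, pid, probefunc, errno] = count();
}
```

When tracing stops, each output row shows a program, its pid, the system call, the error number and how often that combination occurred.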
  • 34. Documentation & Links - Mdb
    • Solaris Internals Second Edition
      > www.solarisinternals.com
    • Solaris Modular Debugger Guide
      > docs.sun.com/app/docs/doc/817-2543
    • OpenSolaris mdb community
      > opensolaris.org/os/community/mdb
    • Crash Dump analysis
      > opensolaris.org/os/community/documentation/files/book.pdf
  • 35. Documentation & Links - DTrace
    • Solaris Internals Second Edition
      > www.solarisinternals.com/wiki/index.php/DTrace_Topics
    • Solaris Dynamic Tracing Guide
      > docs.sun.com/app/docs/doc/819-3620
    • OpenSolaris DTrace community
      > opensolaris.org/os/community/dtrace
    • DTrace Tools
      > www.brendangregg.com/dtrace.html