2. GÖKHAN ATIL
➤ DBA Team Lead with 15+ years of experience
➤ Oracle ACE Director (2016), ACE (2011)
➤ 10g/11g and R12 OCP
➤ Founding Member and Vice President of TROUG
➤ Co-author of Expert Oracle Enterprise Manager 12c
➤ Blogger (since 2008) gokhanatil.com
➤ Twitter: @gokhanatil
3. INTRODUCTION
➤ This session will cover only the tools shipped with Oracle
Enterprise Linux 7 (no external repositories required).
➤ Power user tools / “root” privileges are not required.
➤ There are three main sections:
➤ Quick System Health Check (USE Method)
➤ Profilers & Tracing
➤ Other Useful Stuff
5. THE USE (UTILIZATION, SATURATION AND ERRORS) METHOD
➤ For every resource, check:
1. Utilization: busy time
2. Saturation: queue length or queued time
3. Errors
➤ You may check Brendan Gregg’s website:
http://www.brendangregg.com/usemethod.html
[Table: Utilization (%), Saturation and Errors for each resource: CPU, RAM, Storage, Network]
6. UPTIME
➤ Shows the load average: the average number of processes
(runnable + uninterruptible) over the past 1, 5 and 15 minutes.
➤ Check if load is higher than CPU count.
➤ Useful to see the trend of “load”.
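The "load higher than CPU count" check can be scripted. The sketch below is only an illustration: the function name load_vs_cpus is made up here, and it assumes the 1-minute load average is read from /proc/loadavg and the CPU count comes from nproc.

```shell
# load_vs_cpus: print the 1-minute load average, the CPU count and a verdict.
# Assumption: /proc/loadavg and nproc (coreutils) are available.
load_vs_cpus() {
    cpus=$(nproc)
    # The first field of /proc/loadavg is the 1-minute load average.
    read -r load1 _ < /proc/loadavg
    awk -v l="$load1" -v c="$cpus" 'BEGIN {
        printf "load1=%s cpus=%d %s\n", l, c, (l > c ? "HIGH" : "ok")
    }'
}
```

Calling load_vs_cpus prints a single line; "HIGH" only means the load is above the CPU count at that moment, so the 5 and 15 minute values are still worth a look for the trend.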
7. FREE
➤ Displays the total amount of free and used physical and swap
memory in the system, as well as the buffers and caches used
by the kernel.
➤ Check available memory and swap usage
➤ Information is gathered by parsing /proc/meminfo
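Since the numbers come from /proc/meminfo, the two values worth checking can also be pulled out directly. This is a rough sketch, not how free itself is implemented: the function name mem_summary and the output format are invented for the example, and it assumes the kernel exposes the MemAvailable field.

```shell
# mem_summary [FILE]: report available memory and used swap in KiB.
# FILE defaults to /proc/meminfo; the argument exists so the function can be
# tried on a saved copy. Assumes the MemAvailable field is present.
mem_summary() {
    awk '
        /^MemTotal:/     { total  = $2 }
        /^MemAvailable:/ { avail  = $2 }
        /^SwapTotal:/    { stotal = $2 }
        /^SwapFree:/     { sfree  = $2 }
        END {
            pct = (total ? 100 * avail / total : 0)
            printf "available_kib=%d (%.0f%% of RAM) swap_used_kib=%d\n", avail, pct, stotal - sfree
        }
    ' "${1:-/proc/meminfo}"
}
```

Run mem_summary with no argument to read the live system.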
8. TOP
➤ The top utility provides the same information as “uptime”
and “free”, and it also shows which processes are consuming CPU.
➤ Short-lived processes can be missing entirely!
9. TOP (CONT’D)
➤ us, user: time running un-niced user processes
➤ sy, system: time running kernel processes
➤ ni, nice: time running niced user processes
➤ id, idle: time spent in the kernel idle handler
➤ wa, IO-wait: time waiting for I/O completion
➤ hi: time spent servicing hardware interrupts
➤ si: time spent servicing software interrupts
➤ st: time stolen from this vm by the hypervisor
10. TOP (CONT’D)
➤ PID: Process Id
➤ PR: The scheduling priority of the process as seen by the kernel (“rt” means real-time scheduling).
➤ NI: Nice value, a way of adjusting your process' priority. The highest priority is -20 and the lowest is 19.
➤ VIRT: Virtual Memory Size (KiB)
➤ RES: Resident/non-swapped Memory Size (KiB)
➤ SHR: Shared Memory Size (KiB)
➤ S: Process Status ('R' = running, 'S' = sleeping, 'Z' = zombie)
➤ TIME+: Total CPU time the task has used since it started.
➤ COMMAND: Start top with the -c flag to see the full command line that launched the
process
12. VMSTAT
➤ vmstat reports information about processes, memory, paging,
block IO, disks and CPU activity.
➤ The first line of output shows the averages since the last
reboot.
13. VMSTAT (CONT’D)
➤ If r (number of runnable processes) is generally higher
than the number of CPUs, there is possibly a CPU bottleneck.
➤ If si + so (swap-ins and swap-outs) are not zero, your
system probably needs more memory.
➤ If the wa (time waiting for I/O) column is high, there is
possibly a disk bottleneck.
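These three rules of thumb can be applied to vmstat output automatically. The following is a sketch under stated assumptions: the function name check_vmstat is hypothetical, the column positions (r = 1, si = 7, so = 8, wa = 16) assume the default vmstat layout, and the 20% threshold for wa is an arbitrary default, not a recommendation.

```shell
# check_vmstat NCPU [WA_MAX]: read vmstat output on stdin and print a warning
# for every sample that trips one of the rules of thumb.
# Assumptions: default vmstat column layout; WA_MAX (default 20) is arbitrary.
check_vmstat() {
    awk -v ncpu="$1" -v wamax="${2:-20}" '
        $1 !~ /^[0-9]+$/ { next }      # skip the two header lines
        { n++ }
        n == 1 { next }                # first data line is the average since boot
        { s = n - 1 }
        $1 > ncpu    { printf "sample %d: r=%d exceeds %d CPUs\n", s, $1, ncpu }
        $7 + $8 > 0  { printf "sample %d: swapping (si=%d so=%d)\n", s, $7, $8 }
        $16 > wamax  { printf "sample %d: wa=%d%%\n", s, $16 }
    '
}
```

A typical call would be: vmstat 1 10 | check_vmstat "$(nproc)". No output means none of the samples tripped a rule.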
14. IOSTAT
➤ iostat shows CPU and I/O statistics for devices and partitions.
iostat -x 1 100
➤ avgqu-sz: The average queue length of the requests that were
issued to the device. Higher numbers may indicate saturation!
➤ await: The average time (in milliseconds) for I/O requests
issued to the device to be served, including time spent in the queue.
15. MPSTAT
➤ The mpstat command reports activity for each available
processor:
mpstat -P ALL 1 100
➤ Check for an imbalance. If some CPUs are busier than
others, there could be a single-threaded application.
17. SAR (SYSTEM ACTIVITY REPORT)
➤ sar displays CPU, memory, disk I/O, and network usage, both
current and historical.
➤ It reads historical data from the “/var/log/sa/saXX” file. XX is
the day of the month.
sar -f /var/log/sa/sa16
sar -f /var/log/sa/sa16 -s 07:00:00
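Because the file name only depends on the day of the month, it can be derived from a date. A small sketch, assuming GNU date (for the -d option) and the /var/log/sa location mentioned above; the function name sa_file is invented for this example.

```shell
# sa_file [DATE]: print the sar data file for DATE (default: today).
# DATE is anything GNU date -d understands, e.g. "yesterday" or "2024-03-16".
sa_file() {
    printf '/var/log/sa/sa%s\n' "$(date -d "${1:-today}" +%d)"
}
```

For example, sar -f "$(sa_file yesterday)" -s 07:00:00 would show yesterday's activity starting from 07:00, provided that file still exists on the system.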
18. SAR (CONT’D)
➤ sar can be used like mpstat:
sar -P ALL 1 100
sar -P ALL -f /var/log/sa/sa16
19. SAR (CONT’D)
➤ sar can be used like iostat:
sar -p -d 1 100
➤ Device names are pretty-printed if option -p is used.
20. SAR (CONT’D)
➤ You can use sar to check network load and errors.
sar -n DEV,EDEV 1 100
➤ Possible keywords: DEV, EDEV, NFS, NFSD, SOCK, IP, EIP, ICMP,
EICMP, TCP, ETCP, UDP, SOCK6, IP6, EIP6, ICMP6, EICMP6 and
UDP6
21. DMESG
➤ dmesg is used to examine the kernel ring buffer. It’s a good
place to start checking if there’s any error on the system:
dmesg -T | tail -n 50
23. PERF
➤ perf is a performance analysis tool for Linux, available since
Linux kernel version 2.6.31.
perf record -p XXXX sleep X
perf record program_name
perf report
➤ Performance counter summaries, including IPC:
perf stat program_name
➤ root can give access to regular users:
echo -1 > /proc/sys/kernel/perf_event_paranoid
perf top
25. STRACE
➤ strace records the system calls and the signals received by a
process.
-p: attach to a running process
-e trace=CLASS: trace only one class of system calls; CLASS is one of
file, process, network, signal, ipc, desc, memory
-o: write output to a file
-f: trace child processes (fork)
-tt: include time info at the beginning of each line
-c: report a summary of time, calls, and errors for each system
call
27. LTRACE
➤ ltrace records the dynamic library calls and the signals received by
a process.
➤ Its use is very similar to strace.
-p: attach to a running process
-o: write output to a file
-f: trace child processes (fork)
-e: filter which library calls to trace, {[+-][symbol_pattern][@library_pattern]}
-c: count time and calls for each library call and report a summary
-tt: include time info at the beginning of each line
-S: Display system calls as well as library calls
28. LTRACE (CONT’D)
➤ Sample output of ltrace (tracing oracle log writer):
ltrace -tt -e pwrite64 -p 3582
ls -l /proc/3582/fd/25[89]
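The file descriptor numbers seen in a trace can be mapped back to file names through /proc, which is what the ls command above does. The helper below is a minimal sketch of the same lookup; the name fd_targets is hypothetical, and it only works for processes you are allowed to inspect (your own, unless you are root).

```shell
# fd_targets PID FD...: print "FD -> file" for each descriptor of process PID.
# Each entry in /proc/PID/fd is a symbolic link to the open file.
fd_targets() {
    pid=$1
    shift
    for fd in "$@"; do
        printf '%s -> %s\n' "$fd" "$(readlink "/proc/$pid/fd/$fd")"
    done
}
```

With the PID and descriptors from the example above, the call would be: fd_targets 3582 258 259.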
29. PSTACK
➤ pstack attaches to an active process and prints out an
execution stack trace.
➤ You may want to check Tanel Poder’s “Advanced Oracle
Troubleshooting Guide, Part 6: Understanding Oracle
execution plans with os_explain”
31. FILE
➤ The file tool is used to determine the type of a file.
file sqlplus oracle dbca
➤ It uses the magic signature file /usr/share/misc/magic
32. DD
➤ dd can copy from a file/device to another file/device.
➤ Be careful with the “conv” parameter. When writing into an
existing file, set it to “notrunc”; otherwise dd truncates the
output file.
dd if=/dev/zero of=sample01.dbf bs=8192 seek=132 conv=notrunc count=1
dd if=/dev/random of=/dev/null
dd if=/dev/zero of=/dev/sdc1 count=1
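The effect of conv=notrunc is easy to see on scratch files. The sketch below is an illustration only: the function name notrunc_demo and the file names keep.dbf and cut.dbf are invented, and everything happens in a temporary directory. It creates two 10-block files, overwrites block 3 in each (once with conv=notrunc, once without) and prints both resulting sizes in bytes.

```shell
# notrunc_demo: show how conv=notrunc changes what dd does to an existing file.
# Prints the size of the file written with notrunc, then the one without.
notrunc_demo() {
    dir=$(mktemp -d) || return 1
    # Two identical files of 10 blocks x 8192 bytes.
    for f in keep.dbf cut.dbf; do
        dd if=/dev/zero of="$dir/$f" bs=8192 count=10 2>/dev/null
    done
    # Overwrite one block at offset 3 blocks, with and without notrunc.
    dd if=/dev/zero of="$dir/keep.dbf" bs=8192 seek=3 count=1 conv=notrunc 2>/dev/null
    dd if=/dev/zero of="$dir/cut.dbf" bs=8192 seek=3 count=1 2>/dev/null
    echo "$(( $(wc -c < "$dir/keep.dbf") )) $(( $(wc -c < "$dir/cut.dbf") ))"
    rm -rf "$dir"
}
```

With GNU dd the first file keeps its full size, while the second one is cut off right after the block that was written, so the blocks behind it are lost.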
35. FUSER
➤ fuser displays the PIDs of processes using the specified files
or file systems.
fuser -u *
➤ fuser can also send signals (-l to list signals, -k to kill
processes)
36. LSOF
➤ lsof lists all open files belonging to all active processes.
lsof *
➤ You can list all open files belonging to a user:
lsof -uoracle
37. IPCS
➤ ipcs provides information on the inter-process communication
facilities such as shared memory segments, semaphore sets
and message queues.
ipcs
38. LDD
➤ ldd prints the shared libraries required by each program or
shared library specified on the command line.
ldd program_name/library_name