SlideShare una empresa de Scribd logo
1 de 40
Descargar para leer sin conexión
Brendan Gregg, Senior Performance Architect
Designing
Tracing
Tools
Wielding Superpowers
I'm currently developing more tracing tools (bcc/BPF)
Tool Design
•  For tool developers
•  For everyone else: what you can ask for
–  Tool templates
–  GUI visualizations
•  The following is applicable to all tracers
–  sysdig, bcc/BPF, DTrace, SystemTap, LTTng, …
Methodologies
Methodology-driven Design
•  Methodologies provide ideas for purposeful tools
•  Find/draw a functional diagram, apply methods
See: http://www.brendangregg.com/methodology.html
Operating Systems
Methodology Examples
Eg, at the syscall layer (well known & documented):
•  Workload Characterization
–  exec() or open() per-event trace (execsnoop, opensnoop)
–  connect()/accept() per-event trace (tcpconnect, tcpaccept)
–  read()/write() size histogram (one-liners)
•  Latency Analysis
–  read()/write() latency histogram (funclatency, …)
•  USE Method
–  network utilization by thread (not done yet)
–  syscall errors (fserrors, soerrors)
CLI Tracing Tools
CLI Templates
1.  Per event output
–  *snoop, *slower 0, …
2.  Filtered event output
–  *slower
3.  Interval summary
–  *stat, *top
4.  Count summary
–  *count
5.  Histogram summary
–  *dist, *latency
6.  Heatmap summary
–  spectrogram.lua, subsecoffset.lua, …
Template 1: Per Event Output
Examples: *snoop, *slower 0, …
# opensnoop
PID COMM FD ERR PATH
10085 sshd 3 0 /lib/x86_64-linux-gnu/libkeyutils.so.1
10085 sshd 3 0 /lib/x86_64-linux-gnu/libresolv.so.2
10085 sshd 3 0 /lib/x86_64-linux-gnu/libgpg-error.so.0
10085 sshd 3 0 /dev/urandom
10085 sshd -1 2 /lib/x86_64-linux-gnu/.libcrypto.so.1.0.0.hmac
10085 sshd -1 2 /proc/sys/crypto/fips_enabled
10085 sshd 3 0 /proc/filesystems
10085 sshd 3 0 /dev/null
10085 sshd 3 0 /proc/10085/fd
10085 sshd 3 0 /usr/lib/ssl/openssl.cnf
10085 sshd 3 0 /etc/gai.conf
10085 sshd 3 0 /etc/nsswitch.conf
10085 sshd 3 0 /etc/ld.so.cache
10085 sshd 3 0 /lib/x86_64-linux-gnu/libnss_compat.so.2
10085 sshd 3 0 /etc/ld.so.cache
10085 sshd 3 0 /lib/x86_64-linux-gnu/libnss_nis.so.2
[…]
Template 2: Filtered Event Output
Examples: *slower
Tools like this can also do all event output:
# sysdig -c fileslower 1
TIME PROCESS TYPE LAT(ms) FILE
2014-04-13 20:40:43.973 cksum read 2 /mnt/partial.0.0
2014-04-13 20:40:44.187 cksum read 1 /mnt/partial.0.0
2014-04-13 20:40:44.689 cksum read 2 /mnt/partial.0.0
2014-04-13 20:40:45.005 cksum read 2 /mnt/partial.0.0
2014-04-13 20:40:45.193 cksum read 1 /mnt/partial.0.0
[…]
# sysdig -c fileslower 0
TIME PROCESS TYPE LAT(ms) FILE
2014-04-13 20:59:04.414 ls read 0 /lib/x86_64-linux-gnu/librt.so.1
2014-04-13 20:59:04.414 ls read 0 /lib/x86_64-linux-gnu/libacl.so.1
2014-04-13 20:59:04.414 ls read 0 /lib/x86_64-linux-gnu/libc.so.6
2014-04-13 20:59:04.414 ls read 0 /lib/x86_64-linux-gnu/libdl.so.2
2014-04-13 20:59:04.414 ls read 0 /lib/x86_64-linux-gnu/libattr.so.1
2014-04-13 20:59:04.415 ls read 0 /proc/filesystems
2014-04-13 20:59:04.415 ls read 0 /proc/filesystems
[...]
Template 3: Interval Summary
Examples: *stat, *top
# dcstat
TIME REFS/s SLOW/s MISS/s HIT%
08:11:47: 2059 141 97 95.29
08:11:48: 79974 151 106 99.87
08:11:49: 192874 146 102 99.95
08:11:50: 2051 144 100 95.12
08:11:51: 73373 17239 17194 76.57
08:11:52: 54685 25431 25387 53.58
08:11:53: 18127 8182 8137 55.12
08:11:54: 22517 10345 10301 54.25
08:11:55: 7524 2881 2836 62.31
08:11:56: 2067 141 97 95.31
08:11:57: 2115 145 101 95.22
[…]
Template 4: Count Summary
Examples: *count
# funccount 'vfs_*'
Tracing... Ctrl-C to end.
^C
ADDR FUNC COUNT
ffffffff811efe81 vfs_create 1
ffffffff811f24a1 vfs_rename 1
ffffffff81215191 vfs_fsync_range 2
ffffffff81231df1 vfs_lock_file 30
ffffffff811e8dd1 vfs_fstatat 152
ffffffff811e8d71 vfs_fstat 154
ffffffff811e4381 vfs_write 166
ffffffff811e8c71 vfs_getattr_nosec 262
ffffffff811e8d41 vfs_getattr 262
ffffffff811e3221 vfs_open 264
ffffffff811e4251 vfs_read 470
Detaching...
Template 5: Histogram Summary
Examples: *dist, *latency
# biolatency
Tracing block device I/O... Hit Ctrl-C to end.
^C
usecs : count distribution
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 1 | |
128 -> 255 : 12 |******** |
256 -> 511 : 15 |********** |
512 -> 1023 : 43 |******************************* |
1024 -> 2047 : 52 |**************************************|
2048 -> 4095 : 47 |********************************** |
4096 -> 8191 : 52 |**************************************|
8192 -> 16383 : 36 |************************** |
16384 -> 32767 : 15 |********** |
32768 -> 65535 : 2 |* |
65536 -> 131071 : 2 |* |
Template 6: Heatmap Summary
Example: subsecoffset.lua (aka "spectrogram")
Valuable
Know what already exists, and what doesn't
Low Overhead (or documented)
•  Understand tracing internals
–  For example, sysdig's design has ~20x lower overhead than strace
(it still has overhead: test and measure to see if it's acceptable)
–  Tracing overhead is usually relative to event rate
•  Design for low overhead, and document expectations
sysdig
1. enable
Kernel
syscalls
sysdig
driver
ring
buffer
lua
program
2. async
read
3. output
Documentation
•  Good tools have 3 docs:
1.  Code comments
2.  Man page
3.  Examples file
•  Man page
–  troff, docbook, …
•  Examples file:
.TH Title heading
.SH Section heading
.IP Indented paragraph
.TP Indented paragraph with label
.B Bold
- -
common man macros (see groff_man(7))
Demonstrations of biosnoop, the Linux eBPF/bcc version.
biosnoop traces block device I/O (disk I/O), and prints a line of output
per I/O. Example:
# ./biosnoop
TIME(s) COMM PID DISK T SECTOR BYTES LAT(ms)
0.000004001 supervise 1950 xvda1 W 13092560 4096 0.74
[...]
Concise, intuitive, self-explanatory
•  Useful startup message
–  What I'm tracing, when there's output, when I'll end
•  Vigorous tooling is concise
–  No wasted text; leave less useful output for non-default options
•  Unix philosophy: do one thing and do it well
# ./iolatency
Tracing block I/O. Output every 1 seconds. Ctrl-C to end.
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 4381 |######################################|
1 -> 2 : 9 |# |
2 -> 4 : 5 |# |
4 -> 8 : 0 | |
8 -> 16 : 1 |# |
[…]
POSIX-style Arguments
# ./biolatency -h
usage: biolatency [-h] [-T] [-Q] [-m] [-D] [interval] [count]
Summarize block device I/O latency as a histogram
positional arguments:
interval output interval, in seconds
count number of outputs
optional arguments:
-h, --help show this help message and exit
-T, --timestamp include timestamp on output
-Q, --queued include OS queued time in I/O time
-m, --milliseconds millisecond histogram
-D, --disks print a histogram per disk device
examples:
./biolatency # summarize block I/O latency as a histogram
./biolatency 1 10 # print 1 second summaries, 10 times
./biolatency -mT 1 # 1s summaries, milliseconds, and timestamps
./biolatency -Q # include OS queued time in I/O time
./biolatency -D # show each disk device separately
Option Alternate Expectation
-a --all all events
-c CMD --cmd … run this command
-d SECONDS --duration … duration of tool execution
-h --help help
-i FILE --input … input file
-i SECONDS --interval … summary interval
-n name --name … this process name only
-o FILE --output … output file
-p PID --pid … this process ID only
-P --by-process per-process ID breakdown
-P PORT --port … this TCP port only
-t or -T --[no]timestamp include or exclude timestamps
-v --verbose verbose output
-x --extended, --errors extended output, or only failures
[interval [count]] - summary interval, and # of outputs
POSIX-style Arguments
Testing, Testing, Testing
•  If you can't write the workload, you can't write the tool
–  Be it 10 lines of C, some shell, or dd
–  dd if=/dev/urandom of=/dev/null bs=1k count=23
•  Known workload testing: create 23 events
•  Testing can be time consuming
–  I spend more time testing a tool than writing it
–  Automatic tool testing is a difficult problem
Example: gethostlatency
# gethostlatency
TIME PID COMM LATms HOST
06:10:24 28011 wget 90.00 www.iovisor.org
06:10:28 28127 wget 0.00 www.iovisor.org
06:10:41 28404 wget 9.00 www.netflix.com
06:10:48 28544 curl 35.00 www.netflix.com.au
06:11:10 29054 curl 31.00 www.plumgrid.com
06:11:16 29195 curl 3.00 www.facebook.com
06:11:25 29404 curl 72.00 foo
06:11:28 29475 curl 1.00 foo
Example: ext4slower
# ext4slower 1
Tracing ext4 operations slower than 1 ms
TIME COMM PID T BYTES OFF_KB LAT(ms) FILENAME
06:49:17 bash 3616 R 128 0 7.75 cksum
06:49:17 cksum 3616 R 39552 0 1.34 [
06:49:17 cksum 3616 R 96 0 5.36 2to3-2.7
06:49:17 cksum 3616 R 96 0 14.94 2to3-3.4
06:49:17 cksum 3616 R 10320 0 6.82 411toppm
06:49:17 cksum 3616 R 65536 0 4.01 a2p
06:49:17 cksum 3616 R 55400 0 8.77 ab
06:49:17 cksum 3616 R 36792 0 16.34 aclocal-1.14
06:49:17 cksum 3616 R 15008 0 19.31 acpi_listen
06:49:17 cksum 3616 R 6123 0 17.23 add-apt-
repository
06:49:17 cksum 3616 R 6280 0 18.40 addpart
06:49:17 cksum 3616 R 27696 0 2.16 addr2line
06:49:17 cksum 3616 R 58080 0 10.11 ag
06:49:17 cksum 3616 R 906 0 6.30 ec2-meta-data
[…]
Example: biolatency
# biolatency -m 1 5
Tracing block device I/O... Hit Ctrl-C to end.
msecs : count distribution
0 -> 1 : 36 |**************************************|
2 -> 3 : 1 |* |
4 -> 7 : 3 |*** |
8 -> 15 : 17 |***************** |
16 -> 31 : 33 |********************************** |
32 -> 63 : 7 |******* |
64 -> 127 : 6 |****** |
[…]
GUI Tracing Tools
GUI Visualizations
1.  Event logs
2.  Tables
3.  Line graphs
4.  Histograms
5.  Heatmaps (spectrographs)
6.  Waterfall charts
7.  Directed graphs
8.  Flame graphs
9.  Flame charts
Visualization 1: Event Logs
https://commons.wikimedia.org/wiki/File:Wireshark_screenshot.png
Visualization 2: Tables
Visualization 3: Line Graphs
http://www.paradyn.org/html/screen-shots.html
Visualization 4: Histograms
Or a density plot
Or as a frequency trail (can cascade)
Visualization 5: Heat Maps
eg, Oracle ZFS Storage Appliance Analytics (DTrace-based)
Visualization 5: Spectrograms
Visualization 6: Waterfall Charts
Visualization 7: Directed Graphs
Visualization 8: Flame Graphs
Commonly used with CPU profilers. Also useful for tracers: off-CPU time, ...
file read
from disk
directory read
from disk
pipe write
path read from disk
fstat from disk
Visualization 9: Flame Charts
Desirable Attributes
•  Valuable
–  Methodologies provide ideas for purposeful metrics
•  Documented
–  Tool tips, wikis
•  Tested
•  Real Time
•  Dashboards
–  To support methodologies
Thank You!
http://www.brendangregg.com
http://slideshare.net/brendangregg
bgregg@netflix.com
@brendangregg
References & Links:
–  http://www.brendangregg.com/heatmaps.html
–  http://www.brendangregg.com/flamegraphs.html
–  http://www.slideshare.net/brendangregg/monitorama-2015-netflix-instance-analysis

Más contenido relacionado

La actualidad más candente

EuroBSDcon 2017 System Performance Analysis Methodologies
EuroBSDcon 2017 System Performance Analysis MethodologiesEuroBSDcon 2017 System Performance Analysis Methodologies
EuroBSDcon 2017 System Performance Analysis Methodologies
Brendan Gregg
 

La actualidad más candente (20)

Broken Linux Performance Tools 2016
Broken Linux Performance Tools 2016Broken Linux Performance Tools 2016
Broken Linux Performance Tools 2016
 
A Brief History of System Calls
A Brief History of System CallsA Brief History of System Calls
A Brief History of System Calls
 
Linux Performance Tools
Linux Performance ToolsLinux Performance Tools
Linux Performance Tools
 
Velocity 2015 linux perf tools
Velocity 2015 linux perf toolsVelocity 2015 linux perf tools
Velocity 2015 linux perf tools
 
EuroBSDcon 2017 System Performance Analysis Methodologies
EuroBSDcon 2017 System Performance Analysis MethodologiesEuroBSDcon 2017 System Performance Analysis Methodologies
EuroBSDcon 2017 System Performance Analysis Methodologies
 
USENIX ATC 2017: Visualizing Performance with Flame Graphs
USENIX ATC 2017: Visualizing Performance with Flame GraphsUSENIX ATC 2017: Visualizing Performance with Flame Graphs
USENIX ATC 2017: Visualizing Performance with Flame Graphs
 
LISA2010 visualizations
LISA2010 visualizationsLISA2010 visualizations
LISA2010 visualizations
 
OSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPFOSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPF
 
Linux Performance Profiling and Monitoring
Linux Performance Profiling and MonitoringLinux Performance Profiling and Monitoring
Linux Performance Profiling and Monitoring
 
Kernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at NetflixKernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at Netflix
 
Linux Performance 2018 (PerconaLive keynote)
Linux Performance 2018 (PerconaLive keynote)Linux Performance 2018 (PerconaLive keynote)
Linux Performance 2018 (PerconaLive keynote)
 
YOW2021 Computing Performance
YOW2021 Computing PerformanceYOW2021 Computing Performance
YOW2021 Computing Performance
 
MeetBSD2014 Performance Analysis
MeetBSD2014 Performance AnalysisMeetBSD2014 Performance Analysis
MeetBSD2014 Performance Analysis
 
Linux 4.x Tracing: Performance Analysis with bcc/BPF
Linux 4.x Tracing: Performance Analysis with bcc/BPFLinux 4.x Tracing: Performance Analysis with bcc/BPF
Linux 4.x Tracing: Performance Analysis with bcc/BPF
 
SREcon 2016 Performance Checklists for SREs
SREcon 2016 Performance Checklists for SREsSREcon 2016 Performance Checklists for SREs
SREcon 2016 Performance Checklists for SREs
 
RxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance ResultsRxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance Results
 
Analyzing OS X Systems Performance with the USE Method
Analyzing OS X Systems Performance with the USE MethodAnalyzing OS X Systems Performance with the USE Method
Analyzing OS X Systems Performance with the USE Method
 
Systems@Scale 2021 BPF Performance Getting Started
Systems@Scale 2021 BPF Performance Getting StartedSystems@Scale 2021 BPF Performance Getting Started
Systems@Scale 2021 BPF Performance Getting Started
 
From DTrace to Linux
From DTrace to LinuxFrom DTrace to Linux
From DTrace to Linux
 
The New Systems Performance
The New Systems PerformanceThe New Systems Performance
The New Systems Performance
 

Similar a Designing Tracing Tools

Velocity 2017 Performance analysis superpowers with Linux eBPF
Velocity 2017 Performance analysis superpowers with Linux eBPFVelocity 2017 Performance analysis superpowers with Linux eBPF
Velocity 2017 Performance analysis superpowers with Linux eBPF
Brendan Gregg
 
USENIX ATC 2017 Performance Superpowers with Enhanced BPF
USENIX ATC 2017 Performance Superpowers with Enhanced BPFUSENIX ATC 2017 Performance Superpowers with Enhanced BPF
USENIX ATC 2017 Performance Superpowers with Enhanced BPF
Brendan Gregg
 
101 3.2 process text streams using filters
101 3.2 process text streams using filters101 3.2 process text streams using filters
101 3.2 process text streams using filters
Acácio Oliveira
 
It802 bruning
It802 bruningIt802 bruning
It802 bruning
mrbruning
 
Servers and Processes: Behavior and Analysis
Servers and Processes: Behavior and AnalysisServers and Processes: Behavior and Analysis
Servers and Processes: Behavior and Analysis
dreamwidth
 
Kernel Recipes 2017 - Performance analysis Superpowers with Linux BPF - Brend...
Kernel Recipes 2017 - Performance analysis Superpowers with Linux BPF - Brend...Kernel Recipes 2017 - Performance analysis Superpowers with Linux BPF - Brend...
Kernel Recipes 2017 - Performance analysis Superpowers with Linux BPF - Brend...
Anne Nicolas
 
Kernel Recipes 2017: Performance Analysis with BPF
Kernel Recipes 2017: Performance Analysis with BPFKernel Recipes 2017: Performance Analysis with BPF
Kernel Recipes 2017: Performance Analysis with BPF
Brendan Gregg
 
Linux or unix interview questions
Linux or unix interview questionsLinux or unix interview questions
Linux or unix interview questions
Teja Bheemanapally
 

Similar a Designing Tracing Tools (20)

Designing Tracing Tools
Designing Tracing ToolsDesigning Tracing Tools
Designing Tracing Tools
 
BPF Tools 2017
BPF Tools 2017BPF Tools 2017
BPF Tools 2017
 
bcc/BPF tools - Strategy, current tools, future challenges
bcc/BPF tools - Strategy, current tools, future challengesbcc/BPF tools - Strategy, current tools, future challenges
bcc/BPF tools - Strategy, current tools, future challenges
 
Linux Capabilities - eng - v2.1.5, compact
Linux Capabilities - eng - v2.1.5, compactLinux Capabilities - eng - v2.1.5, compact
Linux Capabilities - eng - v2.1.5, compact
 
Velocity 2017 Performance analysis superpowers with Linux eBPF
Velocity 2017 Performance analysis superpowers with Linux eBPFVelocity 2017 Performance analysis superpowers with Linux eBPF
Velocity 2017 Performance analysis superpowers with Linux eBPF
 
SOFA Tutorial
SOFA TutorialSOFA Tutorial
SOFA Tutorial
 
USENIX ATC 2017 Performance Superpowers with Enhanced BPF
USENIX ATC 2017 Performance Superpowers with Enhanced BPFUSENIX ATC 2017 Performance Superpowers with Enhanced BPF
USENIX ATC 2017 Performance Superpowers with Enhanced BPF
 
Linux Systems Performance 2016
Linux Systems Performance 2016Linux Systems Performance 2016
Linux Systems Performance 2016
 
Kafka Summit SF 2017 - One Day, One Data Hub, 100 Billion Messages: Kafka at ...
Kafka Summit SF 2017 - One Day, One Data Hub, 100 Billion Messages: Kafka at ...Kafka Summit SF 2017 - One Day, One Data Hub, 100 Billion Messages: Kafka at ...
Kafka Summit SF 2017 - One Day, One Data Hub, 100 Billion Messages: Kafka at ...
 
Open Source Systems Performance
Open Source Systems PerformanceOpen Source Systems Performance
Open Source Systems Performance
 
101 3.2 process text streams using filters
101 3.2 process text streams using filters101 3.2 process text streams using filters
101 3.2 process text streams using filters
 
It802 bruning
It802 bruningIt802 bruning
It802 bruning
 
Servers and Processes: Behavior and Analysis
Servers and Processes: Behavior and AnalysisServers and Processes: Behavior and Analysis
Servers and Processes: Behavior and Analysis
 
Debug generic process
Debug generic processDebug generic process
Debug generic process
 
Container Performance Analysis Brendan Gregg, Netflix
Container Performance Analysis Brendan Gregg, NetflixContainer Performance Analysis Brendan Gregg, Netflix
Container Performance Analysis Brendan Gregg, Netflix
 
Kernel Recipes 2017 - Performance analysis Superpowers with Linux BPF - Brend...
Kernel Recipes 2017 - Performance analysis Superpowers with Linux BPF - Brend...Kernel Recipes 2017 - Performance analysis Superpowers with Linux BPF - Brend...
Kernel Recipes 2017 - Performance analysis Superpowers with Linux BPF - Brend...
 
Kernel Recipes 2017: Performance Analysis with BPF
Kernel Recipes 2017: Performance Analysis with BPFKernel Recipes 2017: Performance Analysis with BPF
Kernel Recipes 2017: Performance Analysis with BPF
 
Linux Common Command
Linux Common CommandLinux Common Command
Linux Common Command
 
Oracle to Postgres Migration - part 2
Oracle to Postgres Migration - part 2Oracle to Postgres Migration - part 2
Oracle to Postgres Migration - part 2
 
Linux or unix interview questions
Linux or unix interview questionsLinux or unix interview questions
Linux or unix interview questions
 

Más de Brendan Gregg

Más de Brendan Gregg (19)

IntelON 2021 Processor Benchmarking
IntelON 2021 Processor BenchmarkingIntelON 2021 Processor Benchmarking
IntelON 2021 Processor Benchmarking
 
Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)
 
Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)
 
BPF Internals (eBPF)
BPF Internals (eBPF)BPF Internals (eBPF)
BPF Internals (eBPF)
 
Performance Wins with BPF: Getting Started
Performance Wins with BPF: Getting StartedPerformance Wins with BPF: Getting Started
Performance Wins with BPF: Getting Started
 
YOW2020 Linux Systems Performance
YOW2020 Linux Systems PerformanceYOW2020 Linux Systems Performance
YOW2020 Linux Systems Performance
 
re:Invent 2019 BPF Performance Analysis at Netflix
re:Invent 2019 BPF Performance Analysis at Netflixre:Invent 2019 BPF Performance Analysis at Netflix
re:Invent 2019 BPF Performance Analysis at Netflix
 
UM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of SoftwareUM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of Software
 
LISA2019 Linux Systems Performance
LISA2019 Linux Systems PerformanceLISA2019 Linux Systems Performance
LISA2019 Linux Systems Performance
 
LPC2019 BPF Tracing Tools
LPC2019 BPF Tracing ToolsLPC2019 BPF Tracing Tools
LPC2019 BPF Tracing Tools
 
LSFMM 2019 BPF Observability
LSFMM 2019 BPF ObservabilityLSFMM 2019 BPF Observability
LSFMM 2019 BPF Observability
 
YOW2018 CTO Summit: Working at netflix
YOW2018 CTO Summit: Working at netflixYOW2018 CTO Summit: Working at netflix
YOW2018 CTO Summit: Working at netflix
 
eBPF Perf Tools 2019
eBPF Perf Tools 2019eBPF Perf Tools 2019
eBPF Perf Tools 2019
 
YOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at NetflixYOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at Netflix
 
NetConf 2018 BPF Observability
NetConf 2018 BPF ObservabilityNetConf 2018 BPF Observability
NetConf 2018 BPF Observability
 
FlameScope 2018
FlameScope 2018FlameScope 2018
FlameScope 2018
 
ATO Linux Performance 2018
ATO Linux Performance 2018ATO Linux Performance 2018
ATO Linux Performance 2018
 
How Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for PerformanceHow Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for Performance
 
LISA17 Container Performance Analysis
LISA17 Container Performance AnalysisLISA17 Container Performance Analysis
LISA17 Container Performance Analysis
 

Último

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Último (20)

Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

Designing Tracing Tools

  • 1. Brendan Gregg, Senior Performance Architect Designing Tracing Tools
  • 3. I'm currently developing more tracing tools (bcc/BPF)
  • 4. Tool Design •  For tool developers •  For everyone else: what you can ask for –  Tool templates –  GUI visualizations •  The following is applicable to all tracers –  sysdig, bcc/BPF, DTrace, SystemTap, LTTng, …
  • 6. Methodology-driven Design •  Methodologies provide ideas for purposeful tools •  Find/draw a functional diagram, apply methods See: http://www.brendangregg.com/methodology.html Operating Systems
  • 7. Methodology Examples Eg, at the syscall layer (well known & documented): •  Workload Characterization –  exec() or open() per-event trace (execsnoop, opensnoop) –  connect()/accept() per-event trace (tcpconnect, tcpaccept) –  read()/write() size histogram (one-liners) •  Latency Analysis –  read()/write() latency histogram (funclatency, …) •  USE Method –  network utilization by thread (not done yet) –  syscall errors (fserrors, soerrors)
  • 9. CLI Templates 1.  Per event output –  *snoop, *slower 0, … 2.  Filtered event output –  *slower 3.  Interval summary –  *stat, *top 4.  Count summary –  *count 5.  Histogram summary –  *dist, *latency 6.  Heatmap summary –  spectrogram.lua, subsecoffset.lua, …
  • 10. Template 1: Per Event Output Examples: *snoop, *slower 0, … # opensnoop PID COMM FD ERR PATH 10085 sshd 3 0 /lib/x86_64-linux-gnu/libkeyutils.so.1 10085 sshd 3 0 /lib/x86_64-linux-gnu/libresolv.so.2 10085 sshd 3 0 /lib/x86_64-linux-gnu/libgpg-error.so.0 10085 sshd 3 0 /dev/urandom 10085 sshd -1 2 /lib/x86_64-linux-gnu/.libcrypto.so.1.0.0.hmac 10085 sshd -1 2 /proc/sys/crypto/fips_enabled 10085 sshd 3 0 /proc/filesystems 10085 sshd 3 0 /dev/null 10085 sshd 3 0 /proc/10085/fd 10085 sshd 3 0 /usr/lib/ssl/openssl.cnf 10085 sshd 3 0 /etc/gai.conf 10085 sshd 3 0 /etc/nsswitch.conf 10085 sshd 3 0 /etc/ld.so.cache 10085 sshd 3 0 /lib/x86_64-linux-gnu/libnss_compat.so.2 10085 sshd 3 0 /etc/ld.so.cache 10085 sshd 3 0 /lib/x86_64-linux-gnu/libnss_nis.so.2 […]
  • 11. Template 2: Filtered Event Output Examples: *slower Tools like this can also do all event output: # sysdig -c fileslower 1 TIME PROCESS TYPE LAT(ms) FILE 2014-04-13 20:40:43.973 cksum read 2 /mnt/partial.0.0 2014-04-13 20:40:44.187 cksum read 1 /mnt/partial.0.0 2014-04-13 20:40:44.689 cksum read 2 /mnt/partial.0.0 2014-04-13 20:40:45.005 cksum read 2 /mnt/partial.0.0 2014-04-13 20:40:45.193 cksum read 1 /mnt/partial.0.0 […] # sysdig -c fileslower 0 TIME PROCESS TYPE LAT(ms) FILE 2014-04-13 20:59:04.414 ls read 0 /lib/x86_64-linux-gnu/librt.so.1 2014-04-13 20:59:04.414 ls read 0 /lib/x86_64-linux-gnu/libacl.so.1 2014-04-13 20:59:04.414 ls read 0 /lib/x86_64-linux-gnu/libc.so.6 2014-04-13 20:59:04.414 ls read 0 /lib/x86_64-linux-gnu/libdl.so.2 2014-04-13 20:59:04.414 ls read 0 /lib/x86_64-linux-gnu/libattr.so.1 2014-04-13 20:59:04.415 ls read 0 /proc/filesystems 2014-04-13 20:59:04.415 ls read 0 /proc/filesystems [...]
  • 12. Template 3: Interval Summary Examples: *stat, *top # dcstat TIME REFS/s SLOW/s MISS/s HIT% 08:11:47: 2059 141 97 95.29 08:11:48: 79974 151 106 99.87 08:11:49: 192874 146 102 99.95 08:11:50: 2051 144 100 95.12 08:11:51: 73373 17239 17194 76.57 08:11:52: 54685 25431 25387 53.58 08:11:53: 18127 8182 8137 55.12 08:11:54: 22517 10345 10301 54.25 08:11:55: 7524 2881 2836 62.31 08:11:56: 2067 141 97 95.31 08:11:57: 2115 145 101 95.22 […]
  • 13. Template 4: Count Summary Examples: *count # funccount 'vfs_*' Tracing... Ctrl-C to end. ^C ADDR FUNC COUNT ffffffff811efe81 vfs_create 1 ffffffff811f24a1 vfs_rename 1 ffffffff81215191 vfs_fsync_range 2 ffffffff81231df1 vfs_lock_file 30 ffffffff811e8dd1 vfs_fstatat 152 ffffffff811e8d71 vfs_fstat 154 ffffffff811e4381 vfs_write 166 ffffffff811e8c71 vfs_getattr_nosec 262 ffffffff811e8d41 vfs_getattr 262 ffffffff811e3221 vfs_open 264 ffffffff811e4251 vfs_read 470 Detaching...
  • 14. Template 5: Histogram Summary Examples: *dist, *latency # biolatency Tracing block device I/O... Hit Ctrl-C to end. ^C usecs : count distribution 4 -> 7 : 0 | | 8 -> 15 : 0 | | 16 -> 31 : 0 | | 32 -> 63 : 0 | | 64 -> 127 : 1 | | 128 -> 255 : 12 |******** | 256 -> 511 : 15 |********** | 512 -> 1023 : 43 |******************************* | 1024 -> 2047 : 52 |**************************************| 2048 -> 4095 : 47 |********************************** | 4096 -> 8191 : 52 |**************************************| 8192 -> 16383 : 36 |************************** | 16384 -> 32767 : 15 |********** | 32768 -> 65535 : 2 |* | 65536 -> 131071 : 2 |* |
  • 15. Template 6: Heatmap Summary Example: subsecoffset.lua (aka "spectrogram")
  • 16.
  • 17. Valuable Know what already exists, and what doesn't
  • 18. Low Overhead (or documented) •  Understand tracing internals –  For example, sysdig's design has ~20x lower overhead than strace (it still has overhead: test and measure to see if it's acceptable) –  Tracing overhead is usually relative to event rate •  Design for low overhead, and document expectations sysdig 1. enable Kernel syscalls sysdig driver ring buffer lua program 2. async read 3. output
  • 19. Documentation •  Good tools have 3 docs: 1.  Code comments 2.  Man page 3.  Examples file •  Man page –  troff, docbook, … •  Examples file: .TH Title heading .SH Section heading .IP Indented paragraph .TP Indented paragraph with label .B Bold - - common man macros (see groff_man(7)) Demonstrations of biosnoop, the Linux eBPF/bcc version. biosnoop traces block device I/O (disk I/O), and prints a line of output per I/O. Example: # ./biosnoop TIME(s) COMM PID DISK T SECTOR BYTES LAT(ms) 0.000004001 supervise 1950 xvda1 W 13092560 4096 0.74 [...]
  • 20. Concise, intuitive, self-explanatory •  Useful startup message –  What I'm tracing, when there's output, when I'll end •  Vigorous tooling is concise –  No wasted text; leave less useful output for non-default options •  Unix philosophy: do one thing and do it well # ./iolatency Tracing block I/O. Output every 1 seconds. Ctrl-C to end. >=(ms) .. <(ms) : I/O |Distribution | 0 -> 1 : 4381 |######################################| 1 -> 2 : 9 |# | 2 -> 4 : 5 |# | 4 -> 8 : 0 | | 8 -> 16 : 1 |# | […]
  • 21. POSIX-style Arguments # ./biolatency -h usage: biolatency [-h] [-T] [-Q] [-m] [-D] [interval] [count] Summarize block device I/O latency as a histogram positional arguments: interval output interval, in seconds count number of outputs optional arguments: -h, --help show this help message and exit -T, --timestamp include timestamp on output -Q, --queued include OS queued time in I/O time -m, --milliseconds millisecond histogram -D, --disks print a histogram per disk device examples: ./biolatency # summarize block I/O latency as a histogram ./biolatency 1 10 # print 1 second summaries, 10 times ./biolatency -mT 1 # 1s summaries, milliseconds, and timestamps ./biolatency -Q # include OS queued time in I/O time ./biolatency -D # show each disk device separately
  • 22. Option Alternate Expectation -a --all all events -c CMD --cmd … run this command -d SECONDS --duration … duration of tool execution -h --help help -i FILE --input … input file -i SECONDS --interval … summary interval -n name --name … this process name only -o FILE --output … output file -p PID --pid … this process ID only -P --by-process per-process ID breakdown -P PORT --port … this TCP port only -t or -T --[no]timestamp include or exclude timestamps -v --verbose verbose output -x --extended, --errors extended output, or only failures [interval [count]] - summary interval, and # of outputs POSIX-style Arguments
  • 23. Testing, Testing, Testing •  If you can't write the workload, you can't write the tool –  Be it 10 lines of C, some shell, or dd –  dd if=/dev/urandom of=/dev/null bs=1k count=23 •  Known workload testing: create 23 events •  Testing can be time consuming –  I spend more time testing a tool than writing it –  Automatic tool testing is a difficult problem
  • 24. Example: gethostlatency # gethostlatency TIME PID COMM LATms HOST 06:10:24 28011 wget 90.00 www.iovisor.org 06:10:28 28127 wget 0.00 www.iovisor.org 06:10:41 28404 wget 9.00 www.netflix.com 06:10:48 28544 curl 35.00 www.netflix.com.au 06:11:10 29054 curl 31.00 www.plumgrid.com 06:11:16 29195 curl 3.00 www.facebook.com 06:11:25 29404 curl 72.00 foo 06:11:28 29475 curl 1.00 foo
  • 25. Example: ext4slower # ext4slower 1 Tracing ext4 operations slower than 1 ms TIME COMM PID T BYTES OFF_KB LAT(ms) FILENAME 06:49:17 bash 3616 R 128 0 7.75 cksum 06:49:17 cksum 3616 R 39552 0 1.34 [ 06:49:17 cksum 3616 R 96 0 5.36 2to3-2.7 06:49:17 cksum 3616 R 96 0 14.94 2to3-3.4 06:49:17 cksum 3616 R 10320 0 6.82 411toppm 06:49:17 cksum 3616 R 65536 0 4.01 a2p 06:49:17 cksum 3616 R 55400 0 8.77 ab 06:49:17 cksum 3616 R 36792 0 16.34 aclocal-1.14 06:49:17 cksum 3616 R 15008 0 19.31 acpi_listen 06:49:17 cksum 3616 R 6123 0 17.23 add-apt- repository 06:49:17 cksum 3616 R 6280 0 18.40 addpart 06:49:17 cksum 3616 R 27696 0 2.16 addr2line 06:49:17 cksum 3616 R 58080 0 10.11 ag 06:49:17 cksum 3616 R 906 0 6.30 ec2-meta-data […]
  • 26. Example: biolatency # biolatency -m 1 5 Tracing block device I/O... Hit Ctrl-C to end. msecs : count distribution 0 -> 1 : 36 |**************************************| 2 -> 3 : 1 |* | 4 -> 7 : 3 |*** | 8 -> 15 : 17 |***************** | 16 -> 31 : 33 |********************************** | 32 -> 63 : 7 |******* | 64 -> 127 : 6 |****** | […]
  • 28. GUI Visualizations 1.  Event logs 2.  Tables 3.  Line graphs 4.  Histograms 5.  Heatmaps (spectrographs) 6.  Waterfall charts 7.  Directed graphs 8.  Flame graphs 9.  Flame charts
  • 29. Visualization 1: Event Logs https://commons.wikimedia.org/wiki/File:Wireshark_screenshot.png
  • 31. Visualization 3: Line Graphs http://www.paradyn.org/html/screen-shots.html
  • 32. Visualization 4: Histograms Or a density plot Or as a frequency trail (can cascade)
  • 33. Visualization 5: Heat Maps eg, Oracle ZFS Storage Appliance Analytics (DTrace-based)
  • 37. Visualization 8: Flame Graphs Commonly used with CPU profilers. Also useful for tracers: off-CPU time, ... file read from disk directory read from disk pipe write path read from disk fstat from disk
  • 39. Desirable Attributes •  Valuable –  Methodologies provide ideas for purposeful metrics •  Documented –  Tool tips, wikis •  Tested •  Real Time •  Dashboards –  To support methodologies
  • 40. Thank You! http://www.brendangregg.com http://slideshare.net/brendangregg bgregg@netflix.com @brendangregg References & Links: –  http://www.brendangregg.com/heatmaps.html –  http://www.brendangregg.com/flamegraphs.html –  http://www.slideshare.net/brendangregg/monitorama-2015-netflix-instance-analysis