SlideShare una empresa de Scribd logo
1 de 29
Descargar para leer sin conexión
Brought to you by
OSNoise Tracer: Who Is
Stealing My CPU Time?
Daniel Bristot de Oliveira, Ph.D.
Principal Software Engineer at
Daniel Bristot de Oliveira
Principal Software Engineer at Red Hat
■ Kernel developer with interest in RT/RV theoretical aspects
■ I am here to share some research that landed into Linux
■ Post-doc researcher at Scuola Superiore Sant'Anna
■ AFK: Photography for mental health, cycling for body health
I am a regular user, how can I see
who is stealing my cpu time?
Use top
top - 18:22:44 up 6 days, 10:05, 1 user, load average: 1.06, 0.95, 0.77
Tasks: 341 total, 3 running, 338 sleeping, 0 stopped, 0 zombie
%Cpu(s): 10.1 us, 2.9 sy, 0.0 ni, 85.8 id, 0.1 wa, 0.6 hi, 0.4 si, 0.0 st
MiB Mem : 15765.3 total, 1518.5 free, 5059.2 used, 9187.5 buff/cache
MiB Swap: 8192.0 total, 8179.7 free, 12.2 used. 8544.4 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2156 bristot 20 0 5582816 304464 109084 S 25.7 1.9 125:07.16 gnome-shell
281950 bristot 20 0 20.7g 383492 187800 S 19.1 2.4 2:15.53 chrome
281405 bristot 20 0 16.9g 372696 215632 S 18.8 2.3 3:23.05 chrome
281455 bristot 20 0 17.3g 196280 128200 S 18.5 1.2 3:11.31 chrome
1961 bristot 20 0 1365748 200180 138840 R 10.2 1.2 104:28.12 Xorg
286778 bristot 20 0 20.6g 152708 106008 S 2.3 0.9 0:19.14 chrome
280507 bristot 20 0 767428 53824 37708 R 2.0 0.3 0:07.24 gnome-terminal-
281456 bristot 20 0 16.5g 121220 91820 S 1.7 0.8 0:41.24 chrome
Brought to you by
Daniel Bristot de Oliveira
bristot@kernel.org
@bristot
That is it!?
Wait, there are still some slides
for the non-regular users.
Introduction
What is OS Noise?
■ The Operating Systems Noise (OS Noise) is a well defined High
Performance Computing (HPC) metric
■ It is the amount of interference experienced by an application due to
operating system activities
■ It is generally a fine grained metric
HPC Workload
■ Generally, HPC workloads are composed of parallel jobs
■ The system is configured with CPUs dedicated to the jobs
■ A dispatcher lunches jobs to these CPUS and wait for completion.
DISPATCHER
Job 2
Job 3
...
Job n
HOUSEKEEP
Job 1
Done
Wait
HPC Workload
■ The problem with OS Noise is that the OS interference on a
single job can cause a delay in the entire task:
DISPATCHER
Job 2
Job 3
...
Job n
HOUSEKEEP
Job 1
Done
Wait
HPC meets Low Latency
■ Low latency communication are touching the sub millisecond range
● Allowing low latency services in the cloud
■ Many new options enabled by 5G
■ And Linux is becoming part of the core of the network with of NFV
■ This work is not parallel like regular HPC, but serial among the hops.
● The effect of OS Noise are cumulative in the round-trip time
■ Many providers request latency in the order of microseconds.
the state-of-art
How OS Noise is measured today?
■ A tool in user-space reads the time in a loop
● Computes the delta between two time reads
● Report each delta > threshold as a "jitter" or "latency"
■ For the bugging, the user needs to setup a set of tracing events
● The user-space tool stops the trace when hitting a "spike"
■ Human interpretation of the trace
Problems with the current approach
■ The trace and the benchmark tool are not synchronized
● This leaves gaps for interpretation and "doubts"
● Requires the trace of multiple events - to be interpreted by a human
● Too much room for speculation
■ There is no clear definition of the metric
● And so no clear method to debug it
■ Other tools are required for hw/virtualization induced noise:
● most notably hwlat detector
How can these problems be solved?
■ Improve the information given by the measuring tool
● Informing reasons for a given "noise" occurrence
■ Making the workload and the trace to be in sync
● The workload and the trace needs to "atomically" in sync
■ Tracing automation
● Define/standardize the most essential information
● Reduce the amount of events passed to the user (to reduce overhead)
● Do the common interpretation before "printing" the trace
osnoise tracer
osnoise tracer
■ Osnoise is a kernel tracer that also dispatches the workload
● The workload runs in kernel
■ The workload and the trace are synchronized
● Likewise other tools, osnoise measures the time delta
● But it also measures the amount of interference from OS Operations
Enabling osnoise
[root@f32 ~]# cd /sys/kernel/tracing/
[root@f32 tracing]# echo osnoise > current_tracer
[root@f32 tracing]# cat trace
osnoise tracer output
_-----=> irqs-off
/ _----=> need-resched
| / _---=> hardirq/softirq
|| / _--=> preempt-depth MAX
|| / SINGLE Interference counters:
|||| RUNTIME NOISE % OF CPU NOISE +----------------------------+
TASK-PID CPU# |||| TIMESTAMP IN US IN US AVAILABLE IN US HW NMI IRQ SIRQ THREAD
| | | |||| | | | | | | | | | |
<...>-859 [000] .... 81.637220: 1000000 190 99.98100 9 18 0 1007 18 1
<...>-860 [001] .... 81.638154: 1000000 656 99.93440 74 23 0 1006 16 3
<...>-861 [002] .... 81.638193: 1000000 5675 99.43250 202 6 0 1013 25 21
<...>-862 [003] .... 81.638242: 1000000 125 99.98750 45 1 0 1011 23 0
<...>-863 [004] .... 81.638260: 1000000 1721 99.82790 168 7 0 1002 49 41
<...>-864 [005] .... 81.638286: 1000000 263 99.97370 57 6 0 1006 26 2
<...>-865 [006] .... 81.638302: 1000000 109 99.98910 21 3 0 1006 18 1
<...>-866 [007] .... 81.638326: 1000000 7816 99.21840 107 8 0 1016 39 19
osnoise tracer config
■ Configuration files inside /sys/kernel/trace/osnoise
● cpus: CPUs at which a osnoise thread will execute.
● period_us: the period of the osnoise thread.
● runtime_us: how long an osnoise thread will look for noise in the period
● stop_tracing_us: stop the system tracing if a single noise is >= than set here
● stop_tracing_total_us: stop the system tracing if total noise is >= than set here
■ /sys/kernel/trace/tracing_threshold
● The minimum delta between two time() reads to be considered as noise, in us.
● When set to 0, the default value will will be used, which is currently 5 us.
Finding sources of noise
What can steal your cpu time?
■ Characterization of osnoise:
● Any sort of task tha interference (preempt) the osnoise workload
■ Linux task abstractions:
● NMI
● IRQs
● Softirqs
● Threads
■ But also the hardware can interfere on your task
● SMIs
● VMs
■ The osnoise: tracepoints process all the data in kernel, in sync with the tracer
● To reduce overhead when enabled
● No overhead when disabled
■ The events are:
● osnoise:nmi_noise: noise from NMI, including the duration.
● osnoise:irq_noise: noise from an IRQ, including the duration.
● osnoise:softirq_noise: noise from a SoftIRQ, including the duration.
● osnoise:thread_noise: noise from a thread, including the duration.
● osnoise:sample_threshold: printed anytime a noise is found, including the $ of interferences
Osnoise tracepoints
osnoise tracer output
[root@f32 ~]# cd /sys/kernel/tracing/
[root@f32 tracing]# echo osnoise > current_tracer
[root@f32 tracing]# echo osnoise > set_event
[root@f32 tracing]# echo 8 > osnoise/stop_tracing_us
[root@f32 tracing]# cat trace
[...]
osnoise/8-960 [007] 5789.857530: irq_noise: local_timer:236 start 5789.857527123 duration 1867 ns
osnoise/8-961 [008] 5789.857532: irq_noise: local_timer:236 start 5789.857529929 duration 1845 ns
osnoise/8-961 [008] 5789.858408: irq_noise: local_timer:236 start 5789.858404871 duration 2848 ns
migration/8-54 [008] 5789.858413: thread_noise: migration/8:54 start 5789.858409300 duration 3068 ns
osnoise/8-961 [008] 5789.858413: sample_threshold: start 5789.858404555 duration 8812 ns
interferences 2
■ osnoise tracks all sources of OS Noise
■ osnoise computes the delta time and the interference counter on every loop
■ When the interference counter == 0:
● The cause of the noise is from outside the OS
● It is computed as hardware, like hwlat detector does
● The hardware can be either physical or virtual VMs
How about hardware noise?
hardware counter output
_-----=> irqs-off
/ _----=> need-resched
| / _---=> hardirq/softirq
|| / _--=> preempt-depth MAX
|| / SINGLE Interference counters:
|||| RUNTIME NOISE % OF CPU NOISE +----------------------------+
TASK-PID CPU# |||| TIMESTAMP IN US IN US AVAILABLE IN US HW NMI IRQ SIRQ THREAD
| | | |||| | | | | | | | | | |
<...>-859 [000] .... 81.637220: 1000000 190 99.98100 9 18 0 1007 18 1
<...>-860 [001] .... 81.638154: 1000000 656 99.93440 74 23 0 1006 16 3
<...>-861 [002] .... 81.638193: 1000000 5675 99.43250 202 6 0 1013 25 21
<...>-862 [003] .... 81.638242: 1000000 125 99.98750 45 1 0 1011 23 0
<...>-863 [004] .... 81.638260: 1000000 1721 99.82790 168 7 0 1002 49 41
<...>-864 [005] .... 81.638286: 1000000 263 99.97370 57 6 0 1006 26 2
<...>-865 [006] .... 81.638302: 1000000 109 99.98910 21 3 0 1006 18 1
<...>-866 [007] .... 81.638326: 1000000 7816 99.21840 107 8 0 1016 39 19
hardware noise output
[root@f32 ~]# cd /sys/kernel/tracing/
[root@f32 tracing]# echo osnoise > current_tracer
[root@f32 tracing]# echo osnoise > set_event
[root@f32 tracing]# echo 8 > osnoise/stop_tracing_us
[root@f32 tracing]# cat trace
[...]
osnoise/0-713 [000] 66.570788: irq_noise: local_timer:236 start 66.570786364 duration 1593 ns
osnoise/2-715 [002] 66.570788: irq_noise: local_timer:236 start 66.570786364 duration 1628 ns
osnoise/4-717 [004] 66.570788: irq_noise: local_timer:236 start 66.570786377 duration 1568 ns
osnoise/0-713 [000] 66.570871: sample_threshold: start 66.570862574 duration 8038 ns interference 0
osnoise/3-716 [003] 66.571788: irq_noise: local_timer:236 start 66.571786373 duration 1555 ns
osnoise/7-720 [007] 66.571788: irq_noise: local_timer:236 start 66.571786396 duration 1536 ns
■ osnoise tracer puts the workload and the tracer in a single tool
■ Provides information and tracing that points to the root cause of OS Noise
● Processing the data to reduce the overhead to the minimum possible
■ It can also be used to detect hardware latency
■ The timerlat tracer is osnoise's sibling for interrupt based latency.
■ It is part of the kernel and is enabled on recent Fedora/CentOS/Red Hat
■ rtla tool adds an intuitive interface for osnoise tracer
Final thoughts
Brought to you by
Daniel Bristot de Oliveira
bristot@kernel.org
@bristot

Más contenido relacionado

La actualidad más candente

La actualidad más candente (20)

semaphore & mutex.pdf
semaphore & mutex.pdfsemaphore & mutex.pdf
semaphore & mutex.pdf
 
Anatomy of a Container: Namespaces, cgroups & Some Filesystem Magic - LinuxCon
Anatomy of a Container: Namespaces, cgroups & Some Filesystem Magic - LinuxConAnatomy of a Container: Namespaces, cgroups & Some Filesystem Magic - LinuxCon
Anatomy of a Container: Namespaces, cgroups & Some Filesystem Magic - LinuxCon
 
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtKernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
 
Linux Network Stack
Linux Network StackLinux Network Stack
Linux Network Stack
 
Meet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracingMeet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracing
 
Monitoring MySQL with DTrace/SystemTap
Monitoring MySQL with DTrace/SystemTapMonitoring MySQL with DTrace/SystemTap
Monitoring MySQL with DTrace/SystemTap
 
Ansible roles done right
Ansible roles done rightAnsible roles done right
Ansible roles done right
 
Effective service and resource management with systemd
Effective service and resource management with systemdEffective service and resource management with systemd
Effective service and resource management with systemd
 
Linux Kernel - Virtual File System
Linux Kernel - Virtual File SystemLinux Kernel - Virtual File System
Linux Kernel - Virtual File System
 
Linux Networking Explained
Linux Networking ExplainedLinux Networking Explained
Linux Networking Explained
 
Linux crontab
Linux crontabLinux crontab
Linux crontab
 
spinlock.pdf
spinlock.pdfspinlock.pdf
spinlock.pdf
 
Kernel Recipes 2019 - ftrace: Where modifying a running kernel all started
Kernel Recipes 2019 - ftrace: Where modifying a running kernel all startedKernel Recipes 2019 - ftrace: Where modifying a running kernel all started
Kernel Recipes 2019 - ftrace: Where modifying a running kernel all started
 
Kernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at NetflixKernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at Netflix
 
BKK16-317 How to generate power models for EAS and IPA
BKK16-317 How to generate power models for EAS and IPABKK16-317 How to generate power models for EAS and IPA
BKK16-317 How to generate power models for EAS and IPA
 
YOW2020 Linux Systems Performance
YOW2020 Linux Systems PerformanceYOW2020 Linux Systems Performance
YOW2020 Linux Systems Performance
 
LCU14-410: How to build an Energy Model for your SoC
LCU14-410: How to build an Energy Model for your SoCLCU14-410: How to build an Energy Model for your SoC
LCU14-410: How to build an Energy Model for your SoC
 
Slab Allocator in Linux Kernel
Slab Allocator in Linux KernelSlab Allocator in Linux Kernel
Slab Allocator in Linux Kernel
 
eBPF Perf Tools 2019
eBPF Perf Tools 2019eBPF Perf Tools 2019
eBPF Perf Tools 2019
 
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...
 

Similar a OSNoise Tracer: Who Is Stealing My CPU Time?

Understanding Performance with DTrace
Understanding Performance with DTraceUnderstanding Performance with DTrace
Understanding Performance with DTrace
ahl0003
 
Fine grained monitoring
Fine grained monitoringFine grained monitoring
Fine grained monitoring
Iben Rodriguez
 
Csw2016 wheeler barksdale-gruskovnjak-execute_mypacket
Csw2016 wheeler barksdale-gruskovnjak-execute_mypacketCsw2016 wheeler barksdale-gruskovnjak-execute_mypacket
Csw2016 wheeler barksdale-gruskovnjak-execute_mypacket
CanSecWest
 
Columnar processing for SQL-on-Hadoop: The best is yet to come
Columnar processing for SQL-on-Hadoop: The best is yet to comeColumnar processing for SQL-on-Hadoop: The best is yet to come
Columnar processing for SQL-on-Hadoop: The best is yet to come
Wang Zuo
 

Similar a OSNoise Tracer: Who Is Stealing My CPU Time? (20)

Linux Performance Tools
Linux Performance ToolsLinux Performance Tools
Linux Performance Tools
 
Linux Systems Performance 2016
Linux Systems Performance 2016Linux Systems Performance 2016
Linux Systems Performance 2016
 
Analyzing OS X Systems Performance with the USE Method
Analyzing OS X Systems Performance with the USE MethodAnalyzing OS X Systems Performance with the USE Method
Analyzing OS X Systems Performance with the USE Method
 
Understanding Performance with DTrace
Understanding Performance with DTraceUnderstanding Performance with DTrace
Understanding Performance with DTrace
 
Linux Performance Tools 2014
Linux Performance Tools 2014Linux Performance Tools 2014
Linux Performance Tools 2014
 
Embedded Recipes 2018 - Finding sources of Latency In your system - Steven Ro...
Embedded Recipes 2018 - Finding sources of Latency In your system - Steven Ro...Embedded Recipes 2018 - Finding sources of Latency In your system - Steven Ro...
Embedded Recipes 2018 - Finding sources of Latency In your system - Steven Ro...
 
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 1
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 1Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 1
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 1
 
Debugging linux issues with eBPF
Debugging linux issues with eBPFDebugging linux issues with eBPF
Debugging linux issues with eBPF
 
Fine grained monitoring
Fine grained monitoringFine grained monitoring
Fine grained monitoring
 
Csw2016 wheeler barksdale-gruskovnjak-execute_mypacket
Csw2016 wheeler barksdale-gruskovnjak-execute_mypacketCsw2016 wheeler barksdale-gruskovnjak-execute_mypacket
Csw2016 wheeler barksdale-gruskovnjak-execute_mypacket
 
test
testtest
test
 
Linux 系統管理與安全:進階系統管理系統防駭與資訊安全
Linux 系統管理與安全:進階系統管理系統防駭與資訊安全Linux 系統管理與安全:進階系統管理系統防駭與資訊安全
Linux 系統管理與安全:進階系統管理系統防駭與資訊安全
 
Nsd, il tuo compagno di viaggio quando Domino va in crash
Nsd, il tuo compagno di viaggio quando Domino va in crashNsd, il tuo compagno di viaggio quando Domino va in crash
Nsd, il tuo compagno di viaggio quando Domino va in crash
 
MeetBSD2014 Performance Analysis
MeetBSD2014 Performance AnalysisMeetBSD2014 Performance Analysis
MeetBSD2014 Performance Analysis
 
Columnar processing for SQL-on-Hadoop: The best is yet to come
Columnar processing for SQL-on-Hadoop: The best is yet to comeColumnar processing for SQL-on-Hadoop: The best is yet to come
Columnar processing for SQL-on-Hadoop: The best is yet to come
 
Performance Monitoring: Understanding Your Scylla Cluster
Performance Monitoring: Understanding Your Scylla ClusterPerformance Monitoring: Understanding Your Scylla Cluster
Performance Monitoring: Understanding Your Scylla Cluster
 
Broken Performance Tools
Broken Performance ToolsBroken Performance Tools
Broken Performance Tools
 
Analyze database system using a 3 d method
Analyze database system using a 3 d methodAnalyze database system using a 3 d method
Analyze database system using a 3 d method
 
Linux Kernel Crashdump
Linux Kernel CrashdumpLinux Kernel Crashdump
Linux Kernel Crashdump
 
When the OS gets in the way
When the OS gets in the wayWhen the OS gets in the way
When the OS gets in the way
 

Más de ScyllaDB

Más de ScyllaDB (20)

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQL
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & Pitfalls
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaDatabase Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
 
Replacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBReplacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDB
 
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityPowering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
 
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
 
Getting the most out of ScyllaDB
Getting the most out of ScyllaDBGetting the most out of ScyllaDB
Getting the most out of ScyllaDB
 
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationNoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
 
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsNoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
 
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesNoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
 
ScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB Virtual Workshop
ScyllaDB Virtual Workshop
 
DBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsDBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & Tradeoffs
 
Build Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBBuild Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDB
 
NoSQL Data Modeling 101
NoSQL Data Modeling 101NoSQL Data Modeling 101
NoSQL Data Modeling 101
 

Último

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 

Último (20)

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

OSNoise Tracer: Who Is Stealing My CPU Time?

  • 1. Brought to you by OSNoise Tracer: Who Is Stealing My CPU Time? Daniel Bristot de Oliveira, Ph.D. Principal Software Engineer at
  • 2. Daniel Bristot de Oliveira Principal Software Engineer at Red Hat ■ Kernel developer with interest in RT/RV theoretical aspects ■ I am here to share some research that landed into Linux ■ Post-doc researcher at Scuola Superiore Sant'Anna ■ AFK: Photography for mental health, cycling for body health
  • 3. I am a regular user, how can I see who is stealing my cpu time?
  • 4. Use top top - 18:22:44 up 6 days, 10:05, 1 user, load average: 1.06, 0.95, 0.77 Tasks: 341 total, 3 running, 338 sleeping, 0 stopped, 0 zombie %Cpu(s): 10.1 us, 2.9 sy, 0.0 ni, 85.8 id, 0.1 wa, 0.6 hi, 0.4 si, 0.0 st MiB Mem : 15765.3 total, 1518.5 free, 5059.2 used, 9187.5 buff/cache MiB Swap: 8192.0 total, 8179.7 free, 12.2 used. 8544.4 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 2156 bristot 20 0 5582816 304464 109084 S 25.7 1.9 125:07.16 gnome-shell 281950 bristot 20 0 20.7g 383492 187800 S 19.1 2.4 2:15.53 chrome 281405 bristot 20 0 16.9g 372696 215632 S 18.8 2.3 3:23.05 chrome 281455 bristot 20 0 17.3g 196280 128200 S 18.5 1.2 3:11.31 chrome 1961 bristot 20 0 1365748 200180 138840 R 10.2 1.2 104:28.12 Xorg 286778 bristot 20 0 20.6g 152708 106008 S 2.3 0.9 0:19.14 chrome 280507 bristot 20 0 767428 53824 37708 R 2.0 0.3 0:07.24 gnome-terminal- 281456 bristot 20 0 16.5g 121220 91820 S 1.7 0.8 0:41.24 chrome
  • 5. Brought to you by Daniel Bristot de Oliveira bristot@kernel.org @bristot That is it!?
  • 6. Wait, there are still some slides for the non-regular users.
  • 8. What is OS Noise? ■ The Operating Systems Noise (OS Noise) is a well defined High Performance Computing (HPC) metric ■ It is the amount of interference experienced by an application due to operating system activities ■ It is generally a fine grained metric
  • 9. HPC Workload ■ Generally, HPC workloads are composed of parallel jobs ■ The system is configured with CPUs dedicated to the jobs ■ A dispatcher lunches jobs to these CPUS and wait for completion. DISPATCHER Job 2 Job 3 ... Job n HOUSEKEEP Job 1 Done Wait
  • 10. HPC Workload ■ The problem with OS Noise is that the OS interference on a single job can cause a delay in the entire task: DISPATCHER Job 2 Job 3 ... Job n HOUSEKEEP Job 1 Done Wait
  • 11. HPC meets Low Latency ■ Low latency communication are touching the sub millisecond range ● Allowing low latency services in the cloud ■ Many new options enabled by 5G ■ And Linux is becoming part of the core of the network with of NFV ■ This work is not parallel like regular HPC, but serial among the hops. ● The effect of OS Noise are cumulative in the round-trip time ■ Many providers request latency in the order of microseconds.
  • 13. How OS Noise is measured today? ■ A tool in user-space reads the time in a loop ● Computes the delta between two time reads ● Report each delta > threshold as a "jitter" or "latency" ■ For the bugging, the user needs to setup a set of tracing events ● The user-space tool stops the trace when hitting a "spike" ■ Human interpretation of the trace
  • 14. Problems with the current approach ■ The trace and the benchmark tool are not synchronized ● This leaves gaps for interpretation and "doubts" ● Requires the trace of multiple events - to be interpreted by a human ● Too much room for speculation ■ There is no clear definition of the metric ● And so no clear method to debug it ■ Other tools are required for hw/virtualization induced noise: ● most notably hwlat detector
  • 15. How can these problems be solved? ■ Improve the information given by the measuring tool ● Informing reasons for a given "noise" occurrence ■ Making the workload and the trace to be in sync ● The workload and the trace needs to "atomically" in sync ■ Tracing automation ● Define/standardize the most essential information ● Reduce the amount of events passed to the user (to reduce overhead) ● Do the common interpretation before "printing" the trace
  • 17. osnoise tracer ■ Osnoise is a kernel tracer that also dispatches the workload ● The workload runs in kernel ■ The workload and the trace are synchronized ● Likewise other tools, osnoise measures the time delta ● But it also measures the amount of interference from OS Operations
  • 18. Enabling osnoise [root@f32 ~]# cd /sys/kernel/tracing/ [root@f32 tracing]# echo osnoise > current_tracer [root@f32 tracing]# cat trace
  • 19. osnoise tracer output _-----=> irqs-off / _----=> need-resched | / _---=> hardirq/softirq || / _--=> preempt-depth MAX || / SINGLE Interference counters: |||| RUNTIME NOISE % OF CPU NOISE +----------------------------+ TASK-PID CPU# |||| TIMESTAMP IN US IN US AVAILABLE IN US HW NMI IRQ SIRQ THREAD | | | |||| | | | | | | | | | | <...>-859 [000] .... 81.637220: 1000000 190 99.98100 9 18 0 1007 18 1 <...>-860 [001] .... 81.638154: 1000000 656 99.93440 74 23 0 1006 16 3 <...>-861 [002] .... 81.638193: 1000000 5675 99.43250 202 6 0 1013 25 21 <...>-862 [003] .... 81.638242: 1000000 125 99.98750 45 1 0 1011 23 0 <...>-863 [004] .... 81.638260: 1000000 1721 99.82790 168 7 0 1002 49 41 <...>-864 [005] .... 81.638286: 1000000 263 99.97370 57 6 0 1006 26 2 <...>-865 [006] .... 81.638302: 1000000 109 99.98910 21 3 0 1006 18 1 <...>-866 [007] .... 81.638326: 1000000 7816 99.21840 107 8 0 1016 39 19
  • 20. osnoise tracer config ■ Configuration files inside /sys/kernel/trace/osnoise ● cpus: CPUs at which a osnoise thread will execute. ● period_us: the period of the osnoise thread. ● runtime_us: how long an osnoise thread will look for noise in the period ● stop_tracing_us: stop the system tracing if a single noise is >= than set here ● stop_tracing_total_us: stop the system tracing if total noise is >= than set here ■ /sys/kernel/trace/tracing_threshold ● The minimum delta between two time() reads to be considered as noise, in us. ● When set to 0, the default value will will be used, which is currently 5 us.
  • 22. What can steal your cpu time? ■ Characterization of osnoise: ● Any sort of task tha interference (preempt) the osnoise workload ■ Linux task abstractions: ● NMI ● IRQs ● Softirqs ● Threads ■ But also the hardware can interfere on your task ● SMIs ● VMs
  • 23. ■ The osnoise: tracepoints process all the data in kernel, in sync with the tracer ● To reduce overhead when enabled ● No overhead when disabled ■ The events are: ● osnoise:nmi_noise: noise from NMI, including the duration. ● osnoise:irq_noise: noise from an IRQ, including the duration. ● osnoise:softirq_noise: noise from a SoftIRQ, including the duration. ● osnoise:thread_noise: noise from a thread, including the duration. ● osnoise:sample_threshold: printed anytime a noise is found, including the $ of interferences Osnoise tracepoints
  • 24. osnoise tracer output [root@f32 ~]# cd /sys/kernel/tracing/ [root@f32 tracing]# echo osnoise > current_tracer [root@f32 tracing]# echo osnoise > set_event [root@f32 tracing]# echo 8 > osnoise/stop_tracing_us [root@f32 tracing]# cat trace [...] osnoise/8-960 [007] 5789.857530: irq_noise: local_timer:236 start 5789.857527123 duration 1867 ns osnoise/8-961 [008] 5789.857532: irq_noise: local_timer:236 start 5789.857529929 duration 1845 ns osnoise/8-961 [008] 5789.858408: irq_noise: local_timer:236 start 5789.858404871 duration 2848 ns migration/8-54 [008] 5789.858413: thread_noise: migration/8:54 start 5789.858409300 duration 3068 ns osnoise/8-961 [008] 5789.858413: sample_threshold: start 5789.858404555 duration 8812 ns interferences 2
  • 25. ■ osnoise tracks all sources of OS Noise ■ osnoise computes the delta time and the interference counter on every loop ■ When the interference counter == 0: ● The cause of the noise is from outside the OS ● It is computed as hardware, like hwlat detector does ● The hardware can be either physical or virtual VMs How about hardware noise?
  • 26. hardware counter output _-----=> irqs-off / _----=> need-resched | / _---=> hardirq/softirq || / _--=> preempt-depth MAX || / SINGLE Interference counters: |||| RUNTIME NOISE % OF CPU NOISE +----------------------------+ TASK-PID CPU# |||| TIMESTAMP IN US IN US AVAILABLE IN US HW NMI IRQ SIRQ THREAD | | | |||| | | | | | | | | | | <...>-859 [000] .... 81.637220: 1000000 190 99.98100 9 18 0 1007 18 1 <...>-860 [001] .... 81.638154: 1000000 656 99.93440 74 23 0 1006 16 3 <...>-861 [002] .... 81.638193: 1000000 5675 99.43250 202 6 0 1013 25 21 <...>-862 [003] .... 81.638242: 1000000 125 99.98750 45 1 0 1011 23 0 <...>-863 [004] .... 81.638260: 1000000 1721 99.82790 168 7 0 1002 49 41 <...>-864 [005] .... 81.638286: 1000000 263 99.97370 57 6 0 1006 26 2 <...>-865 [006] .... 81.638302: 1000000 109 99.98910 21 3 0 1006 18 1 <...>-866 [007] .... 81.638326: 1000000 7816 99.21840 107 8 0 1016 39 19
  • 27. hardware noise output [root@f32 ~]# cd /sys/kernel/tracing/ [root@f32 tracing]# echo osnoise > current_tracer [root@f32 tracing]# echo osnoise > set_event [root@f32 tracing]# echo 8 > osnoise/stop_tracing_us [root@f32 tracing]# cat trace [...] osnoise/0-713 [000] 66.570788: irq_noise: local_timer:236 start 66.570786364 duration 1593 ns osnoise/2-715 [002] 66.570788: irq_noise: local_timer:236 start 66.570786364 duration 1628 ns osnoise/4-717 [004] 66.570788: irq_noise: local_timer:236 start 66.570786377 duration 1568 ns osnoise/0-713 [000] 66.570871: sample_threshold: start 66.570862574 duration 8038 ns interference 0 osnoise/3-716 [003] 66.571788: irq_noise: local_timer:236 start 66.571786373 duration 1555 ns osnoise/7-720 [007] 66.571788: irq_noise: local_timer:236 start 66.571786396 duration 1536 ns
  • 28. ■ osnoise tracer puts the workload and the tracer in a single tool ■ Provides information and tracing that points to the root cause of OS Noise ● Processing the data to reduce the overhead to the minimum possible ■ It can also be used to detect hardware latency ■ The timerlat tracer is osnoise's sibling for interrupt based latency. ■ It is part of the kernel and is enabled on recent Fedora/CentOS/Red Hat ■ rtla tool adds an intuitive interface for osnoise tracer Final thoughts
  • 29. Brought to you by Daniel Bristot de Oliveira bristot@kernel.org @bristot