Andrea Righi - andrea@betterlinux.com
Linux kernel debugging techniques
Agenda
● Overview (kernel programming)
● Kernel crash taxonomy
● Debugging techniques
● Example(s)
● Q/A
What's a kernel?
● The kernel provides an abstraction layer that lets applications use the physical hardware resources
● Basic kernel facilities
  ● Process management
  ● Memory management
  ● Device management
  ● System call interface
User space
● Good for debugging (gdb)
● Lots of user-space libraries available
● Unpredictable latency (context switch, scheduler, syscall, ...)
● Overhead
● Cannot fully interact with interrupt routines
● Cannot access certain memory addresses
● Harder to share certain features with other drivers
● Reliability: user processes can be terminated on critical system events (OOM, filesystem errors, etc.)
Kernel space
● Written in C and assembly
● No ordinary debugger: debugging requires special tools (kgdb, UML, ...)
● Bugs can hang the entire system
● User memory is swappable; kernel memory cannot be swapped out
● Kernel stack is small (8K or 4K, depending on THREAD_SIZE_ORDER)
● Floating point is forbidden
● User-space libraries are not available
● The Linux kernel must be portable (important if you plan to contribute to mainline)
● Closed-source kernel modules taint the kernel
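To make the stack-size bullet concrete, here is a minimal user-space sketch of how the 8K and 4K figures follow from THREAD_SIZE_ORDER. It assumes 4 KiB pages; `SKETCH_PAGE_SIZE` and `sketch_thread_size()` are hypothetical names used only for illustration, since the kernel defines the real values as per-architecture macros.

```c
/* User-space illustration only: not kernel code. */
#define SKETCH_PAGE_SIZE 4096UL /* assumption: 4 KiB pages */

/* Stack size in bytes for a given order: PAGE_SIZE << order. */
static unsigned long sketch_thread_size(unsigned int thread_size_order)
{
    return SKETCH_PAGE_SIZE << thread_size_order;
}
```

Order 1 corresponds to the two-page (8K) stack and order 0 to the one-page (4K) stack, which is why deep recursion and large on-stack buffers are risky in kernel code.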
Example kernel module
#include <linux/init.h>
#include <linux/module.h>

/* Module constructor: runs when the module is loaded */
static int __init hello_init(void)
{
    printk(KERN_INFO "Hello, world!\n");
    return 0;
}

/* Module destructor: runs when the module is unloaded */
static void __exit hello_exit(void)
{
    printk(KERN_INFO "Goodbye\n");
}

module_init(hello_init);
module_exit(hello_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Andrea Righi <andrea@betterlinux.com>");
MODULE_DESCRIPTION("BetterEmbedded hello world example");
Andrea Righi - andrea@betterlinux.com
Kernel problems
● Kernel panic (fatal error for the system)
● Kernel oops (non-fatal error)
● Wrong result (fatal from user's perspective)
Kernel panic
● No recovery is possible
● Example: an exception in atomic context (e.g., in an interrupt handler)
● Typically results in a system reboot (panic=N), a blinking LED, or just a hang
[ 165.552280] general protection fault: 0000 [#1] PREEMPT SMP
[ 165.553055] Modules linked in: crashtest(O) [last unloaded: crashtest]
[ 165.553092] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G O 3.10.0-rc7+ #535
[ 165.553092] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 165.553092] task: ffff88003d90a2c0 ti: ffff88003d92e000 task.ti: ffff88003d92e000
[ 165.553092] RIP: 0010:[<ffffffff811ab0e5>] [<ffffffff811ab0e5>] __kmalloc_track_caller+0xd5/0x2b0
[ 165.553092] RSP: 0018:ffff88003e003988 EFLAGS: 00010206
[ 165.553092] RAX: 0000000000000000 RBX: ffff88003e1d6a20 RCX: 00000000000be841
[ 165.553092] RDX: 00000000000be801 RSI: 0000000000000000 RDI: 0000000000000001
[ 165.553092] RBP: ffff88003e0039c8 R08: 00000000001d6a20 R09: 0000000000000000
[ 165.553092] R10: 0000000000000000 R11: 0000000000000001 R12: 7878787878787878
[ 165.553092] R13: 0000000000010220 R14: 0000000000000240 R15: ffff88003d801780
[ 165.553092] FS: 0000000000000000(0000) GS:ffff88003e000000(0000) knlGS:0000000000000000
[ 165.553092] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 165.553092] CR2: 00000000081ab008 CR3: 0000000037dc8000 CR4: 00000000000006e0
[ 165.553092] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 165.553092] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 165.553092] Stack:
[ 165.553092] 00000000000be801 ffff88003d92ffd8 ffffffff8161683d ffff880034e3f300
[ 165.553092] ffff88003e003a17 0000000000000020 0000000000000240 0000000000000000
[ 165.553092] ffff88003e003a00 ffffffff8161433c ffff880034e3f300 0000000000000020
...
[ 165.553092] Call Trace:
[ 165.553092] <IRQ>
[ 165.553092] [<ffffffff8161683d>] ? __alloc_skb+0x7d/0x290
[ 165.553092] [<ffffffff8161433c>] __kmalloc_reserve.isra.52+0x3c/0xa0
[ 165.553092] [<ffffffff8161683d>] __alloc_skb+0x7d/0x290
[ 165.553092] [<ffffffff81677e5b>] tcp_send_ack+0x3b/0xf0
[ 165.553092] [<ffffffff8166ab1e>] __tcp_ack_snd_check+0x5e/0xa0
[ 165.553092] [<ffffffff81671c64>] tcp_rcv_established+0x204/0x6f0
[ 165.553092] [<ffffffff810e678e>] ? put_lock_stats.isra.26+0xe/0x40
[ 165.553092] [<ffffffff8167c681>] tcp_v4_do_rcv+0x161/0x360
[ 165.553092] [<ffffffff816fea39>] ? _raw_spin_lock_nested+0x79/0x90
[ 165.553092] [<ffffffff8167dc91>] tcp_v4_rcv+0x731/0x980
[ 165.553092] [<ffffffff810e706f>] ? __lock_is_held+0x5f/0x80
[ 165.553092] [<ffffffff816563d8>] ip_local_deliver_finish+0xc8/0x2f0
[ 165.553092] [<ffffffff8165635a>] ? ip_local_deliver_finish+0x4a/0x2f0
[ 165.553092] [<ffffffff81656e77>] ip_local_deliver+0x47/0x80
[ 165.553092] [<ffffffff81656740>] ip_rcv_finish+0x140/0x5e0
[ 165.553092] [<ffffffff816570e3>] ip_rcv+0x233/0x380
[ 165.553092] [<ffffffff81626062>] __netif_receive_skb_core+0x6a2/0x970
[ 165.553092] [<ffffffff81625a10>] ? __netif_receive_skb_core+0x50/0x970
[ 165.553092] [<ffffffff81626351>] __netif_receive_skb+0x21/0x70
[ 165.553092] [<ffffffff81626563>] netif_receive_skb+0x23/0x1f0
[ 165.553092] [<ffffffff81627448>] napi_gro_receive+0x98/0xd0
[ 165.553092] [<ffffffff81565c5a>] e1000_clean_rx_irq+0x18a/0x520
[ 165.553092] [<ffffffff81567451>] e1000_clean+0x251/0x910
[ 165.553092] [<ffffffff810e678e>] ? put_lock_stats.isra.26+0xe/0x40
[ 165.553092] [<ffffffff810e6df4>] ? lock_release_holdtime.part.27+0xd4/0x160
[ 165.553092] [<ffffffff81627015>] net_rx_action+0xd5/0x2e0
[ 165.553092] [<ffffffff81088d17>] __do_softirq+0xf7/0x420
[ 165.553092] [<ffffffff810891d5>] irq_exit+0xb5/0xc0
[ 165.553092] [<ffffffff81709303>] do_IRQ+0x63/0xd0
[ 165.553092] Code: c8 48 8b 55 c0 48 8b 81 38 e0 ff ff a8 08 0f 85 5f 01 00 00 4c 8b 23 4d 85 e4 0f 84 15
01 00 00 49 63 47 20 48 8d 4a 40 4d 8b 07 <49> 8b 1c 04 4c 89 e0 65 49 0f c7 08 0f 94 c0 84 c0 74 97 49 63
[ 165.553092] RIP [<ffffffff811ab0e5>] __kmalloc_track_caller+0xd5/0x2b0
[ 165.553092] RSP <ffff88003e003988>
[ 165.553092] ---[ end trace baac76a23c6da73c ]---
[ 165.553092] Kernel panic - not syncing: Fatal exception in interrupt
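Each frame in a trace like the one above has the form symbol+0xOFFSET/0xSIZE: the offset of the address inside the function, followed by the function's total length. As a minimal user-space sketch (`struct frame_loc` and `parse_frame()` are hypothetical names, not a kernel API), such a field can be split like this:

```c
#include <stdio.h>

/* One decoded "symbol+0xOFF/0xSIZE" field from a kernel stack trace. */
struct frame_loc {
    char symbol[64];
    unsigned long offset; /* bytes from the start of the function */
    unsigned long size;   /* total length of the function in bytes */
};

/* Returns 0 on success, -1 if the text does not match the expected form. */
static int parse_frame(const char *text, struct frame_loc *out)
{
    /* %63[^+] reads the symbol up to '+'; %lx accepts the 0x prefix. */
    if (sscanf(text, "%63[^+]+%lx/%lx",
               out->symbol, &out->offset, &out->size) != 3)
        return -1;
    return 0;
}
```

For the faulting frame, `parse_frame("__kmalloc_track_caller+0xd5/0x2b0", &loc)` yields offset 0xd5 inside a function 0x2b0 bytes long, which is the value to look up in a disassembly of that function.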
Kernel oops
● A message is written to the log when a recoverable error occurs in kernel space
● Example: access to a bad address (e.g., a NULL pointer dereference)
● An oops does not mean the system has crashed
● The current process is killed
● The oops message is displayed along with a register dump and a stack trace
[ 75.962412] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 75.963046] IP: [<ffffffffa00003c6>] procfs_write+0x2d6/0x320 [crashtest]
[ 75.963046] PGD 3a78d067 PUD 362be067 PMD 0
[ 75.963046] Oops: 0002 [#1] PREEMPT SMP
[ 75.963046] Modules linked in: crashtest(O)
[ 75.963046] CPU: 0 PID: 1587 Comm: bash Tainted: G O 3.10.0-rc7+ #535
[ 75.963046] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 75.963046] task: ffff88003a7ec580 ti: ffff8800362f6000 task.ti: ffff8800362f6000
[ 75.963046] RIP: 0010:[<ffffffffa00003c6>] [<ffffffffa00003c6>] procfs_write+0x2d6/0x320 [crashtest]
[ 75.963046] RSP: 0018:ffff8800362f7e78 EFLAGS: 00010297
[ 75.963046] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 000000000000004e
[ 75.963046] RDX: 0000000000000000 RSI: ffffffffa0000469 RDI: ffff8800362f7eaa
[ 75.963046] RBP: ffff8800362f7ee0 R08: 0000000000000000 R09: 0000000000000000
[ 75.963046] R10: ffff88003a7ec580 R11: 0000000000000000 R12: 0000000000000003
[ 75.963046] R13: 000000000000000a R14: ffff8800362f7f50 R15: 0000000000000000
[ 75.963046] FS: 0000000000000000(0000) GS:ffff88003de00000(0063) knlGS:00000000f75f76c0
[ 75.963046] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
[ 75.963046] CR2: 0000000000000000 CR3: 0000000036209000 CR4: 00000000000006f0
[ 75.963046] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 75.963046] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 75.963046] Stack:
[ 75.963046] ffffffff811b66cb 0000000000000000 0000000000000000 ffff88003a7ec580
[ 75.963046] ffff8800362f7ec8 4f49545045435845 000000000000004e 0000000000000000
[ 75.963046] 0000000000000000 00000000463b9fa0 ffff8800362fd300 000000000000000a
[ 75.963046] Call Trace:
[ 75.963046] [<ffffffff811b66cb>] ? vfs_write+0x1bb/0x1f0
[ 75.963046] [<ffffffff8121a86d>] proc_reg_write+0x3d/0x80
[ 75.963046] [<ffffffff811b65d8>] vfs_write+0xc8/0x1f0
[ 75.963046] [<ffffffff811b6ad5>] SyS_write+0x55/0xa0
[ 75.963046] [<ffffffff81708ce5>] sysenter_dispatch+0x7/0x1f
[ 75.963046] [<ffffffff813c50ae>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 75.963046] Code: e1 f3 6f e1 48 c7 c7 60 09 00 a0 e8 d5 f3 6f e1 e9 e2 fd ff ff c7 45 d0 78 56
34 12 e9 d6 fd ff ff e8 bf fc ff ff e9 cc fd ff ff <c7> 04 25 00 00 00 00 00 00 00 00 e9 bc fd ff ff
eb fe 66 c7 07
[ 75.963046] RIP [<ffffffffa00003c6>] procfs_write+0x2d6/0x320 [crashtest]
[ 75.963046] RSP <ffff8800362f7e78>
[ 75.963046] CR2: 0000000000000000
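The "Oops: 0002" line in the log above carries the x86 page-fault error code. The sketch below decodes its low bits in user space; the `SKETCH_PF_*` names and `describe_oops_code()` are made up for illustration, and only the five lowest architectural bits are handled.

```c
#include <stdio.h>

/* x86 page-fault error code bits (illustrative names). */
#define SKETCH_PF_PROT  (1UL << 0) /* 0: page not present, 1: protection violation */
#define SKETCH_PF_WRITE (1UL << 1) /* 0: read access, 1: write access */
#define SKETCH_PF_USER  (1UL << 2) /* 0: kernel mode, 1: user mode */
#define SKETCH_PF_RSVD  (1UL << 3) /* reserved bit set in a page-table entry */
#define SKETCH_PF_INSTR (1UL << 4) /* fault on an instruction fetch */

/* Writes a human-readable summary of the error code into buf. */
static int describe_oops_code(unsigned long code, char *buf, size_t len)
{
    return snprintf(buf, len, "%s, %s, %s mode%s%s",
                    (code & SKETCH_PF_PROT) ? "protection violation"
                                            : "page not present",
                    (code & SKETCH_PF_WRITE) ? "write" : "read",
                    (code & SKETCH_PF_USER) ? "user" : "kernel",
                    (code & SKETCH_PF_RSVD) ? ", reserved bit set" : "",
                    (code & SKETCH_PF_INSTR) ? ", instruction fetch" : "");
}
```

For code 0x2 the summary is "page not present, write, kernel mode", which matches a kernel-mode write through a NULL pointer (CR2 is 0 in the log).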
Taxonomy of kernel faults
● panic("have a nice day... ;-)")
● BUG() / BUG_ON(condition)
● exception (e.g., invalid opcode, division by zero, ...)
● memory corruption
  ● stack overflow/underflow
    – NOTE: in kernel space the stack size is limited to 2 pages (8K on almost all architectures)
  ● write after free
  ● write to a bad address
● concurrent access without protection (locks, etc.)
● soft lockup
  ● a CPU is locked without giving other tasks a chance to run
● hard lockup
  ● a CPU is locked without giving other tasks or interrupts a chance to run
● hung task: a task doesn't get a chance to run for more than N seconds
● scheduling while atomic
● deadlock
● use of FPU registers in kernel space
Useful debugging kernel options
● Kernel Hacking section ->
  ● CONFIG_KALLSYMS_ALL: print function names instead of addresses in kernel messages
  ● CONFIG_FRAME_POINTER: get useful stack info in case of kernel bugs
  ● CONFIG_DEBUG_ATOMIC_SLEEP: enable checks for sleeping inside atomic sections (e.g., sleeping in an interrupt handler or while holding a spinlock)
  ● CONFIG_LOCKUP_DETECTOR: detect hard and soft lockups
  ● CONFIG_LOCKDEP: lock dependency engine (deadlock detection)
  ● CONFIG_DYNAMIC_FTRACE: enable individual function tracing dynamically (via debugfs, /sys/kernel/debug/tracing)
Debugging techniques
● blinking LED
● printk()
● procfs
● SysRq key (Documentation/sysrq.txt)
● function instrumentation (kprobes)
● dynamic ftrace (CONFIG_DYNAMIC_FTRACE)
● debugger (kgdb)
printk()
● Advantages
● easy to use
● needs no other system support
● Disadvantages
● have to modify and rebuild kernel/modules
● no interactive debugging
printk(): levels
● printk levels
● KERN_EMERG: system is unusable
● KERN_ALERT: action must be taken immediately
● KERN_CRIT: critical condition
● KERN_ERR: error condition
● KERN_WARNING: warning condition
● KERN_NOTICE: normal but significant condition
● KERN_INFO: informational
● KERN_DEBUG: debug message
● Show kernel messages:
# dmesg
● Redirect all kernel messages to the console
# echo 8 > /proc/sys/kernel/printk
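The effect of the echo above can be sketched as follows: a message reaches the console only if its level is numerically lower than the console loglevel, which is the first value in /proc/sys/kernel/printk. This is a user-space model under that assumption; the enum and `reaches_console()` are hypothetical names, not kernel symbols.

```c
/* Numeric values of the printk levels, from most to least severe. */
enum sketch_level {
    SKETCH_EMERG = 0,
    SKETCH_ALERT,
    SKETCH_CRIT,
    SKETCH_ERR,
    SKETCH_WARNING,
    SKETCH_NOTICE,
    SKETCH_INFO,
    SKETCH_DEBUG /* 7 */
};

/* Non-zero if a message at msg_level would be shown on the console. */
static int reaches_console(int msg_level, int console_loglevel)
{
    return msg_level < console_loglevel;
}
```

With a console loglevel of 8 every level from 0 to 7 passes the check, so even KERN_DEBUG messages appear on the console; with a loglevel of 7 they are only visible through dmesg.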
procfs
#include <linux/module.h>
#include <linux/proc_fs.h>
#include <linux/seq_file.h>

/* Called on read(): emit the file content through the seq_file API */
static int procfs_read(struct seq_file *m, void *v)
{
    ...
}

/* Called on write(): ubuf points to user memory */
static ssize_t procfs_write(struct file *file,
        const char __user *ubuf, size_t count, loff_t *pos)
{
    ...
}

static int procfs_open(struct inode *inode, struct file *file)
{
    return single_open(file, procfs_read, NULL);
}

static int procfs_release(struct inode *inode, struct file *file)
{
    /* Free what single_open() allocated */
    return single_release(inode, file);
}

/* Kernels >= 5.6 take a struct proc_ops here instead */
static const struct file_operations procfs_fops = {
    .open = procfs_open,
    .read = seq_read,
    .write = procfs_write,
    .llseek = seq_lseek,
    .release = procfs_release,
};

static int __init myproc_init(void)
{
    if (!proc_create("myproc", 0666, NULL, &procfs_fops))
        return -ENOMEM;
    return 0;
}

static void __exit myproc_exit(void)
{
    remove_proc_entry("myproc", NULL);
}
Kprobes (Kernel probes)
● Kprobes let you dynamically break into any kernel routine and collect debugging and performance information (CONFIG_KPROBES=y)
● Trap at almost any kernel code address, specifying a handler routine to be invoked when the breakpoint is hit
● How does it work?
  ● The probed instruction is copied and the original is replaced with a breakpoint instruction (int3 on x86)
  ● When the breakpoint is hit, a trap occurs, the CPU registers are saved, and control passes to the Kprobes pre-handler
  ● The saved instruction is executed in single-step mode
  ● The Kprobes post-handler is executed
  ● The rest of the original function is executed
Kprobes (example)
#include <linux/kprobes.h>
#include <linux/module.h>

/* Pre-handler: runs just before the probed instruction */
static int my_handler(struct kprobe *p, struct pt_regs *regs)
{
    /* Do something here... */
    return 0;
}

static struct kprobe my_kp = {
    .pre_handler = my_handler,
    .symbol_name = "schedule_timeout",
};

static int __init my_kprobe_init(void)
{
    int ret;

    ret = register_kprobe(&my_kp);
    if (ret < 0) {
        printk(KERN_INFO "%s: error %d\n", __func__, ret);
        return ret;
    }
    return 0;
}

static void __exit my_kprobe_exit(void)
{
    unregister_kprobe(&my_kp);
}
Dump a stack trace
#include <linux/kprobes.h>
#include <linux/module.h>
#include <linux/sched.h>

static const char function_name[] = "schedule_timeout";

static int my_handler(struct kprobe *p, struct pt_regs *regs)
{
    dump_stack();
    /* regs->di holds the first function argument on x86-64 */
    printk(KERN_INFO "%s called %s(%d)\n",
           current->comm, function_name, (int)regs->di);
    return 0;
}

static struct kprobe my_kp = {
    .pre_handler = my_handler,
    .symbol_name = function_name,
};

static int __init my_kprobe_init(void)
{
    int ret;

    ret = register_kprobe(&my_kp);
    if (ret < 0) {
        printk(KERN_INFO "%s: error %d\n", __func__, ret);
        return ret;
    }
    return 0;
}

static void __exit my_kprobe_exit(void)
{
    unregister_kprobe(&my_kp);
}
Dynamic ftrace
# mount -t debugfs none /sys/kernel/debug
# cd /sys/kernel/debug/tracing
# echo sys_nanosleep hrtimer_interrupt > set_ftrace_filter
# echo function > current_tracer
# echo 1 > tracing_on
# usleep 1
# echo 0 > tracing_on
# cat trace
# tracer: function
#
# entries-in-buffer/entries-written: 5/5 #P:4
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
usleep-2665 [001] .... 4186.475355: sys_nanosleep <-system_call_fastpath
<idle>-0 [001] d.h1 4186.475409: hrtimer_interrupt <-smp_apic_timer_interrupt
usleep-2665 [001] d.h1 4186.475426: hrtimer_interrupt <-smp_apic_timer_interrupt
<idle>-0 [003] d.h1 4186.475426: hrtimer_interrupt <-smp_apic_timer_interrupt
<idle>-0 [002] d.h1 4186.475427: hrtimer_interrupt <-smp_apic_timer_interrupt
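The four-character flags column in the dynamic ftrace output (for example d.h1) follows the legend printed in the trace header: irqs-off, need-resched, hardirq/softirq, preempt-depth. Below is a minimal user-space sketch of a decoder, assuming the four-field layout shown in that trace (other kernel versions may print additional fields); `struct trace_flags` and `decode_trace_flags()` are hypothetical names.

```c
#include <string.h>

/* Decoded view of the 4-character ftrace flags column. */
struct trace_flags {
    int irqs_off;      /* 'd': interrupts were disabled */
    int need_resched;  /* 'N': a reschedule is pending */
    int in_hardirq;    /* 'h' or 'H': running in hard interrupt context */
    int in_softirq;    /* 's' or 'H': running in soft interrupt context */
    int preempt_depth; /* digit: preemption-disable nesting, '.' means 0 */
};

/* Returns 0 on success, -1 if the field is missing or too short. */
static int decode_trace_flags(const char *f, struct trace_flags *out)
{
    if (!f || strlen(f) < 4)
        return -1;
    out->irqs_off = (f[0] == 'd');
    out->need_resched = (f[1] == 'N');
    out->in_hardirq = (f[2] == 'h' || f[2] == 'H');
    out->in_softirq = (f[2] == 's' || f[2] == 'H');
    out->preempt_depth = (f[3] >= '0' && f[3] <= '9') ? f[3] - '0' : 0;
    return 0;
}
```

Decoding "d.h1" from the hrtimer_interrupt lines gives interrupts off, no reschedule pending, hard interrupt context, and a preempt depth of 1, while "...." on the sys_nanosleep line is plain process context.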
KGDB + QEMU
$ kvm -m 1024 -smp 4 -drive file=debian-6-i386.img -vnc :1 -redir tcp:5190:10.0.2.15:22 \
    -kernel /src/linux/arch/x86/boot/bzImage -append "root=/dev/sda1 kgdbwait kgdboc=ttyS0" \
    -serial pty
char device redirected to /dev/pts/3 (label serial0)
$ gdb vmlinux
(gdb) target remote /dev/pts/3
● Setting up kgdb using kvm/qemu
Debugging workqueues
● workqueue: asynchronous process execution context
● kworkers are going crazy (using too much CPU)?
  ● Something is being scheduled in rapid succession
  ● A single work item consumes a lot of CPU cycles
● How to debug?
  ● kernel tracepoints:
    – echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event
  ● kworker stack trace:
    – cat /proc/THE_OFFENDING_KWORKER/stack
root 5671 0.0 0.0 0 0 ? S 12:07 0:00 [kworker/0:1]
root 5672 0.0 0.0 0 0 ? S 12:07 0:00 [kworker/1:2]
root 5673 0.0 0.0 0 0 ? S 12:12 0:00 [kworker/0:0]
root 5674 0.0 0.0 0 0 ? S 12:13 0:00 [kworker/1:0]
References
● J. Corbet, A. Rubini, G. Kroah-Hartman:
Linux Device Drivers 3rd Edition
● Linux documentation
● http://lxr.linux.no/linux/Documentation/trace
● http://lxr.linux.no/linux/Documentation/kprobes.txt
● Linux weekly news: http://lwn.net
Q/A
● You're very welcome!
● Twitter
● @arighi
● #bem2013

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 

Último (20)

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 

Debugging linux

  • 1. Linux kernel debugging techniques (Andrea Righi - andrea@betterlinux.com)
  • 2. Agenda
      ● Overview (kernel programming)
      ● Kernel crash taxonomy
      ● Debugging techniques
      ● Example(s)
      ● Q/A
  • 3. What's a kernel?
      ● The kernel provides an abstraction layer that lets applications use the physical hardware resources
      ● Kernel basic facilities
          ● Process management
          ● Memory management
          ● Device management
          ● System call interface
  • 4. User space
      ● Good for debugging (gdb)
      ● Lots of user-space libraries available
      ● Unpredictable latency (context switch, scheduler, syscall, ...)
      ● Overhead
      ● Cannot fully interact with interrupt routines
      ● Cannot access certain memory addresses
      ● Harder to share certain features with other drivers
      ● Reliability: user processes can be terminated upon critical system events (OOM, filesystem errors, etc.)
  • 5. Kernel space
      ● Written in C and assembly
      ● No ordinary debugger; the alternatives are kgdb, UML, ...
      ● Bugs can hang the entire system
      ● User memory is swappable; kernel memory can't be swapped out
      ● Kernel stack size is small (8K / 4K - THREAD_SIZE_ORDER)
      ● Floating point is forbidden
      ● Userspace libraries are not available
      ● The Linux kernel must be portable (this matters if you plan to contribute to mainline)
      ● Closed-source kernel modules taint the kernel
  • 6. Example kernel module
      #include <linux/init.h>
      #include <linux/module.h>

      /* Module constructor */
      static int __init hello_init(void)
      {
          printk(KERN_INFO "Hello, world!\n");
          return 0;
      }

      /* Module destructor */
      static void __exit hello_exit(void)
      {
          printk(KERN_INFO "Goodbye\n");
      }

      module_init(hello_init);
      module_exit(hello_exit);

      MODULE_LICENSE("GPL");
      MODULE_AUTHOR("Andrea Righi <andrea@betterlinux.com>");
      MODULE_DESCRIPTION("BetterEmbedded hello world example");
  • 7. Kernel problems
      ● Kernel panic (fatal error for the system)
      ● Kernel oops (non-fatal error)
      ● Wrong result (fatal from the user's perspective)
  • 8. Kernel panic
      ● No recovery is possible
      ● Example: an exception in atomic context (e.g., inside an interrupt handler)
      ● Typically results in a system reboot (panic=N), a blinking LED, or just a hang
  • 9. Kernel panic (example log)
      [ 165.552280] general protection fault: 0000 [#1] PREEMPT SMP
      [ 165.553055] Modules linked in: crashtest(O) [last unloaded: crashtest]
      [ 165.553092] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G O 3.10.0-rc7+ #535
      [ 165.553092] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [ 165.553092] task: ffff88003d90a2c0 ti: ffff88003d92e000 task.ti: ffff88003d92e000
      [ 165.553092] RIP: 0010:[<ffffffff811ab0e5>] [<ffffffff811ab0e5>] __kmalloc_track_caller+0xd5/0x2b0
      [ 165.553092] RSP: 0018:ffff88003e003988 EFLAGS: 00010206
      [ 165.553092] RAX: 0000000000000000 RBX: ffff88003e1d6a20 RCX: 00000000000be841
      [ 165.553092] RDX: 00000000000be801 RSI: 0000000000000000 RDI: 0000000000000001
      [ 165.553092] RBP: ffff88003e0039c8 R08: 00000000001d6a20 R09: 0000000000000000
      [ 165.553092] R10: 0000000000000000 R11: 0000000000000001 R12: 7878787878787878
      [ 165.553092] R13: 0000000000010220 R14: 0000000000000240 R15: ffff88003d801780
      [ 165.553092] FS: 0000000000000000(0000) GS:ffff88003e000000(0000) knlGS:0000000000000000
      [ 165.553092] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [ 165.553092] CR2: 00000000081ab008 CR3: 0000000037dc8000 CR4: 00000000000006e0
      [ 165.553092] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 165.553092] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [ 165.553092] Stack:
      [ 165.553092] 00000000000be801 ffff88003d92ffd8 ffffffff8161683d ffff880034e3f300
      [ 165.553092] ffff88003e003a17 0000000000000020 0000000000000240 0000000000000000
      [ 165.553092] ffff88003e003a00 ffffffff8161433c ffff880034e3f300 0000000000000020
      ...
  • 10. Kernel panic (example log, continued)
      ...
      [ 165.553092] Call Trace:
      [ 165.553092] <IRQ>
      [ 165.553092] [<ffffffff8161683d>] ? __alloc_skb+0x7d/0x290
      [ 165.553092] [<ffffffff8161433c>] __kmalloc_reserve.isra.52+0x3c/0xa0
      [ 165.553092] [<ffffffff8161683d>] __alloc_skb+0x7d/0x290
      [ 165.553092] [<ffffffff81677e5b>] tcp_send_ack+0x3b/0xf0
      [ 165.553092] [<ffffffff8166ab1e>] __tcp_ack_snd_check+0x5e/0xa0
      [ 165.553092] [<ffffffff81671c64>] tcp_rcv_established+0x204/0x6f0
      [ 165.553092] [<ffffffff810e678e>] ? put_lock_stats.isra.26+0xe/0x40
      [ 165.553092] [<ffffffff8167c681>] tcp_v4_do_rcv+0x161/0x360
      [ 165.553092] [<ffffffff816fea39>] ? _raw_spin_lock_nested+0x79/0x90
      [ 165.553092] [<ffffffff8167dc91>] tcp_v4_rcv+0x731/0x980
      [ 165.553092] [<ffffffff810e706f>] ? __lock_is_held+0x5f/0x80
      [ 165.553092] [<ffffffff816563d8>] ip_local_deliver_finish+0xc8/0x2f0
      [ 165.553092] [<ffffffff8165635a>] ? ip_local_deliver_finish+0x4a/0x2f0
      [ 165.553092] [<ffffffff81656e77>] ip_local_deliver+0x47/0x80
      [ 165.553092] [<ffffffff81656740>] ip_rcv_finish+0x140/0x5e0
      [ 165.553092] [<ffffffff816570e3>] ip_rcv+0x233/0x380
      [ 165.553092] [<ffffffff81626062>] __netif_receive_skb_core+0x6a2/0x970
      [ 165.553092] [<ffffffff81625a10>] ? __netif_receive_skb_core+0x50/0x970
      [ 165.553092] [<ffffffff81626351>] __netif_receive_skb+0x21/0x70
      [ 165.553092] [<ffffffff81626563>] netif_receive_skb+0x23/0x1f0
      [ 165.553092] [<ffffffff81627448>] napi_gro_receive+0x98/0xd0
      [ 165.553092] [<ffffffff81565c5a>] e1000_clean_rx_irq+0x18a/0x520
      [ 165.553092] [<ffffffff81567451>] e1000_clean+0x251/0x910
      [ 165.553092] [<ffffffff810e678e>] ? put_lock_stats.isra.26+0xe/0x40
      [ 165.553092] [<ffffffff810e6df4>] ? lock_release_holdtime.part.27+0xd4/0x160
      [ 165.553092] [<ffffffff81627015>] net_rx_action+0xd5/0x2e0
      [ 165.553092] [<ffffffff81088d17>] __do_softirq+0xf7/0x420
      [ 165.553092] [<ffffffff810891d5>] irq_exit+0xb5/0xc0
      [ 165.553092] [<ffffffff81709303>] do_IRQ+0x63/0xd0
      [ 165.553092] Code: c8 48 8b 55 c0 48 8b 81 38 e0 ff ff a8 08 0f 85 5f 01 00 00 4c 8b 23 4d 85 e4 0f 84 15 01 00 00 49 63 47 20 48 8d 4a 40 4d 8b 07 <49> 8b 1c 04 4c 89 e0 65 49 0f c7 08 0f 94 c0 84 c0 74 97 49 63
      [ 165.553092] RIP [<ffffffff811ab0e5>] __kmalloc_track_caller+0xd5/0x2b0
      [ 165.553092] RSP <ffff88003e003988>
      [ 165.553092] ---[ end trace baac76a23c6da73c ]---
      [ 165.553092] Kernel panic - not syncing: Fatal exception in interrupt
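Each call-trace entry above has the form symbol+0xoffset/0xsize: the offset of the address inside the function, followed by the function's total size. As a rough illustration only, the user-space sketch below splits such an entry into its parts. The names `struct trace_sym` and `parse_trace_sym` are hypothetical, invented for this example; they are not kernel APIs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One decoded "symbol+0xoffset/0xsize" entry from a kernel call trace. */
struct trace_sym {
    char name[64];        /* function name, e.g. "tcp_send_ack" */
    unsigned long offset; /* bytes from the start of the function */
    unsigned long size;   /* total size of the function in bytes */
};

/*
 * Parse text such as "__alloc_skb+0x7d/0x290".
 * Returns 0 on success, -1 if the text does not match the expected form.
 */
int parse_trace_sym(const char *text, struct trace_sym *out)
{
    assert(text != NULL && out != NULL);

    /* %63[^+] reads the name up to '+', leaving room for the NUL byte. */
    if (sscanf(text, "%63[^+]+0x%lx/0x%lx",
               out->name, &out->offset, &out->size) != 3)
        return -1;
    return 0;
}
```

Parsing `__kmalloc_track_caller+0xd5/0x2b0` this way gives offset 0xd5 inside a function of 0x2b0 bytes, which is the position to look for in a disassembly of the faulting function.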
  • 11. Kernel oops
      ● A message is written to the log when a recoverable error has occurred in kernel space
      ● Example: access to a bad address (e.g., a NULL pointer dereference)
      ● An oops does not mean the system has crashed
      ● The current process is killed
      ● The oops message is displayed along with a register dump and a stack trace
  • 12. Kernel oops (example log)
      [ 75.962412] BUG: unable to handle kernel NULL pointer dereference at (null)
      [ 75.963046] IP: [<ffffffffa00003c6>] procfs_write+0x2d6/0x320 [crashtest]
      [ 75.963046] PGD 3a78d067 PUD 362be067 PMD 0
      [ 75.963046] Oops: 0002 [#1] PREEMPT SMP
      [ 75.963046] Modules linked in: crashtest(O)
      [ 75.963046] CPU: 0 PID: 1587 Comm: bash Tainted: G O 3.10.0-rc7+ #535
      [ 75.963046] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [ 75.963046] task: ffff88003a7ec580 ti: ffff8800362f6000 task.ti: ffff8800362f6000
      [ 75.963046] RIP: 0010:[<ffffffffa00003c6>] [<ffffffffa00003c6>] procfs_write+0x2d6/0x320 [crashtest]
      [ 75.963046] RSP: 0018:ffff8800362f7e78 EFLAGS: 00010297
      [ 75.963046] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 000000000000004e
      [ 75.963046] RDX: 0000000000000000 RSI: ffffffffa0000469 RDI: ffff8800362f7eaa
      [ 75.963046] RBP: ffff8800362f7ee0 R08: 0000000000000000 R09: 0000000000000000
      [ 75.963046] R10: ffff88003a7ec580 R11: 0000000000000000 R12: 0000000000000003
      [ 75.963046] R13: 000000000000000a R14: ffff8800362f7f50 R15: 0000000000000000
      [ 75.963046] FS: 0000000000000000(0000) GS:ffff88003de00000(0063) knlGS:00000000f75f76c0
      [ 75.963046] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
      [ 75.963046] CR2: 0000000000000000 CR3: 0000000036209000 CR4: 00000000000006f0
      [ 75.963046] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 75.963046] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [ 75.963046] Stack:
      [ 75.963046] ffffffff811b66cb 0000000000000000 0000000000000000 ffff88003a7ec580
      [ 75.963046] ffff8800362f7ec8 4f49545045435845 000000000000004e 0000000000000000
      [ 75.963046] 0000000000000000 00000000463b9fa0 ffff8800362fd300 000000000000000a
      [ 75.963046] Call Trace:
      [ 75.963046] [<ffffffff811b66cb>] ? vfs_write+0x1bb/0x1f0
      [ 75.963046] [<ffffffff8121a86d>] proc_reg_write+0x3d/0x80
      [ 75.963046] [<ffffffff811b65d8>] vfs_write+0xc8/0x1f0
      [ 75.963046] [<ffffffff811b6ad5>] SyS_write+0x55/0xa0
      [ 75.963046] [<ffffffff81708ce5>] sysenter_dispatch+0x7/0x1f
      [ 75.963046] [<ffffffff813c50ae>] ? trace_hardirqs_on_thunk+0x3a/0x3f
      [ 75.963046] Code: e1 f3 6f e1 48 c7 c7 60 09 00 a0 e8 d5 f3 6f e1 e9 e2 fd ff ff c7 45 d0 78 56 34 12 e9 d6 fd ff ff e8 bf fc ff ff e9 cc fd ff ff <c7> 04 25 00 00 00 00 00 00 00 00 e9 bc fd ff ff eb fe 66 c7 07
      [ 75.963046] RIP [<ffffffffa00003c6>] procfs_write+0x2d6/0x320 [crashtest]
      [ 75.963046] RSP <ffff8800362f7e78>
      [ 75.963046] CR2: 0000000000000000
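On x86, the number after "Oops:" in a page-fault report like the one above is the hardware page-fault error code. The user-space sketch below decodes its low bits. It assumes the oops comes from a page fault (other exceptions, such as a general protection fault, use a different error-code layout), and the `PF_*` constants and `describe_oops_code` are illustrative names chosen for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Low bits of the x86 page-fault error code (names are illustrative). */
enum {
    PF_PROT  = 1 << 0, /* 0: page not present, 1: protection violation */
    PF_WRITE = 1 << 1, /* 0: read access,      1: write access */
    PF_USER  = 1 << 2, /* 0: kernel mode,      1: user mode */
    PF_RSVD  = 1 << 3, /* reserved bit set in a page-table entry */
    PF_INSTR = 1 << 4, /* fault happened on an instruction fetch */
};

/* Write a human-readable description of `code` into buf (at most len bytes). */
int describe_oops_code(unsigned long code, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

    return snprintf(buf, len, "%s, %s, %s mode%s%s",
                    (code & PF_PROT)  ? "protection violation" : "page not present",
                    (code & PF_WRITE) ? "write" : "read",
                    (code & PF_USER)  ? "user" : "kernel",
                    (code & PF_RSVD)  ? ", reserved bit set" : "",
                    (code & PF_INSTR) ? ", instruction fetch" : "");
}
```

For the value 0002 in the log this gives "page not present, write, kernel mode", which matches a kernel-mode write through a NULL pointer (CR2 is 0).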
  • 13. Taxonomy of kernel faults
      ● panic("have a nice day... ;-)")
      ● BUG() / BUG_ON(condition)
      ● exception (e.g., invalid opcode, division by zero, ...)
      ● memory corruption
          ● stack overflow/underflow (NOTE: in kernel space the stack size is limited to 2 pages, 8K on almost all architectures)
          ● write after free
          ● write to a bad address
      ● concurrent access without protection (locks, etc.)
      ● soft lockup: a CPU is held without giving other tasks a chance to run
      ● hard lockup: a CPU is held without giving other tasks or interrupts a chance to run
      ● hung task: a task doesn't get a chance to run for more than N seconds
      ● scheduling while atomic
      ● deadlock
      ● use of FPU registers in kernel space
  • 14. Useful debugging kernel options
      ● Kernel Hacking section ->
          ● CONFIG_KALLSYMS_ALL: print function names instead of addresses in kernel messages
          ● CONFIG_FRAME_POINTER: get useful stack info in case of kernel bugs
          ● CONFIG_DEBUG_ATOMIC_SLEEP: check for sleeping inside atomic sections (e.g., sleeping in an interrupt handler or while holding a spinlock)
          ● CONFIG_LOCKUP_DETECTOR: detect hard and soft lockups
          ● CONFIG_LOCKDEP: lock dependency engine (deadlock detection)
          ● CONFIG_DYNAMIC_FTRACE: enable tracing of individual functions dynamically (via debugfs, /sys/kernel/debug/tracing)
  • 15. Debugging techniques
      ● blinking LED
      ● printk()
      ● procfs
      ● SysRq key (Documentation/sysrq.txt)
      ● function instrumentation (kprobes)
      ● dynamic ftrace (CONFIG_DYNAMIC_FTRACE)
      ● debugger (kgdb)
  • 16. printk()
      ● Advantages
          ● easy to use
          ● needs no other system support
      ● Disadvantages
          ● you have to modify and rebuild the kernel/modules
          ● no interactive debugging
  • 17. printk(): levels
      ● printk levels
          ● KERN_EMERG: system is unusable
          ● KERN_ALERT: action must be taken immediately
          ● KERN_CRIT: critical condition
          ● KERN_ERR: error condition
          ● KERN_WARNING: warning condition
          ● KERN_NOTICE: normal but significant condition
          ● KERN_INFO: informational
          ● KERN_DEBUG: debug message
      ● Show kernel messages:
          # dmesg
      ● Print all kernel messages on the console:
          # echo 8 > /proc/sys/kernel/printk
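The levels above are numbered from 0 (KERN_EMERG) to 7 (KERN_DEBUG), and a message reaches the console only when its level is numerically lower than the console log level. That is why writing 8 to /proc/sys/kernel/printk shows everything. The user-space sketch below models that comparison; `KLVL_*`, `klevel_name`, and `reaches_console` are hypothetical names for illustration, not kernel symbols.

```c
#include <assert.h>
#include <string.h>

/* Numeric values of the printk levels, most to least severe. */
enum klevel {
    KLVL_EMERG, KLVL_ALERT, KLVL_CRIT, KLVL_ERR,
    KLVL_WARNING, KLVL_NOTICE, KLVL_INFO, KLVL_DEBUG
};

/* Map a numeric level to the name of the matching KERN_* macro. */
const char *klevel_name(int level)
{
    static const char *const names[] = {
        "KERN_EMERG", "KERN_ALERT", "KERN_CRIT", "KERN_ERR",
        "KERN_WARNING", "KERN_NOTICE", "KERN_INFO", "KERN_DEBUG"
    };

    assert(level >= KLVL_EMERG && level <= KLVL_DEBUG);
    return names[level];
}

/* A message is shown on the console if its level is below the threshold. */
int reaches_console(int msg_level, int console_loglevel)
{
    return msg_level < console_loglevel;
}
```

With a console log level of 7, `reaches_console(KLVL_DEBUG, 7)` is 0, so debug messages stay in the log buffer (still visible through dmesg); with 8 they also appear on the console.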
  • 18. procfs
      #include <linux/module.h>
      #include <linux/proc_fs.h>
      #include <linux/seq_file.h>

      static int procfs_read(struct seq_file *m, void *v)
      {
          ...
      }

      static ssize_t procfs_write(struct file *file, const char __user *ubuf,
                                  size_t count, loff_t *pos)
      {
          ...
      }

      static int procfs_open(struct inode *inode, struct file *file)
      {
          return single_open(file, procfs_read, NULL);
      }

      static int procfs_release(struct inode *inode, struct file *file)
      {
          /* Free the seq_file state allocated by single_open() */
          return single_release(inode, file);
      }

      /* Since Linux 5.6, proc_create() takes a struct proc_ops instead */
      static const struct file_operations procfs_fops = {
          .open = procfs_open,
          .read = seq_read,
          .write = procfs_write,
          .llseek = seq_lseek,
          .release = procfs_release,
      };

      static int __init myproc_init(void)
      {
          if (!proc_create("myproc", 0666, NULL, &procfs_fops))
              return -ENOMEM;
          return 0;
      }

      static void __exit myproc_exit(void)
      {
          remove_proc_entry("myproc", NULL);
      }

      module_init(myproc_init);
      module_exit(myproc_exit);
      MODULE_LICENSE("GPL");
  • 19. Kprobes (Kernel probes)
      ● Kprobes let you dynamically break into any kernel routine and collect debugging and performance information (CONFIG_KPROBES=y)
      ● Trap almost any kernel code address, specifying a handler routine to be invoked when the breakpoint is hit
      ● How does it work?
          ● The probed instruction is copied and the original is replaced with a breakpoint instruction (int3 on x86)
          ● When the breakpoint is hit, a trap occurs, the CPU's registers are saved, and control passes to the Kprobes pre-handler
          ● The saved instruction is executed in single-step mode
          ● The Kprobes post-handler is executed
          ● The rest of the original function is executed
  • 20. Kprobes (example)
      #include <linux/kprobes.h>
      #include <linux/module.h>

      /* Pre-handler: invoked just before the probed instruction runs */
      static int my_handler(struct kprobe *p, struct pt_regs *regs)
      {
          /* Do something here... */
          return 0;
      }

      static struct kprobe my_kp = {
          .pre_handler = my_handler,
          .symbol_name = "schedule_timeout",
      };

      static int __init my_kprobe_init(void)
      {
          int ret;

          ret = register_kprobe(&my_kp);
          if (ret < 0) {
              printk(KERN_INFO "%s: error %d\n", __func__, ret);
              return ret;
          }
          return 0;
      }

      static void __exit my_kprobe_exit(void)
      {
          unregister_kprobe(&my_kp);
      }

      module_init(my_kprobe_init);
      module_exit(my_kprobe_exit);
      MODULE_LICENSE("GPL");
  • 21. Dump a stack trace
      #include <linux/kprobes.h>
      #include <linux/module.h>
      #include <linux/sched.h>

      static const char function_name[] = "schedule_timeout";

      static int my_handler(struct kprobe *p, struct pt_regs *regs)
      {
          dump_stack();
          /* On x86-64 the first function argument is passed in %rdi */
          printk(KERN_INFO "%s called %s(%d)\n",
                 current->comm, function_name, (int)regs->di);
          return 0;
      }

      static struct kprobe my_kp = {
          .pre_handler = my_handler,
          .symbol_name = function_name,
      };

      static int __init my_kprobe_init(void)
      {
          int ret;

          ret = register_kprobe(&my_kp);
          if (ret < 0) {
              printk(KERN_INFO "%s: error %d\n", __func__, ret);
              return ret;
          }
          return 0;
      }

      static void __exit my_kprobe_exit(void)
      {
          unregister_kprobe(&my_kp);
      }

      module_init(my_kprobe_init);
      module_exit(my_kprobe_exit);
      MODULE_LICENSE("GPL");
  • 22. Dynamic ftrace
      # mount -t debugfs none /sys/kernel/debug
      # cd /sys/kernel/debug/tracing
      # echo sys_nanosleep hrtimer_interrupt > set_ftrace_filter
      # echo function > current_tracer
      # echo 1 > tracing_on
      # usleep 1
      # echo 0 > tracing_on
      # cat trace
      # tracer: function
      #
      # entries-in-buffer/entries-written: 5/5   #P:4
      #
      #                              _-----=> irqs-off
      #                             / _----=> need-resched
      #                            | / _---=> hardirq/softirq
      #                            || / _--=> preempt-depth
      #                            ||| /     delay
      #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
      #              | |       |   ||||       |         |
                usleep-2665  [001] ....  4186.475355: sys_nanosleep <-system_call_fastpath
                <idle>-0     [001] d.h1  4186.475409: hrtimer_interrupt <-smp_apic_timer_interrupt
                usleep-2665  [001] d.h1  4186.475426: hrtimer_interrupt <-smp_apic_timer_interrupt
                <idle>-0     [003] d.h1  4186.475426: hrtimer_interrupt <-smp_apic_timer_interrupt
                <idle>-0     [002] d.h1  4186.475427: hrtimer_interrupt <-smp_apic_timer_interrupt
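The four-character field before the timestamp in the trace above (for example `d.h1`) packs the four columns named in the trace header: irqs-off, need-resched, hardirq/softirq, and preempt-depth. A small user-space sketch of decoding it is shown below. It assumes the four-column layout printed in this trace (newer kernels may print additional columns), and `struct trace_flags` and `parse_trace_flags` are hypothetical names for illustration.

```c
#include <assert.h>
#include <string.h>

/* Decoded form of the four-character ftrace flags field, e.g. "d.h1". */
struct trace_flags {
    int irqs_off;      /* 1 if interrupts were disabled ('d') */
    int need_resched;  /* 1 if a reschedule was pending ('N') */
    char irq_context;  /* 'h' hardirq, 's' softirq, 'H' both, '.' none */
    int preempt_depth; /* preemption-disable nesting depth, 0 for '.' */
};

/* Returns 0 on success, -1 if the field is not exactly four characters. */
int parse_trace_flags(const char *field, struct trace_flags *out)
{
    assert(field != NULL && out != NULL);

    if (strlen(field) != 4)
        return -1;

    out->irqs_off = (field[0] == 'd');
    out->need_resched = (field[1] == 'N');
    out->irq_context = field[2];
    /* A digit gives the depth; '.' (or anything else) is treated as 0. */
    out->preempt_depth =
        (field[3] >= '0' && field[3] <= '9') ? field[3] - '0' : 0;
    return 0;
}
```

Applied to the `hrtimer_interrupt` lines, `d.h1` decodes to interrupts off, no reschedule pending, hardirq context, preempt depth 1, while the `....` on the `sys_nanosleep` line means ordinary process context.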
  • 23. KGDB + QEMU
      ● Setting up kgdb using kvm/qemu
      $ kvm -m 1024 -smp 4 -drive file=debian-6-i386.img -vnc :1 \
            -redir tcp:5190:10.0.2.15:22 \
            -kernel /src/linux/arch/x86/boot/bzImage \
            -append "root=/dev/sda1 kgdboc=ttyS0 kgdbwait" \
            -serial pty
      char device redirected to /dev/pts/3 (label serial0)
      $ gdb vmlinux
      (gdb) target remote /dev/pts/3
  • 24. Debugging workqueues
      ● workqueue: asynchronous process execution context
      ● kworkers are going crazy (using too much CPU)?
          ● Something is being scheduled in rapid succession
          ● A single work item consumes a lot of CPU cycles
      ● How to debug?
          ● kernel tracepoints:
              – echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event
          ● kworker stack trace:
              – cat /proc/THE_OFFENDING_KWORKER/stack
      root 5671 0.0 0.0 0 0 ? S 12:07 0:00 [kworker/0:1]
      root 5672 0.0 0.0 0 0 ? S 12:07 0:00 [kworker/1:2]
      root 5673 0.0 0.0 0 0 ? S 12:12 0:00 [kworker/0:0]
      root 5674 0.0 0.0 0 0 ? S 12:13 0:00 [kworker/1:0]
  • 25. References
      ● J. Corbet, A. Rubini, G. Kroah-Hartman: Linux Device Drivers, 3rd Edition
      ● Linux documentation
          ● http://lxr.linux.no/linux/Documentation/trace
          ● http://lxr.linux.no/linux/Documentation/kprobes.txt
      ● Linux Weekly News: http://lwn.net
  • 26. Q/A
      ● You're very welcome!
      ● Twitter
          ● @arighi
          ● #bem2013