SlideShare una empresa de Scribd logo
1 de 23
Post mortem debugging in
Embedded Linux Systems
Anton Bondarenko
Senior Software Engineer/Architect
Bosch Sensortec
Topics
● Introduction
● What is post-mortem analysis?
● Why do we need post-mortem data?
● How it could be retrieved? Problems and solutions
● How it could be analyzed?
○ Crash tool
● Examples
Introduction
● 10+ years of Embedded Linux experience
● 4 years as System engineer in Sony Mobile working with
Xperia Z to Z3 generations with focus on stability
○ Major activity was post-mortem analysis using different
methods and approaches
Post-mortem analysis
Post-mortem analysis consist of different methods to
investigate over data collected at the moment system state
become unstable
Well known solutions
● GDB with coredump
Post-mortem data
Post-mortem data may include
● RAM regions
● CPUs state
● Peripherals state
RAM
Video/GFX
Shared
memory
Why do we want post-mortem data
Live debugging
● Focused on flow control
Post-mortem debugging
● State analysis
● Single instance ● Multiple processing
● Online on target ● Offline or semi-offline
● System continues to evolve ● System state is atomic
● Limited scope ● Global scope
How it could be retrieved
Important rules to follow;
● Keep critical state information unmodified
● Collect as much as possible
Collection may happen:
● With system reset, for example in bootloader
● W/o system reset, for example kdump approach
● In Hypervisor as VM dump
Bootloader dumper
Advantages:
● Small footprint
● Handle hardware cases
Disadvantages:
● Separate drivers & tools
● Require special handling for
RAM initialization
● Intermediate boot stages
First kernel
Unexpected
system
reset
ROM
bootloader
RAM
bootloader
Disk
Network
KDump
Advantages:
● “Same” kernel
● Same utils
● Direct jump
Disadvantages:
● Requires more memory
● Memory reservation
● HW failures might not work
Hypervisor
All important information controlled
by hypervisor
RAM
Video/GFX
VM1
VM0
VMM
How it could be analyzed
Main requirement - OS and CPU architecture awareness
Tool Examples
● Lauterbach TRACE32
● Red Hat Crash
Lauterbach TRACE32
● Many supported
architectures
● Requires Linux
kernel OS
awareness library
● Support scripting
with its own script
language
● Active
maintenance
● License:
Proprietary
Red Hat Crash Utility
● Many
supported
architectures
(x86, ARM,
ARM64,
MIPS)
● Using GDB
as core
library
● Native
support for
Linux kernel
OS
● Active
maintenance
● License: GPL
Crash extensions
● Native support of plugin concept
● Few available including very promising one
○ Python scripts in Crash environment (PyScript)
● Supports symbols for whole system:
kernel+modules+userspace
● Full access to OS memory
○ User space analysis in tool directly
○ JVM stack and state analysis
Linux Kernel crash
● Possible causes
○ Many different ones
● Important information
○ Access to OS memory
Linux Kernel crash (sys)
Linux Kernel crash
Linux Kernel crash (bt -l)
Linux Kernel crash
IPC issues
● Possible causes
○ Unexpected state in complex system
● Important information
○ All involved parts of memory (both kernel and userspace)
Android
App 1
Android
Framework
Manager
Android
Framework
Service
Android
App 2
Android
Framework
Manager
?
LK Deadlock
● Possible causes
○ Wrong handling of locks
● Important information
○ Access to lock memory
Watchdog
● Possible causes
○ Interrupt handling
○ Hardware errors
○ Memory corruption
● Important information
○ CPUs registers state
○ Special traces and logging
Links
● KDump examples
https://access.redhat.com/documentation/en-
us/red_hat_enterprise_linux/6/html/deployment_guide/s1-
kdump-crash
● Crash whitepaper
https://people.redhat.com/anderson/crash_whitepaper
● Crash tool main page http://people.redhat.com/anderson/
● Crash tool sources https://github.com/crash-utility/crash

Más contenido relacionado

La actualidad más candente

Linux field-update-2015
Linux field-update-2015Linux field-update-2015
Linux field-update-2015
Chris Simmonds
 
HKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
HKG15-505: Power Management interactions with OP-TEE and Trusted FirmwareHKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
HKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
Linaro
 
Csw2017 bazhaniuk exploring_yoursystemdeeper_updated
Csw2017 bazhaniuk exploring_yoursystemdeeper_updatedCsw2017 bazhaniuk exploring_yoursystemdeeper_updated
Csw2017 bazhaniuk exploring_yoursystemdeeper_updated
CanSecWest
 
Profiling your Applications using the Linux Perf Tools
Profiling your Applications using the Linux Perf ToolsProfiling your Applications using the Linux Perf Tools
Profiling your Applications using the Linux Perf Tools
emBO_Conference
 

La actualidad más candente (20)

Linux field-update-2015
Linux field-update-2015Linux field-update-2015
Linux field-update-2015
 
Diving into SWUpdate: adding new platform support in 30minutes with Yocto/OE !
Diving into SWUpdate: adding new platform support in 30minutes with Yocto/OE !Diving into SWUpdate: adding new platform support in 30minutes with Yocto/OE !
Diving into SWUpdate: adding new platform support in 30minutes with Yocto/OE !
 
HKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
HKG15-505: Power Management interactions with OP-TEE and Trusted FirmwareHKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
HKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
 
Constructions Et Batiments
Constructions Et BatimentsConstructions Et Batiments
Constructions Et Batiments
 
Linux-Internals-and-Networking
Linux-Internals-and-NetworkingLinux-Internals-and-Networking
Linux-Internals-and-Networking
 
Anatomy of the loadable kernel module (lkm)
Anatomy of the loadable kernel module (lkm)Anatomy of the loadable kernel module (lkm)
Anatomy of the loadable kernel module (lkm)
 
Calcul d'un hangar
Calcul d'un hangar Calcul d'un hangar
Calcul d'un hangar
 
46919779 se31009-murs-de-soutأ-nement
46919779 se31009-murs-de-soutأ-nement46919779 se31009-murs-de-soutأ-nement
46919779 se31009-murs-de-soutأ-nement
 
Csw2017 bazhaniuk exploring_yoursystemdeeper_updated
Csw2017 bazhaniuk exploring_yoursystemdeeper_updatedCsw2017 bazhaniuk exploring_yoursystemdeeper_updated
Csw2017 bazhaniuk exploring_yoursystemdeeper_updated
 
Kernel Recipes 2018 - Overview of SD/eMMC, their high speed modes and Linux s...
Kernel Recipes 2018 - Overview of SD/eMMC, their high speed modes and Linux s...Kernel Recipes 2018 - Overview of SD/eMMC, their high speed modes and Linux s...
Kernel Recipes 2018 - Overview of SD/eMMC, their high speed modes and Linux s...
 
Linux Initialization Process (1)
Linux Initialization Process (1)Linux Initialization Process (1)
Linux Initialization Process (1)
 
Linux Initialization Process (2)
Linux Initialization Process (2)Linux Initialization Process (2)
Linux Initialization Process (2)
 
semaphore & mutex.pdf
semaphore & mutex.pdfsemaphore & mutex.pdf
semaphore & mutex.pdf
 
Profiling your Applications using the Linux Perf Tools
Profiling your Applications using the Linux Perf ToolsProfiling your Applications using the Linux Perf Tools
Profiling your Applications using the Linux Perf Tools
 
Physical Memory Models.pdf
Physical Memory Models.pdfPhysical Memory Models.pdf
Physical Memory Models.pdf
 
LCA13: Power State Coordination Interface
LCA13: Power State Coordination InterfaceLCA13: Power State Coordination Interface
LCA13: Power State Coordination Interface
 
Tegra 186のu-boot & Linux
Tegra 186のu-boot & LinuxTegra 186のu-boot & Linux
Tegra 186のu-boot & Linux
 
Basics of boot-loader
Basics of boot-loaderBasics of boot-loader
Basics of boot-loader
 
Block I/O Layer Tracing: blktrace
Block I/O Layer Tracing: blktraceBlock I/O Layer Tracing: blktrace
Block I/O Layer Tracing: blktrace
 
Prerequisite knowledge for shared memory concurrency
Prerequisite knowledge for shared memory concurrencyPrerequisite knowledge for shared memory concurrency
Prerequisite knowledge for shared memory concurrency
 

Similar a Post Mortem Debugging in Embedded Linux Systems

LAS16-400: Mini Conference 3 AOSP (Session 1)
LAS16-400: Mini Conference 3 AOSP (Session 1)LAS16-400: Mini Conference 3 AOSP (Session 1)
LAS16-400: Mini Conference 3 AOSP (Session 1)
Linaro
 

Similar a Post Mortem Debugging in Embedded Linux Systems (20)

Windows internals Essentials
Windows internals EssentialsWindows internals Essentials
Windows internals Essentials
 
Embedded Linux on ARM
Embedded Linux on ARMEmbedded Linux on ARM
Embedded Linux on ARM
 
LAS16-209: Finished and Upcoming Projects in LMG
LAS16-209: Finished and Upcoming Projects in LMGLAS16-209: Finished and Upcoming Projects in LMG
LAS16-209: Finished and Upcoming Projects in LMG
 
Embedded platform choices
Embedded platform choicesEmbedded platform choices
Embedded platform choices
 
OpenZFS code repository
OpenZFS code repositoryOpenZFS code repository
OpenZFS code repository
 
Introduction and course Details of Embedded Linux Platform Developer Training
Introduction and course Details of Embedded Linux Platform Developer TrainingIntroduction and course Details of Embedded Linux Platform Developer Training
Introduction and course Details of Embedded Linux Platform Developer Training
 
LAS16-402: ARM Trusted Firmware – from Enterprise to Embedded
LAS16-402: ARM Trusted Firmware – from Enterprise to EmbeddedLAS16-402: ARM Trusted Firmware – from Enterprise to Embedded
LAS16-402: ARM Trusted Firmware – from Enterprise to Embedded
 
embedded-linux-120203.pdf
embedded-linux-120203.pdfembedded-linux-120203.pdf
embedded-linux-120203.pdf
 
Current and Future of Non-Volatile Memory on Linux
Current and Future of Non-Volatile Memory on LinuxCurrent and Future of Non-Volatile Memory on Linux
Current and Future of Non-Volatile Memory on Linux
 
Heterogeneous multiprocessing on androd and i.mx7
Heterogeneous multiprocessing on androd and i.mx7Heterogeneous multiprocessing on androd and i.mx7
Heterogeneous multiprocessing on androd and i.mx7
 
Hardware Detection Tool
Hardware Detection ToolHardware Detection Tool
Hardware Detection Tool
 
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese..."Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
 
Leveraging Android's Linux Heritage at AnDevCon3
Leveraging Android's Linux Heritage at AnDevCon3Leveraging Android's Linux Heritage at AnDevCon3
Leveraging Android's Linux Heritage at AnDevCon3
 
Android for Embedded Linux Developers
Android for Embedded Linux DevelopersAndroid for Embedded Linux Developers
Android for Embedded Linux Developers
 
LMG Lightning Talks - SFO17-205
LMG Lightning Talks - SFO17-205LMG Lightning Talks - SFO17-205
LMG Lightning Talks - SFO17-205
 
LAS16-400: Mini Conference 3 AOSP (Session 1)
LAS16-400: Mini Conference 3 AOSP (Session 1)LAS16-400: Mini Conference 3 AOSP (Session 1)
LAS16-400: Mini Conference 3 AOSP (Session 1)
 
Android Internals at Linaro Connect Asia 2013
Android Internals at Linaro Connect Asia 2013Android Internals at Linaro Connect Asia 2013
Android Internals at Linaro Connect Asia 2013
 
Porting Android
Porting AndroidPorting Android
Porting Android
 
Strategies for developing and deploying your embedded applications and images
Strategies for developing and deploying your embedded applications and imagesStrategies for developing and deploying your embedded applications and images
Strategies for developing and deploying your embedded applications and images
 
Bsdtw17: george neville neil: realities of dtrace on free-bsd
Bsdtw17: george neville neil: realities of dtrace on free-bsdBsdtw17: george neville neil: realities of dtrace on free-bsd
Bsdtw17: george neville neil: realities of dtrace on free-bsd
 

Más de GlobalLogic Ukraine

GlobalLogic Machine Learning Webinar “Advanced Statistical Methods for Linear...
GlobalLogic Machine Learning Webinar “Advanced Statistical Methods for Linear...GlobalLogic Machine Learning Webinar “Advanced Statistical Methods for Linear...
GlobalLogic Machine Learning Webinar “Advanced Statistical Methods for Linear...
GlobalLogic Ukraine
 

Más de GlobalLogic Ukraine (20)

GlobalLogic Embedded Community x ROS Ukraine Webinar "Surgical Robots"
GlobalLogic Embedded Community x ROS Ukraine Webinar "Surgical Robots"GlobalLogic Embedded Community x ROS Ukraine Webinar "Surgical Robots"
GlobalLogic Embedded Community x ROS Ukraine Webinar "Surgical Robots"
 
GlobalLogic Java Community Webinar #17 “SpringJDBC vs JDBC. Is Spring a Hero?”
GlobalLogic Java Community Webinar #17 “SpringJDBC vs JDBC. Is Spring a Hero?”GlobalLogic Java Community Webinar #17 “SpringJDBC vs JDBC. Is Spring a Hero?”
GlobalLogic Java Community Webinar #17 “SpringJDBC vs JDBC. Is Spring a Hero?”
 
GlobalLogic JavaScript Community Webinar #18 “Long Story Short: OSI Model”
GlobalLogic JavaScript Community Webinar #18 “Long Story Short: OSI Model”GlobalLogic JavaScript Community Webinar #18 “Long Story Short: OSI Model”
GlobalLogic JavaScript Community Webinar #18 “Long Story Short: OSI Model”
 
Штучний інтелект як допомога в навчанні, а не замінник.pptx
Штучний інтелект як допомога в навчанні, а не замінник.pptxШтучний інтелект як допомога в навчанні, а не замінник.pptx
Штучний інтелект як допомога в навчанні, а не замінник.pptx
 
Задачі AI-розробника як застосовується штучний інтелект.pptx
Задачі AI-розробника як застосовується штучний інтелект.pptxЗадачі AI-розробника як застосовується штучний інтелект.pptx
Задачі AI-розробника як застосовується штучний інтелект.pptx
 
Що треба вивчати, щоб стати розробником штучного інтелекту та нейромереж.pptx
Що треба вивчати, щоб стати розробником штучного інтелекту та нейромереж.pptxЩо треба вивчати, щоб стати розробником штучного інтелекту та нейромереж.pptx
Що треба вивчати, щоб стати розробником штучного інтелекту та нейромереж.pptx
 
GlobalLogic Java Community Webinar #16 “Zaloni’s Architecture for Data-Driven...
GlobalLogic Java Community Webinar #16 “Zaloni’s Architecture for Data-Driven...GlobalLogic Java Community Webinar #16 “Zaloni’s Architecture for Data-Driven...
GlobalLogic Java Community Webinar #16 “Zaloni’s Architecture for Data-Driven...
 
JavaScript Community Webinar #14 "Why Is Git Rebase?"
JavaScript Community Webinar #14 "Why Is Git Rebase?"JavaScript Community Webinar #14 "Why Is Git Rebase?"
JavaScript Community Webinar #14 "Why Is Git Rebase?"
 
GlobalLogic .NET Community Webinar #3 "Exploring Serverless with Azure Functi...
GlobalLogic .NET Community Webinar #3 "Exploring Serverless with Azure Functi...GlobalLogic .NET Community Webinar #3 "Exploring Serverless with Azure Functi...
GlobalLogic .NET Community Webinar #3 "Exploring Serverless with Azure Functi...
 
Страх і сила помилок - IT Inside від GlobalLogic Education
Страх і сила помилок - IT Inside від GlobalLogic EducationСтрах і сила помилок - IT Inside від GlobalLogic Education
Страх і сила помилок - IT Inside від GlobalLogic Education
 
GlobalLogic .NET Webinar #2 “Azure RBAC and Managed Identity”
GlobalLogic .NET Webinar #2 “Azure RBAC and Managed Identity”GlobalLogic .NET Webinar #2 “Azure RBAC and Managed Identity”
GlobalLogic .NET Webinar #2 “Azure RBAC and Managed Identity”
 
GlobalLogic QA Webinar “What does it take to become a Test Engineer”
GlobalLogic QA Webinar “What does it take to become a Test Engineer”GlobalLogic QA Webinar “What does it take to become a Test Engineer”
GlobalLogic QA Webinar “What does it take to become a Test Engineer”
 
“How to Secure Your Applications With a Keycloak?
“How to Secure Your Applications With a Keycloak?“How to Secure Your Applications With a Keycloak?
“How to Secure Your Applications With a Keycloak?
 
GlobalLogic Machine Learning Webinar “Advanced Statistical Methods for Linear...
GlobalLogic Machine Learning Webinar “Advanced Statistical Methods for Linear...GlobalLogic Machine Learning Webinar “Advanced Statistical Methods for Linear...
GlobalLogic Machine Learning Webinar “Advanced Statistical Methods for Linear...
 
GlobalLogic Machine Learning Webinar “Statistical learning of linear regressi...
GlobalLogic Machine Learning Webinar “Statistical learning of linear regressi...GlobalLogic Machine Learning Webinar “Statistical learning of linear regressi...
GlobalLogic Machine Learning Webinar “Statistical learning of linear regressi...
 
GlobalLogic C++ Webinar “The Minimum Knowledge to Become a C++ Developer”
GlobalLogic C++ Webinar “The Minimum Knowledge to Become a C++ Developer”GlobalLogic C++ Webinar “The Minimum Knowledge to Become a C++ Developer”
GlobalLogic C++ Webinar “The Minimum Knowledge to Become a C++ Developer”
 
Embedded Webinar #17 "Low-level Network Testing in Embedded Devices Development"
Embedded Webinar #17 "Low-level Network Testing in Embedded Devices Development"Embedded Webinar #17 "Low-level Network Testing in Embedded Devices Development"
Embedded Webinar #17 "Low-level Network Testing in Embedded Devices Development"
 
GlobalLogic Webinar "Introduction to Embedded QA"
GlobalLogic Webinar "Introduction to Embedded QA"GlobalLogic Webinar "Introduction to Embedded QA"
GlobalLogic Webinar "Introduction to Embedded QA"
 
C++ Webinar "Why Should You Learn C++ in 2021-22?"
C++ Webinar "Why Should You Learn C++ in 2021-22?"C++ Webinar "Why Should You Learn C++ in 2021-22?"
C++ Webinar "Why Should You Learn C++ in 2021-22?"
 
GlobalLogic Test Automation Live Testing Session “Android Behind UI — Testing...
GlobalLogic Test Automation Live Testing Session “Android Behind UI — Testing...GlobalLogic Test Automation Live Testing Session “Android Behind UI — Testing...
GlobalLogic Test Automation Live Testing Session “Android Behind UI — Testing...
 

Último

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Último (20)

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 

Post Mortem Debugging in Embedded Linux Systems

  • 1. Post mortem debugging in Embedded Linux Systems Anton Bondarenko Senior Software Engineer/Architect Bosch Sensortec
  • 2. Topics ● Introduction ● What is post-mortem analysis? ● Why do we need post-mortem data? ● How it could be retrieved? Problems and solutions ● How it could be analyzed? ○ Crash tool ● Examples
  • 3. Introduction ● 10+ years of Embedded Linux experience ● 4 years as System engineer in Sony Mobile working with Xperia Z to Z3 generations with focus on stability ○ Major activity was post-mortem analysis using different methods and approaches
  • 4. Post-mortem analysis Post-mortem analysis consist of different methods to investigate over data collected at the moment system state become unstable Well known solutions ● GDB with coredump
  • 5. Post-mortem data Post-mortem data may include ● RAM regions ● CPUs state ● Peripherals state RAM Video/GFX Shared memory
  • 6. Why do we want post-mortem data Live debugging ● Focused on flow control Post-mortem debugging ● State analysis ● Single instance ● Multiple processing ● Online on target ● Offline or semi-offline ● System continues to evolve ● System state is atomic ● Limited scope ● Global scope
  • 7. How it could be retrieved Important rules to follow; ● Keep critical state information unmodified ● Collect as much as possible Collection may happen: ● With system reset, for example in bootloader ● W/o system reset, for example kdump approach ● In Hypervisor as VM dump
  • 8. Bootloader dumper Advantages: ● Small footprint ● Handle hardware cases Disadvantages: ● Separate drivers & tools ● Require special handling for RAM initialization ● Intermediate boot stages First kernel Unexpected system reset ROM bootloader RAM bootloader Disk Network
  • 9. KDump Advantages: ● “Same” kernel ● Same utils ● Direct jump Disadvantages: ● Requires more memory ● Memory reservation ● HW failures might not work
  • 10. Hypervisor All important information controlled by hypervisor RAM Video/GFX VM1 VM0 VMM
  • 11. How it could be analyzed Main requirement - OS and CPU architecture awareness Tool Examples ● Lauterbach TRACE32 ● Red Hat Crash
  • 12. Lauterbach TRACE32 ● Many supported architectures ● Requires Linux kernel OS awareness library ● Support scripting with its own script language ● Active maintenance ● License: Proprietary
  • 13. Red Hat Crash Utility ● Many supported architectures (x86, ARM, ARM64, MIPS) ● Using GDB as core library ● Native support for Linux kernel OS ● Active maintenance ● License: GPL
  • 14. Crash extensions ● Native support of plugin concept ● Few available including very promising one ○ Python scripts in Crash environment (PyScript) ● Supports symbols for whole system: kernel+modules+userspace ● Full access to OS memory ○ User space analysis in tool directly ○ JVM stack and state analysis
  • 15. Linux Kernel crash ● Possible causes ○ Many different ones ● Important information ○ Access to OS memory
  • 20. IPC issues ● Possible causes ○ Unexpected state in complex system ● Important information ○ All involved parts of memory (both kernel and userspace) Android App 1 Android Framework Manager Android Framework Service Android App 2 Android Framework Manager ?
  • 21. LK Deadlock ● Possible causes ○ Wrong handling of locks ● Important information ○ Access to lock memory
  • 22. Watchdog ● Possible causes ○ Interrupt handling ○ Hardware errors ○ Memory corruption ● Important information ○ CPUs registers state ○ Special traces and logging
  • 23. Links ● KDump examples https://access.redhat.com/documentation/en- us/red_hat_enterprise_linux/6/html/deployment_guide/s1- kdump-crash ● Crash whitepaper https://people.redhat.com/anderson/crash_whitepaper ● Crash tool main page http://people.redhat.com/anderson/ ● Crash tool sources https://github.com/crash-utility/crash