Post mortem debugging in
Embedded Linux Systems
Anton Bondarenko
Senior Software Engineer/Architect
Bosch Sensortec
Topics
● Introduction
● What is post-mortem analysis?
● Why do we need post-mortem data?
● How can it be retrieved? Problems and solutions
● How can it be analyzed?
○ Crash tool
● Examples
Introduction
● 10+ years of Embedded Linux experience
● 4 years as a system engineer at Sony Mobile, working on the
Xperia Z to Z3 generations with a focus on stability
○ Main activity was post-mortem analysis using different
methods and approaches
Post-mortem analysis
Post-mortem analysis consists of methods for investigating
data collected at the moment the system state becomes
unstable
Well-known solutions
● GDB with coredump
Post-mortem data
Post-mortem data may include
● RAM regions
● CPUs state
● Peripherals state
[Diagram: RAM layout with Video/GFX and shared memory regions]
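As a minimal sketch of how such data could be organised for offline analysis, a snapshot can be modelled as a set of RAM regions plus per-CPU and peripheral register state. All names here (`RamRegion`, `Snapshot`, `read_u32`) are hypothetical and not taken from any real dump format, and little-endian byte order is assumed.

```python
from dataclasses import dataclass, field


@dataclass
class RamRegion:
    """A contiguous chunk of physical memory captured in the dump (hypothetical layout)."""
    base: int    # physical start address
    data: bytes  # raw contents

    def contains(self, addr: int, size: int) -> bool:
        return self.base <= addr and addr + size <= self.base + len(self.data)


@dataclass
class Snapshot:
    """Post-mortem data: RAM regions, CPU state, peripheral state."""
    regions: list = field(default_factory=list)
    cpu_regs: dict = field(default_factory=dict)     # cpu id -> {register name: value}
    peripherals: dict = field(default_factory=dict)  # peripheral -> {register name: value}

    def read(self, addr: int, size: int) -> bytes:
        """Return `size` bytes at physical address `addr`, or raise if not captured."""
        for region in self.regions:
            if region.contains(addr, size):
                start = addr - region.base
                return region.data[start:start + size]
        raise KeyError(f"address {addr:#x} not present in dump")

    def read_u32(self, addr: int) -> int:
        # Little-endian is an assumption; a real tool must honour the target's byte order.
        return int.from_bytes(self.read(addr, 4), "little")
```

Raising on an address that was not captured, rather than returning zeros, keeps missing data distinguishable from memory that really contained zeros.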
Why do we want post-mortem data
Live debugging
● Focused on flow control
● Single instance
● Online on target
● System continues to evolve
● Limited scope
Post-mortem debugging
● State analysis
● Multiple processing
● Offline or semi-offline
● System state is atomic
● Global scope
How it can be retrieved
Important rules to follow:
● Keep critical state information unmodified
● Collect as much as possible
Collection may happen:
● With a system reset, for example in the bootloader
● Without a system reset, for example with the kdump approach
● In a hypervisor, as a VM dump
Bootloader dumper
Advantages:
● Small footprint
● Handles hardware failure cases
Disadvantages:
● Separate drivers & tools
● Requires special handling for RAM initialization
● Intermediate boot stages
[Diagram: boot flow after an unexpected system reset: first kernel, ROM bootloader, RAM bootloader, with the dump written to disk or network]
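The decision a bootloader dumper has to make early in boot can be sketched as below. This is only an illustration under stated assumptions: the marker word, its value `RUNNING_MAGIC`, and the `cold_boot` flag are hypothetical, and real platforms use their own reset-reason mechanisms.

```python
from enum import Enum

# Hypothetical marker the kernel is assumed to write into a reserved RAM word
# while it runs and to clear on an orderly shutdown.
RUNNING_MAGIC = 0x4C495645


class BootAction(Enum):
    NORMAL_BOOT = "normal boot"
    DUMP_THEN_BOOT = "dump RAM, then boot"


def decide_boot_action(marker_word: int, cold_boot: bool) -> BootAction:
    """Decide whether the previous session ended in an unexpected reset.

    marker_word: value found in the reserved RAM word (hypothetical).
    cold_boot:   True if power was removed, so RAM contents cannot be trusted.
    """
    if cold_boot:
        return BootAction.NORMAL_BOOT
    if marker_word == RUNNING_MAGIC:
        # The marker was never cleared: treat this as a crash and collect RAM
        # before anything reinitializes or overwrites it.
        return BootAction.DUMP_THEN_BOOT
    return BootAction.NORMAL_BOOT
```

The check has to run before any stage that clears or retrains RAM, which is why RAM initialization needs special handling in this approach.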
KDump
Advantages:
● “Same” kernel
● Same utils
● Direct jump
Disadvantages:
● Requires more memory
● Memory reservation
● May not work after hardware failures
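Memory for the capture kernel is reserved with the `crashkernel=` kernel command-line parameter. The sketch below parses only its simple `size[KMG][@offset[KMG]]` form to show what the reservation costs; the function name `parse_crashkernel` is hypothetical, and range-based or other extended forms are deliberately not handled.

```python
import re

_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
_SIMPLE = re.compile(r"^(\d+)([KMGkmg]?)(?:@(0[xX][0-9a-fA-F]+|\d+)([KMGkmg]?))?$")


def _to_int(text: str) -> int:
    # Offsets may be written in hex (0x...) or decimal.
    return int(text, 16) if text.lower().startswith("0x") else int(text, 10)


def parse_crashkernel(cmdline: str):
    """Return (size_bytes, offset_bytes_or_None) for a simple crashkernel= entry.

    Returns None if the parameter is absent or uses a form this sketch
    does not understand.
    """
    for token in cmdline.split():
        if not token.startswith("crashkernel="):
            continue
        match = _SIMPLE.match(token[len("crashkernel="):])
        if match is None:
            return None
        size = int(match.group(1)) * _UNITS[match.group(2).upper()]
        offset = None
        if match.group(3) is not None:
            offset = _to_int(match.group(3)) * _UNITS[match.group(4).upper()]
        return size, offset
    return None
```

For example, `parse_crashkernel("console=ttyS0 crashkernel=64M@16M")` reports a 64 MiB reservation at a 16 MiB offset, memory that the first kernel can no longer use.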
Hypervisor
All important information is controlled
by the hypervisor
[Diagram: VMM with VM0 and VM1, RAM and Video/GFX memory]
How it can be analyzed
Main requirement - OS and CPU architecture awareness
Tool Examples
● Lauterbach TRACE32
● Red Hat Crash
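For dumps stored in ELF core format (which is what kdump exposes as /proc/vmcore), the first step towards architecture awareness is reading the ELF header. The sketch below uses field offsets and constants from the ELF specification; the function name `identify_dump` and the short `MACHINES` table are illustrative only.

```python
import struct

ET_CORE = 4  # e_type value for core files
MACHINES = {3: "x86", 8: "MIPS", 40: "ARM", 62: "x86_64", 183: "ARM64"}


def identify_dump(header: bytes) -> dict:
    """Read the ELF header fields that tell a tool which target it is looking at."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}.get(header[4])      # EI_CLASS
    endian = {1: "<", 2: ">"}.get(header[5])  # EI_DATA
    if bits is None or endian is None:
        raise ValueError("invalid ELF class or data encoding")
    # e_type and e_machine are two 16-bit fields right after the 16-byte e_ident.
    e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    return {
        "bits": bits,
        "little_endian": endian == "<",
        "is_core": e_type == ET_CORE,
        "machine": MACHINES.get(e_machine, f"unknown ({e_machine})"),
    }
```

Word size, byte order and machine type decide how every later structure in the dump must be decoded, so a tool without this awareness cannot interpret kernel data at all.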
Lauterbach TRACE32
● Many supported architectures
● Requires Linux kernel OS awareness library
● Supports scripting with its own script language
● Active maintenance
● License: Proprietary
Red Hat Crash Utility
● Many supported architectures (x86, ARM, ARM64, MIPS)
● Uses GDB as its core library
● Native support for Linux kernel OS
● Active maintenance
● License: GPL
Crash extensions
● Native support for the plugin concept
● A few are available, including a very promising one
○ Python scripts in Crash environment (PyScript)
● Supports symbols for the whole system: kernel + modules + userspace
● Full access to OS memory
○ User-space analysis directly in the tool
○ JVM stack and state analysis
Linux Kernel crash
● Possible causes
○ Many different ones
● Important information
○ Access to OS memory
Linux Kernel crash (sys)
[Screenshot: output of the crash "sys" command]
Linux Kernel crash (bt -l)
[Screenshot: output of the crash "bt -l" command]
IPC issues
● Possible causes
○ Unexpected state in complex system
● Important information
○ All involved parts of memory (both kernel and userspace)
[Diagram: Android App 1, Android Framework Manager, Android Framework Service, Android Framework Manager, Android App 2; the point of failure is marked "?"]
LK Deadlock
● Possible causes
○ Incorrect lock handling
● Important information
○ Access to lock memory
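Once lock owners and blocked tasks have been read out of lock memory, a deadlock shows up as a cycle in the wait-for relation. The sketch below assumes that this information has already been extracted into two plain dictionaries; the function name and data layout are hypothetical and do not reflect any tool's API.

```python
def find_deadlock(lock_owner: dict, waiting_for: dict):
    """Return a list of tasks forming a wait cycle, or None.

    lock_owner:  lock name -> task that currently holds it
    waiting_for: task -> lock name it is blocked on
    Both maps are assumed to have been extracted from the dump.
    """
    for start in waiting_for:
        path = []
        seen = set()
        task = start
        # Follow "task waits for lock, lock is held by task" until the chain
        # ends or revisits a task.
        while task is not None and task not in seen:
            seen.add(task)
            path.append(task)
            lock = waiting_for.get(task)
            task = lock_owner.get(lock) if lock is not None else None
        if task is not None:
            # `task` was reached twice: the cycle starts at its first occurrence.
            return path[path.index(task):]
    return None
```

With task t1 holding lock A and waiting for B, and task t2 holding B and waiting for A, the function returns both tasks as the cycle.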
Watchdog
● Possible causes
○ Interrupt handling
○ Hardware errors
○ Memory corruption
● Important information
○ CPU register state
○ Special traces and logging
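One kind of special trace that helps here is a per-CPU heartbeat timestamp kept in memory that survives the reset. Assuming such a log exists (it is a hypothetical mechanism for this sketch, as are all names below), the CPUs that stopped making progress before the watchdog fired can be picked out like this:

```python
def stalled_cpus(last_heartbeat_ms: dict, bite_time_ms: int, timeout_ms: int) -> list:
    """Return the ids of CPUs whose last heartbeat is older than the timeout.

    last_heartbeat_ms: cpu id -> timestamp of the last recorded heartbeat
    bite_time_ms:      timestamp at which the watchdog fired
    timeout_ms:        how long a CPU may stay silent before it counts as stalled
    """
    return sorted(
        cpu for cpu, stamp in last_heartbeat_ms.items()
        if bite_time_ms - stamp > timeout_ms
    )
```

The saved register state of the CPUs returned here is then the natural place to start looking.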
Links
● KDump examples
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/deployment_guide/s1-kdump-crash
● Crash whitepaper
https://people.redhat.com/anderson/crash_whitepaper
● Crash tool main page http://people.redhat.com/anderson/
● Crash tool sources https://github.com/crash-utility/crash

