Live-Updating Xen
Amit Shah <aams@amazon.com>
David Woodhouse <dwmw@amazon.com>
10 July 2019 © 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved.
Live Update
• Update the running hypervisor with a new build
• Gracefully transfer running guests to the new Xen
• Guests may only notice a small pause
Why Do This
• AWS operates a large fleet of hosts
– Not much opportunity to reboot
● Long-running guests
– Operationally, we need to be ready to fix customer pain
• Roll out fixes
– Bug fixes
– Security fixes
• Bring new features
• Maintenance
– Reduce number of hypervisor versions needing support
• Development
– Reduce development time through faster testing and prototyping
Existing Techniques
• Live Patching
– Works well, operationally proven
– Requires backporting to multiple supported hypervisor versions
– Effort required increases with patch complexity
– Recurring work for each livepatch
• Live Migration
– Guest workload-dependent
– Not applicable for all device models in use
Live Update
• Currently restricting to minor version updates
– e.g. 4.11.1 → 4.11.2
• Considering just hypervisor updates
– Dom0, userspace, etc., out of scope for now
• Most of this talk is a request for comments
– The general idea and design are presented here
– Prototyping of these ideas started recently
– The design is deliberately fluid, to incorporate feedback and various use cases
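The minor-version restriction on this slide can be pictured as a small gate that runs before an update starts. This is only a sketch: the function names are hypothetical, it assumes three-part dotted version strings as in the 4.11.1 → 4.11.2 example, and it only checks that both builds belong to the same release series. The slides do not say whether downgrades are allowed, so that is left open here.

```python
def parse_version(text):
    """Split a dotted version such as "4.11.1" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_supported_live_update(running, target):
    """Hypothetical policy check: both builds must share major.minor."""
    run, tgt = parse_version(running), parse_version(target)
    if len(run) != 3 or len(tgt) != 3:
        return False
    # Only the point release (third component) may differ.
    return run[:2] == tgt[:2]
```

With this check, `is_supported_live_update("4.11.1", "4.11.2")` is accepted, while an update from 4.11.2 to 4.12.0 is rejected.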
Terminology
• Running Xen
– Current hypervisor on a host
– The “source” in the live-update operation
• Target Xen
– New build of hypervisor
– The “target” in the live-update operation
General Idea
• Load Target Xen in memory,
• Initiate Live-Update
– Pause all domains,
– Mask interrupts for domains,
– Serialize domain states,
– Serialize Running Xen state,
– Jump to Target Xen,
• Target Xen takes over
– Deserialize state,
– Unpause domains,
– Unmask interrupts
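The ordering of the steps above can be sketched as a toy model. Everything here is an assumption made for illustration: domains are plain dictionaries, JSON stands in for whatever serialization format the real design ends up using, and the "jump" is simply passing the two blobs from one function to the other. The sketch also ignores domains that were already paused before the update.

```python
import json


def running_xen_handover(domains):
    """Running Xen side: pause, mask, serialize, then hand over."""
    for dom in domains:
        dom["paused"] = True
    for dom in domains:
        dom["irqs_masked"] = True
    domain_blob = json.dumps(domains)                # serialize domain states
    xen_blob = json.dumps({"role": "running-xen"})   # serialize hypervisor state
    return domain_blob, xen_blob                     # "jump to Target Xen"


def target_xen_takeover(domain_blob, xen_blob):
    """Target Xen side: deserialize, unpause, unmask."""
    domains = json.loads(domain_blob)
    json.loads(xen_blob)  # hypervisor state would be consumed here
    for dom in domains:
        dom["paused"] = False
    for dom in domains:
        dom["irqs_masked"] = False
    return domains
```

The point of the model is the order: nothing is serialized until every domain is paused and masked, and nothing is unmasked until the state has been restored.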
Load Target Xen in Memory
• crashkernel area and kexec
• Load new Xen binary in crashkernel region
– kexec -l
• To load currently, we have to:
– $ zcat /boot/xen-4.12.gz > xen-4.12
– $ echo -en '\x3' | dd of=xen-4.12 bs=1 seek=16 conv=notrunc
– $ kexec -l xen-4.12 --append="..." --module "/boot/vmlinuz ..." --module /boot/initramfs -d --mem-min=0x2000000
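The `dd` step patches a single byte of the decompressed image in place. A minimal Python equivalent is sketched below, assuming the `echo` is meant to emit one byte with value 3. Offset 16 is where the `e_type` field of an ELF header begins, and 3 is the value of `ET_DYN`; that the intent is to make the loader treat the image as relocatable is an inference, not something the slides state. The function name is hypothetical.

```python
def patch_byte(path, offset=16, value=0x03):
    """Overwrite one byte in place, like `dd bs=1 seek=16 conv=notrunc`."""
    # "r+b" opens for update without truncating, matching conv=notrunc.
    with open(path, "r+b") as image:
        image.seek(offset)
        image.write(bytes([value]))
```

Calling `patch_byte("xen-4.12")` on the output of the `zcat` step would replace the `echo | dd` pipeline; the file length and all other bytes are left untouched.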
Solutions to Challenges in Load
• Patches merged for kexec-tools v2.0.20
– Adds multiboot2 support
– Gets us relocation support – can now load in crashkernel area
– Don’t use lowmem areas
– Can directly use the ELF binary
– From Varad Gautam
● “[PATCH 1/2] elf: Support ELF loading with relocation”
● “[PATCH 2/2] x86: Support multiboot2 images”
Jump to Target Xen
• Needs a new hypercall; `kexec -e` alone is not sufficient
– As this needs to be an atomic operation with pausing dom0
• Do not drop to Real Mode
– Start in protected mode (or, later, even long mode)
– Stop using real-mode low memory
– Patches on the list from David
● “[RFC PATCH 0/7] Clean up x86_64 boot code”
• Skip startup
Consume State in Target Xen
• Two ways to transfer state
– Pointer to memory region via kexec command line
– Multiboot module with state
• Deserialize state from Running Xen
– Xen state; domain state
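Whichever transfer route is chosen, Target Xen has to recognize and validate the state it is handed before consuming it. The sketch below shows one way to frame such a blob. The format is entirely hypothetical: the `LUXN` magic, the version field, and the CRC32 check are illustrative choices, not part of the design presented here.

```python
import struct
import zlib

MAGIC = b"LUXN"                    # hypothetical marker for a state blob
HEADER = struct.Struct("<4sHII")   # magic, format version, length, crc32


def pack_state(payload, version=1):
    """Prefix the serialized state with a small self-describing header."""
    return HEADER.pack(MAGIC, version, len(payload), zlib.crc32(payload)) + payload


def unpack_state(blob, expected_version=1):
    """Validate the header and return the payload, or raise ValueError."""
    magic, version, length, crc = HEADER.unpack_from(blob)
    if magic != MAGIC:
        raise ValueError("not a live-update state blob")
    if version != expected_version:
        raise ValueError("unsupported state format version")
    payload = blob[HEADER.size:HEADER.size + length]
    if len(payload) != length or zlib.crc32(payload) != crc:
        raise ValueError("state blob is truncated or corrupted")
    return payload
```

A version field in the header gives Target Xen a place to refuse state written in a layout it does not understand, which matters once internal structures start to change between builds.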
Persisting Guest State
• We have Live Migration
– For minor version upgrades, state changes not expected
– Just slightly different from LM: migration across time, not space
• Persist memory
• Persist domain structures
• Collect state information
– domheap, page tables, start_info, shared_info_frame
Persisting Host State
• IOMMU state
– Mask interrupts
– DMA requests continue as normal
• Memory regions
– Xen memory, domain memory spread out
– Must ensure these areas are not overwritten
● And carefully relocate Target Xen later
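The "do not overwrite" requirement amounts to an overlap check between the region where Target Xen is placed and every region that must survive the update. A minimal sketch follows, assuming regions are half-open `(start, end)` address pairs; the function names and the example addresses are hypothetical.

```python
def overlaps(a, b):
    """True if half-open ranges a=(start, end) and b=(start, end) intersect."""
    return a[0] < b[1] and b[0] < a[1]


def safe_to_load(target, preserved):
    """True if the target range touches none of the preserved ranges."""
    return not any(overlaps(target, region) for region in preserved)
```

For example, with preserved regions `[(0x1000, 0x2000), (0x8000, 0x9000)]`, a target of `(0x2000, 0x3000)` is safe because it only touches the end boundary, while `(0x1800, 0x2800)` is not.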
Prototyping in Persisting Guest State
• Ongoing work for a PV guest
• Modified `xl save` workflow to start serialization
– Skip memory scrubbing
– Allow domain destruction
– Store pointers in well-known location
• Launch new domain
– Re-use state information from previously-destroyed domain
– See if guest continues running
• Later
– extend this to Dom0
– HVM domains
– Across kexec
Things to be Aware of (1/2)
• Pause time
– Guests should notice little of this activity
– A decent estimate could be “network connections don’t time out”
● 3 TCP RTTs
– About 1-2 seconds OK to begin with
– Leaving memory pages in RAM, not initializing IOMMU, skipping startup – all help
• Interrupts could get lost
– May have to find a way to queue them and reinject
• Domain states
– Already-paused domains should remain paused
• Ordering of pausing/masking activities during setup phase
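Two of the points above, queuing interrupts for later reinjection and keeping already-paused domains paused, can be sketched together. This is a toy model under stated assumptions: the `Domain` class, the function names, and the idea of a per-domain pending queue are illustrative, not the mechanism the design has settled on.

```python
from collections import deque


class Domain:
    def __init__(self, name, paused=False):
        self.name = name
        self.paused = paused
        self.pending = deque()   # interrupts that arrived while paused
        self.delivered = []      # interrupts the guest has seen


def begin_update(domains):
    """Pause everything; return the names that were paused beforehand."""
    already_paused = {d.name for d in domains if d.paused}
    for d in domains:
        d.paused = True
    return already_paused


def raise_irq(domain, irq):
    """Queue the interrupt if the domain is paused, else deliver it."""
    if domain.paused:
        domain.pending.append(irq)
    else:
        domain.delivered.append(irq)


def finish_update(domains, already_paused):
    """Unpause only domains that were running, then reinject in order."""
    for d in domains:
        if d.name in already_paused:
            continue  # stays paused; its queue is kept for later
        d.paused = False
        while d.pending:
            d.delivered.append(d.pending.popleft())
```

Recording the pre-update pause state before touching any domain is what lets the final step tell "paused by the update" apart from "paused by the administrator".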
Things to be Aware of (2/2)
• Host Time: Target Xen re-initializes RTC
– This can be off compared to Running Xen
• Guest Timekeeping
– pvclock sync
• Internal state / struct changes
– Handling major version updates
– Can also sneak in for security fixes
– Thoughts for the future
● Static annotation in source code / compile-time warnings
• Controlling capabilities per domain
– Currently, spread out: xen cmdline, global config, domain config, compile-time
– Control feature advertisements at launch based on Running Xen capabilities
More Information
• Discussions ongoing on IRC and devel list
• Sending out RFC patches as we write them
• Design session
• Wiki page
– https://wiki.xen.org/wiki/Live-Updating_Xen
– Links to WIP trees
– JIRA board
– General status information

XPDSS19: Live-Updating Xen - Amit Shah & David Woodhouse, Amazon
