The classic PV interface gave Xen a head start in the cloud computing era. However, with the advance of hardware technology, its usage has been declining significantly. The complexity of the PV ABI has been a source of security bugs and of friction between Xen and other communities.
This talk lays out a plan to make x86 Xen configurable to support different guest types, and then to move the PV ABI into a PVH container so that PV workloads can be supported in a more secure manner.
2. Current state of affairs
PV mode: no hardware extensions needed; used in legacy
systems and useful in certain cases, such as running unikernels
and nested virtualization without vVMX or vSVM
HVM mode: needs hardware support and QEMU for
emulation, has become the mainstream Xen VM mode
PVH mode: essentially HVM without QEMU, under
development
Budapest – July 11-13, 2017 Towards a configurable and slimmer x86 hypervisor 2 / 19
3. Two (big) projects
Splitting PV and HVM code: Refactor x86 hypervisor code.
Make guest supporting code configurable via Kconfig
PV ABI in PVH container: Implement a PV ABI shim. Use it
to translate PV hypercalls into PVH ones when necessary
4. Why split PV and HVM code?
Users can pick and choose the guest interfaces
Smaller binary, smaller attack surface
Reclaim precious address space if PV is disabled, to let Xen
support >16TB host memory more easily
Improve x86 hypervisor code base
*NOT* intending to kill PV in the hypervisor
5. Xen x86 PV memory layout
6. The conceptual map of code
7. The reality
8. Current code
do_foo(...)
{
    /* ... */
    if (hvm) {
        do_foo_hvm();
        return;
    }
    /* lots of code to do foo for pv */
    return;
}
9. Current code
do_bar(...)
{
    /* ... */
    if (hvm) { /* Some code */ }
    if (pv) { /* Some code */ }
    /* lots of code for common case */
    if (hvm) { /* Some code */ }
    if (pv) { /* Some code */ }
    return;
}
10. Future code
do_baz(...)
{
    /* code for common case */
    if (hvm)
        do_baz_hvm();
    if (pv)
        do_baz_pv();
    return;
}
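The target shape above can be made concrete with a small, compilable sketch. It assumes two build-time switches, CONFIG_PV and CONFIG_HVM, of the kind Kconfig could generate; the struct vcpu fields and the step counters are hypothetical and exist only so the behaviour can be observed. When a guest type is switched off, its helper collapses to an empty stub, so the common path needs no #ifdef of its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical build-time switches; a real tree would get them from Kconfig. */
#ifndef CONFIG_PV
#define CONFIG_PV 1
#endif
#ifndef CONFIG_HVM
#define CONFIG_HVM 1
#endif

/* Illustrative vCPU: the counters only record which paths ran. */
struct vcpu {
    bool hvm, pv;
    int common_steps, hvm_steps, pv_steps;
};

#if CONFIG_HVM
/* Would live in an HVM-only source file after the split. */
static void do_baz_hvm(struct vcpu *v) { v->hvm_steps++; }
#else
/* HVM compiled out: empty stub keeps the caller unchanged. */
static void do_baz_hvm(struct vcpu *v) { (void)v; }
#endif

#if CONFIG_PV
/* Would live in a PV-only source file after the split. */
static void do_baz_pv(struct vcpu *v) { v->pv_steps++; }
#else
/* PV compiled out: empty stub keeps the caller unchanged. */
static void do_baz_pv(struct vcpu *v) { (void)v; }
#endif

/* Common code first, then one call per guest type. */
void do_baz(struct vcpu *v)
{
    assert(v);
    v->common_steps++;
    if (v->hvm)
        do_baz_hvm(v);
    if (v->pv)
        do_baz_pv(v);
}
```

Calling do_baz() on a vCPU with hvm set runs the common step and the HVM helper only. Building with -DCONFIG_PV=0 leaves do_baz() untouched while the PV body drops out of the binary.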
11. Game plan for splitting PV and HVM code
Identify all the components that need refactoring
Dom0 builder
Domain handling code
Trap handling code
Memory management code
Guest memory accessor
...
12. Game plan for splitting PV and HVM code
Coarse-grained refactoring – mostly for PV code
Move code around
Split code into manageable chunks
Do some basic cleanups:
Use better function names
Use better coding style
13. Game plan for splitting PV and HVM code
Fine-grained refactoring – for both PV and HVM code
Abstract out a set of guest interfaces
Adjust internal interfaces between components if necessary
Fix x86 common code
Make PV and HVM configurable
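One possible reading of "a set of guest interfaces" is a per-guest-type table of function pointers that common code calls through. The sketch below is an illustration under that assumption, not the interface the talk proposes: struct guest_ops, the single deliver_trap hook, the result codes and the CONFIG_PV / CONFIG_HVM switches are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical build-time switches; a real tree would get them from Kconfig. */
#ifndef CONFIG_PV
#define CONFIG_PV 1
#endif
#ifndef CONFIG_HVM
#define CONFIG_HVM 1
#endif

enum guest_type { GUEST_PV, GUEST_HVM };
enum trap_result { TRAP_BOUNCED_TO_GUEST, TRAP_INJECTED };

struct vcpu;

/* Hypothetical guest interface: one hook, to keep the sketch short. */
struct guest_ops {
    const char *name;
    enum trap_result (*deliver_trap)(struct vcpu *v, int trapnr);
};

struct vcpu {
    const struct guest_ops *ops;
    int last_trap;
};

#if CONFIG_PV
/* Simplified PV behaviour: the trap is handed back to the guest kernel. */
static enum trap_result pv_deliver_trap(struct vcpu *v, int trapnr)
{
    v->last_trap = trapnr;
    return TRAP_BOUNCED_TO_GUEST;
}
static const struct guest_ops pv_ops = { "pv", pv_deliver_trap };
#endif

#if CONFIG_HVM
/* Simplified HVM behaviour: the trap is injected as a virtual event. */
static enum trap_result hvm_deliver_trap(struct vcpu *v, int trapnr)
{
    v->last_trap = trapnr;
    return TRAP_INJECTED;
}
static const struct guest_ops hvm_ops = { "hvm", hvm_deliver_trap };
#endif

/* Returns NULL for a guest type that was not built in. */
static const struct guest_ops *guest_ops_for(enum guest_type type)
{
    switch (type) {
#if CONFIG_PV
    case GUEST_PV:  return &pv_ops;
#endif
#if CONFIG_HVM
    case GUEST_HVM: return &hvm_ops;
#endif
    default:        return NULL;
    }
}

/* Returns 0 on success, -1 if the guest type is unavailable in this build. */
int vcpu_init(struct vcpu *v, enum guest_type type)
{
    const struct guest_ops *ops = guest_ops_for(type);

    if (ops == NULL)
        return -1;
    v->ops = ops;
    v->last_trap = -1;
    return 0;
}

/* Common code: no PV or HVM knowledge, only the interface. */
enum trap_result deliver_trap(struct vcpu *v, int trapnr)
{
    assert(v->ops != NULL);
    return v->ops->deliver_trap(v, trapnr);
}
```

With this layout the common caller never tests the guest type, and a build with one type disabled simply refuses to create such a vCPU instead of carrying dead code.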
14. Why PV ABI in PVH container?
Continue to support PV in a more secure manner
Have more than 128GB worth of 32bit PV guests
15. PV ABI in PVH container
16. Game plan for PV ABI in PVH container
Build the PV shim – essentially a stripped-down Xen
hypervisor
Go through all PV hypercall handlers, categorize them into the
aforementioned groups
Further refactor PV guest supporting code: provide the "real
PV" handlers and "PV shim" handlers while sharing as much
code as possible
Change the build system to pull in the right objects
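The "real PV" versus "PV shim" handler split can be sketched as two entry points that share their validation code. Everything here is an assumption made for illustration: the handler names, the tiny port table and struct host (a stand-in for the outer hypervisor hosting the PVH container) do not come from the talk, and the hypercall groups from the slide are not reproduced. The sketch assumes only that some calls get forwarded to the host.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_PORTS 8  /* hypothetical, tiny on purpose */

/* Stand-in for the outer hypervisor that hosts the PVH container. */
struct host {
    unsigned int calls;      /* number of forwarded hypercalls */
    unsigned int last_port;  /* argument of the last forwarded call */
};

/* Per-guest state touched by the handlers. */
struct guest {
    bool pending[NR_PORTS];
    struct host *host;       /* only used by the shim handler */
};

/* Shared by both handlers: argument validation. */
static int check_port(unsigned int port)
{
    return port < NR_PORTS ? 0 : -1;
}

/* "Real PV" handler: the hypervisor owns the state and updates it directly. */
long real_pv_evtchn_send(struct guest *g, unsigned int port)
{
    if (check_port(port))
        return -1;
    g->pending[port] = true;
    return 0;
}

/* "PV shim" handler: same validation, then forward the request to the host. */
long pv_shim_evtchn_send(struct guest *g, unsigned int port)
{
    assert(g->host);
    if (check_port(port))
        return -1;
    g->host->calls++;
    g->host->last_port = port;
    return 0;
}
```

In a real build the two handlers would sit in separate objects and the build system would link exactly one of them behind the hypercall entry point; both are kept in one file here so they can be compared side by side.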
17. Game plan for PV ABI in PVH container
Adjust Xen toolstack
Construct a PVH guest while using the PV shim as "firmware"
Further improvements (open questions at the moment)
Provide mechanism to parse guest kernel inside the container,
something like pvgrub
Provide mechanism to pass-through PCI devices (if that’s still
relevant)
18. Current status
Doing coarse-grained refactoring:
Dom0 builder (done)
Domain handling code (done)
Trap handling code (done)
Memory management code (in progress)
Guest memory accessor
...
ETA: Some point in the future :)
19. Q&A