Evaluation of VM live migration policies on VMware, Xen, IBM System p, and Hyper-V. Examination of the critical stages of a VM live migration policy as a state machine, and of steps to optimize and reduce service disruption time.
Live migration is the movement of a virtual machine from one physical host to another while continuously powered-up. When properly carried out, this process takes place without any noticeable effect from the point of view of the end user. Live migration allows an administrator to take a virtual machine offline for maintenance or upgrading without subjecting the system's users to downtime.
One of the most significant advantages of live migration is the fact that it facilitates proactive maintenance. If an imminent failure is suspected, the potential problem can be resolved before disruption of service occurs. Live migration can also be used for load balancing, in which work is shared among computers in order to optimize the utilization of available CPU resources.
Downtime falls into two categories: planned and unplanned. Planned downtime is the easier of the two (because it's scheduled, not a surprise) and the most common. Generally, planned downtime is for hardware servicing (adding memory or storage, or updating a BIOS) or software patching. Most people schedule this work off hours (early mornings or on weekends).
Unplanned downtime is the more difficult one, where a server is unexpectedly powered off and you want the virtual machines running on that server to automatically restart on another server without user intervention.
Resource balancing
A system does not have enough resources for the workload while another system does
Server consolidation
Allows applications to be moved from individual, stand-alone servers to consolidated servers
New system deployment
A workload running on an existing system must be migrated to a new, more powerful one.
Availability requirements
When a system requires maintenance, its hosted applications must not be stopped and can be migrated to another system.
Iterative pre-copy: during the first iteration, all pages are transferred from A to B. Subsequent iterations copy only those pages dirtied during the previous transfer phase.
Stop and copy: suspend the running OS instance at A and redirect its network traffic to B. As described earlier, CPU state and any remaining inconsistent memory pages are then transferred. At the end of this stage there is a consistent suspended copy of the VM at both A and B. The copy at A is still considered to be primary and is resumed in case of failure.
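To make the two phases concrete, here is a minimal simulated sketch in Python; the names (SimulatedVM, send_pages, live_migrate) and the thresholds are illustrative stand-ins, not any hypervisor's real API:

```python
# A minimal simulated sketch of iterative pre-copy followed by
# stop-and-copy. All names here are illustrative, not a real API.

class SimulatedVM:
    def __init__(self, num_pages):
        self.pages = set(range(num_pages))
        self.dirty = set(self.pages)          # every page starts dirty
        self.running = True

def send_pages(vm, pages, target):
    target.update(pages)                      # stand-in for the network copy
    vm.dirty -= pages

def live_migrate(vm, target, workload=None, max_rounds=30, stop_threshold=4):
    # Iterative pre-copy: round 1 sends everything; later rounds resend
    # only the pages dirtied during the previous round.
    for _ in range(max_rounds):
        send_pages(vm, set(vm.dirty), target)
        if workload:
            workload(vm)                      # guest keeps writing meanwhile
        if len(vm.dirty) <= stop_threshold:
            break
    # Stop-and-copy: suspend so nothing can be dirtied again, then send
    # CPU state (elided here) plus the last inconsistent pages.
    vm.running = False
    send_pages(vm, set(vm.dirty), target)
    # The suspended copy at the source stays primary until the target resumes.

vm, received = SimulatedVM(1024), set()
live_migrate(vm, received, workload=lambda v: v.dirty.update({1, 2, 3}))
assert received == vm.pages                   # target holds a full image
```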
Source validation and destination validation are the steps that make sure the migration can happen successfully. For instance, we have to check what is inside the source machine we want to migrate and what the source configuration is, in terms of hardware all the way up to the software. On the destination, we need to check the compatibility of the destination machine's host OS and hypervisor, as well as the available resources, etc.
Block image copy is the process in which the system copies the VM's disk image from the source to the destination while the file is in use by the running VM at the source. It is worth noting that the data transferred during migration does not need to be the entire file system used by the VM. Some technologies, such as XenoServer, can use a template disk image, which allows transferring only the difference between the template and the customized disk image.
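As an illustration of the template optimization, here is a hedged Python sketch that ships only the blocks where the customized image differs from a template both sides already hold; the block size and file paths are assumptions:

```python
# Sketch of a XenoServer-style template diff: send only blocks that
# differ from a template image the destination already has.

import shutil

BLOCK = 4096  # assumed block granularity

def diff_blocks(template_path, image_path):
    """Yield (offset, data) for blocks where the image differs from the template."""
    with open(template_path, "rb") as t, open(image_path, "rb") as i:
        offset = 0
        while True:
            tb, ib = t.read(BLOCK), i.read(BLOCK)
            if not ib:
                break
            if ib != tb:                 # block was customized after cloning
                yield offset, ib
            offset += BLOCK

def apply_diff(template_path, out_path, blocks):
    """Rebuild the customized image at the destination from template + diff."""
    shutil.copyfile(template_path, out_path)
    with open(out_path, "r+b") as f:
        for offset, data in blocks:
            f.seek(offset)
            f.write(data)
```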
In the past months, we have carefully studied the migration processes of VMware, Xen, and System p. Finally, we summarized the common steps among all these top players, which can serve as a reference to help us develop our own live migration process.
[This slide is a 7-click animation. Please practice with it before presenting.]
Let’s explore how VMotion works in more detail. First there are some important configuration requirements for VMotion:
VMotion is only supported by ESX Server hosts under VirtualCenter management.
A dedicated gigabit Ethernet network segment is needed between ESX Server hosts to accommodate the rapid data transfers performed.
The ESX Server hosts must share storage LUNs on the same SAN and the virtual disk files for the virtual machines to be migrated must be contained in those shared LUNs.
Finally, the processors on the ESX Server hosts must be of the same type. For example, VMotion from a Xeon host to an Opteron host is not supported because the processor architectures are too different.
[Click 1]
We start with virtual machine A running on host ESX01. We want to move VM A to our second host, ESX02, so that we can perform maintenance on host ESX01, but VM A has active user connections and network sessions which we want preserved.
[Click 2]
The VMotion migration is initiated from the VirtualCenter client, a VirtualCenter scheduled task, or an SDK script. The first action is to copy the VM A configuration file to host ESX02 to establish an instance of the VM on the new host. The virtual machine configuration file is simply a small text file listing the virtual machine's properties.
[Click 3]
Next, the memory image of VM A is copied to the target host. The memory image can be quite large, so the dedicated gigabit Ethernet link required by VMotion lets that copy proceed at high speed. Immediately before the VM A memory image copy begins, VMotion redirects new memory write operations on host ESX01 to a memory bitmap which will record all VM A memory updates during the course of the VMotion migration. In that way, the full memory image is read-only and static during the VMotion operation. Because the virtual disk file for VM A is stored on a VMFS-formatted SAN LUN mounted by both ESX01 and ESX02, we don’t need to transfer that potentially very large file. The multiple access feature of the VMFS file system enables this time-saving method.
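The memory-bitmap idea can be sketched as follows; this is a simulation of the bookkeeping only, with illustrative names, not the ESX implementation:

```python
# While the (now read-only) memory image streams to the target, every
# new guest write is recorded in a bitmap so only those pages need
# re-sending later. Simulated bookkeeping, not the ESX implementation.

class MigratingMemory:
    def __init__(self, num_pages):
        self.image = bytearray(num_pages)    # stand-in for guest memory
        self.migrating = False
        self.dirty_bitmap = [False] * num_pages

    def write(self, page, value):
        self.image[page] = value
        if self.migrating:
            self.dirty_bitmap[page] = True   # record the update

    def begin_migration(self):
        self.migrating = True                # full image now treated as static

    def pages_to_resend(self):
        return [p for p, d in enumerate(self.dirty_bitmap) if d]

mem = MigratingMemory(8)
mem.begin_migration()
mem.write(3, 42)                             # guest writes during the copy
assert mem.pages_to_resend() == [3]
```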
[Click 4]
Now, we suspend VM A on the source host and copy the memory bitmap to the target host. Because the bulk of VM A’s memory was copied earlier, the transfer of the memory bitmap proceeds quickly – only taking a second or two. This step is the only one in which activity is interrupted and that interruption is too short to cause connections to be dropped and is barely noticeable to users.
[Click 5]
As soon as the memory bitmap with the changes made to memory finishes copying, we resume VM A on its new home, ESX02. VMotion also sends an Address Resolution Protocol (ARP) ping packet to the production network switch to inform it that the switch port to use for VM A has changed. That preserves all the network connections to VM A. Some modified memory pages may still reside on ESX01 after VM A has resumed. When VM A needs access to those pages, VMotion will “demand page” them or transfer them as needed over to ESX02. This technique minimizes the service interruption when the memory bitmap is copied.
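For illustration, a packet like the one described can be generated with the scapy packet library; the addresses below are placeholders, and VMotion itself crafts this notification inside the VMkernel rather than via a script:

```python
# Hedged illustration of the switch-notification step: a gratuitous ARP
# announcing that VM A's MAC is now reachable via the target host's port.
# Addresses and interface are placeholders.

from scapy.all import ARP, Ether, sendp

VM_IP  = "192.0.2.10"            # documentation-range placeholder
VM_MAC = "00:50:56:aa:bb:cc"

frame = Ether(src=VM_MAC, dst="ff:ff:ff:ff:ff:ff") / ARP(
    op=2,                        # unsolicited "is-at" reply
    psrc=VM_IP, hwsrc=VM_MAC,    # claim: VM_IP is at VM_MAC
    pdst=VM_IP, hwdst="ff:ff:ff:ff:ff:ff",
)
sendp(frame, iface="eth0")       # switch relearns the port for VM_MAC
```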
[Click 6]
VMotion completes the memory image transfer by background paging the remaining memory of VM A over to target host ESX02 and does a final commit of all the modified memory pages to the full VM A memory image. Now VM A is back to using its full memory image in read/write mode.
[Click 7]
The VMotion migration is now complete, and we finish with a cleanup operation that deletes VM A from the source host, ESX01.
Depending on the size of the virtual machine’s memory, it may take several minutes to complete a VMotion migration, but the multi-staged memory transfer method employed by VMotion reduces the period of actual interruption to just a second or two, and there is no noticeable downtime for the virtual machine or its applications.
Armstrong, W. J., et al., "IBM POWER6 partition mobility: Moving virtual servers seamlessly between physical systems"
The hypervisor keeps track of the pages that need to be migrated in a dirty page table. All pages of the partition are marked as dirty at the start of the migration. Pages that have been sent are set to an effective state of read only in the PPT and marked clean. Whenever the partition attempts to write to one of the clean pages, it is intercepted by the hypervisor by means of a VPM interrupt. The hypervisor reverts that page to the dirty state. The hypervisor then makes the page writable again and returns control to the partition at the point of interruption.

The process of sending or resending pages to the destination hypervisor continues until there is sufficient partition memory state on the destination hypervisor so that the processing of the partition can be transferred to processors on the destination server and resume its operation there. The source hypervisor suspends the partition and transfers its internal processor and other necessary state to the destination hypervisor. The source hypervisor also sends the dirty page table to the destination hypervisor. The destination hypervisor receives the dirty page table and uses it to set the state of all dirty pages to an "invalid" access state. The partition is then resumed on the destination hypervisor. The source hypervisor continues sending the remaining partition page frames to the destination hypervisor, which marks them as clean upon their successful arrival.

The destination hypervisor resumes the partition with the virtual processors of the partition in VPM mode. After the partition is resumed, any attempt by the partition to access a page whose state is invalid causes a VPM interrupt, which is handled by the hypervisor. The destination hypervisor blocks the virtual processor and then makes a high-priority "demand paging" request to the source hypervisor for that page. The requested page is sent ahead of other pages that are waiting to be transferred to the destination hypervisor. When the requested page arrives, the hypervisor marks the page as "valid" and resumes the virtual processor at the point of interruption. This process continues transparently to the partition until all remaining partition pages have been transferred from the source to the destination. Once all pages are resident on the destination server, the destination hypervisor takes the virtual processors of the partition out of VPM mode.

During the period of time that the partition is in VPM mode for movement, other storage access interrupts may occur. The source or the destination hypervisor uses the VPT to analyze an interrupt and passes control to the OS interrupt handler if the interrupt is not associated with partition movement.
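The page-state bookkeeping described above can be sketched as a small state machine; the class and method names are illustrative, since the real dirty page table lives inside the POWER hypervisor:

```python
# Simulated sketch of the page states (dirty / clean / invalid / valid)
# and the VPM-interrupt paths on each side. Illustrative names only.

DIRTY, CLEAN, INVALID, VALID = "dirty", "clean", "invalid", "valid"

class SourceTracker:
    def __init__(self, num_pages):
        self.state = {p: DIRTY for p in range(num_pages)}  # all dirty at start

    def on_page_sent(self, page):
        self.state[page] = CLEAN        # sent pages become read-only and clean

    def on_guest_write(self, page):
        # VPM-interrupt path: a write to a clean page reverts it to dirty;
        # the page is then made writable again and the guest resumes.
        if self.state[page] == CLEAN:
            self.state[page] = DIRTY

class DestinationTracker:
    def __init__(self, dirty_page_table):
        # Pages still dirty at suspend time arrive here marked invalid.
        self.state = {p: (INVALID if d else CLEAN)
                      for p, d in dirty_page_table.items()}

    def on_guest_access(self, page, demand_page):
        # VPM-interrupt path: block the vCPU, demand-page from the source,
        # then mark the page valid and resume at the point of interruption.
        if self.state[page] == INVALID:
            demand_page(page)           # high-priority request to the source
            self.state[page] = VALID

src = SourceTracker(4)
src.on_page_sent(0); src.on_guest_write(0)      # page 0 sent, then re-dirtied
dst = DestinationTracker({p: s == DIRTY for p, s in src.state.items()})
dst.on_guest_access(0, demand_page=lambda p: None)
assert dst.state[0] == VALID
```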
We have a source VM and a target machine. The yellow bar indicates that the source machine is alive.
An application is running inside the machine. The green line indicates that the application is running.
An external decision to migrate the application initiates the migration process, which enters the pre-copy state. During the pre-copy phase, pages are sent in an order determined by the source/control point. Pages that have already been sent, or that are queued to be sent, can be written again during this phase, so some mechanism is needed to detect that. When the destination has received sufficient pages, the application at the source enters the stopped state: pages in the queue are still being sent, but no page can be written any more. This guarantees that the queue of pages will drain, whereas in the pre-copy phase there is no such guarantee. When the target is ready to start, a start message is sent to the destination along with a list of invalid pages, informing it which pages have not been sent and which pages were sent but written again afterwards. After receiving the message, the application starts on the destination while the delayed pages keep arriving. Once the application is alive on the destination, it enters the demand-paging phase: any attempt to access a page that is invalid on the target causes the target to issue a high-priority request asking the source to send that page in its most recent state, and these requested pages are sent ahead of the other pages waiting to be transferred.
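The send-queue behavior in the demand-paging phase can be sketched as follows; PageSendQueue and its methods are hypothetical names for illustration:

```python
# Simulated sketch: background pages stream in order, but a page the
# resumed guest faults on is requested at high priority and jumps the
# queue so it is sent ahead of the backlog.

from collections import deque

class PageSendQueue:
    def __init__(self, remaining_pages):
        self.queue = deque(remaining_pages)   # background transfer order

    def demand(self, page):
        # High-priority request from the destination: move the faulted
        # page to the front of the queue.
        if page in self.queue:
            self.queue.remove(page)
            self.queue.appendleft(page)

    def next_to_send(self):
        return self.queue.popleft() if self.queue else None

q = PageSendQueue([1, 2, 3, 4])
q.demand(3)                                   # guest faulted on page 3
assert q.next_to_send() == 3                  # sent ahead of pages 1, 2, 4
```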
Virtualization software support is installed and available on the source.
Virtualization software support is installed and available on the destination.
Source and destination hosts are connected to the same shared storage.
Source and destination hosts share a dedicated gigabit network.
Source and destination hosts share the same virtual machine network.
Check for a successful VM migration.
Source and destination hosts must be:
Part of the same datacenter.
Part of a cluster of physical hosts in a LAN environment.
Connected to the same Gigabit network.
Candidate virtual machines must not be connected to internal networks or local devices.
Connected to the same storage:
Shared storage common to both source and target ESX servers.
Must have compatible CPU models:
Source and target ESX servers must have processors of the same family.
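These prerequisites can be read as a pre-flight validation routine. The sketch below encodes the checklist with hypothetical host and VM attributes (datacenter, gigabit_networks, datastores, cpu_family, and so on), not the actual VirtualCenter object model:

```python
# Hedged pre-flight sketch of the checklist above; attribute names are
# hypothetical fields chosen for illustration.

from types import SimpleNamespace

def validate_vmotion(source, target, vm):
    errors = []
    if source.datacenter != target.datacenter:
        errors.append("hosts are not in the same datacenter")
    if not (source.gigabit_networks & target.gigabit_networks):
        errors.append("no shared Gigabit migration network")
    if vm.datastore not in (source.datastores & target.datastores):
        errors.append("VM disk is not on storage shared by both hosts")
    if source.cpu_family != target.cpu_family:
        errors.append("CPU families differ (e.g. Xeon vs. Opteron)")
    if vm.internal_networks or vm.local_devices:
        errors.append("VM uses an internal network or a local device")
    return errors                        # empty list: migration may proceed

esx01 = SimpleNamespace(datacenter="dc1", gigabit_networks={"vmotion"},
                        datastores={"san-lun1"}, cpu_family="xeon")
esx02 = SimpleNamespace(datacenter="dc1", gigabit_networks={"vmotion"},
                        datastores={"san-lun1"}, cpu_family="xeon")
vm_a  = SimpleNamespace(datastore="san-lun1", internal_networks=[],
                        local_devices=[])
assert validate_vmotion(esx01, esx02, vm_a) == []
```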
Robert Bradford, Evangelos Kotsovinos, Anja Feldmann, Harald Schiöberg (Deutsche Telekom Laboratories), "Live Wide-Area Migration of Virtual Machines Including Local Persistent State"
Ramakrishnan, K. K., et al. (AT&T Labs-Research), "Live Data Center Migration across WANs: A Robust Cooperative Context Aware Approach"
Clark, C., et al., "Live Migration of Virtual Machines"
Network redirect
Disk copy
*Required for state stored in data stores; more relevant for WAN migration
Disk Snapshot
Create disk only snapshot – disk image at a particular time to create child disk
Snapshot consolidation
Consolidation of child and parent disk snapshots
Asynchronous replication
Local and remote storage systems are allowed to diverge
Virtual Machine Disk Files are not external disks; they are files holding the virtual machine's disk contents, and they need to be moved to the other physical machine during live migration.
Disk Snapshot: Create a disk-only snapshot, i.e., the disk image at a particular point in time, to create a child disk.
REDO logs: A log of disk write activity which can then be replayed to restore the remote disk and keep it consistent with the local disk.
Snapshot consolidation: Consolidation of the child and parent disk snapshots after the migration completes.
Asynchronous Replication: Local and remote storage systems are allowed to diverge. The amount of divergence between the local and remote copies is typically bounded by either a certain amount of time or data.
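The REDO-log technique can be sketched as follows; the structures are illustrative stand-ins, not a real storage stack:

```python
# Minimal sketch of the REDO-log idea: record writes locally during the
# bulk copy, then replay them against the remote disk to reconcile it.

class RedoLog:
    def __init__(self):
        self.entries = []                     # (offset, data) write records

    def record(self, offset, data):
        self.entries.append((offset, data))

    def replay(self, remote_disk):
        # Bring the remote copy up to date with writes made while the
        # bulk copy was in flight; the copies may diverge until now.
        for offset, data in self.entries:
            remote_disk[offset] = data
        self.entries.clear()

local, remote = {0: "a", 1: "b"}, {0: "a", 1: "b"}   # disks already synced
log = RedoLog()
local[1] = "c"; log.record(1, "c")                   # write during migration
log.replay(remote)
assert remote == local
```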
Scott Trent: AIX guru (specweb wiki)
SPEC IBM representative: Alan Adamson: SWG
Virtualization efficiency: Chris Floyd and Joe Jakubowski (STG system x)
The measuring mechanism sits inside the VM, so its timekeeping is not trustworthy.
http://blog.scottlowe.org/2007/07/23/live-migration-vs-quick-migration/
Quick Migration simply saves the state of a running virtual machine (memory to disk), moves the storage connectivity from one physical server to another, and then restores the virtual machine (disk to memory). This is quick (seconds), but the time depends on how much memory needs to be written to disk and on the speed of the connectivity to the storage. For reference, a 512 MB virtual machine can be migrated from one server to another in about six seconds using 1 Gb/s iSCSI.
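A quick back-of-the-envelope check of that figure, assuming an ideal 1 Gb/s link with no protocol overhead:

```python
# 512 MB of saved state over a 1 Gb/s iSCSI link: the raw transfer alone
# accounts for most of the quoted ~6 s; suspend/resume and protocol
# overhead plausibly explain the rest.
state_bits = 512 * 2**20 * 8        # 512 MB expressed in bits
link_bps   = 1e9                    # 1 Gb/s
print(f"{state_bits / link_bps:.1f} s")   # ~4.3 s raw transfer time
```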