Evaluation of VM live migration policies on VMware, Xen, IBM System p, and Hyper-V. Examination of the critical stages of a VM live migration policy as a state machine, and of steps to optimize and reduce service disruption time.
Live migration is the movement of a virtual machine from one physical host to another while continuously powered-up. When properly carried out, this process takes place without any noticeable effect from the point of view of the end user. Live migration allows an administrator to take a virtual machine offline for maintenance or upgrading without subjecting the system's users to downtime.
One of the most significant advantages of live migration is the fact that it facilitates proactive maintenance. If an imminent failure is suspected, the potential problem can be resolved before disruption of service occurs. Live migration can also be used for load balancing, in which work is shared among computers in order to optimize the utilization of available CPU resources.
Downtime falls into two categories: planned and unplanned. Planned downtime is the easier of the two (because it's scheduled, not a surprise) and the most common. Generally, planned downtime is for hardware servicing (adding memory or storage, or updating a BIOS) or software patching. Most people schedule this work off hours (early mornings or on weekends).
Unplanned downtime is the more difficult one, where a server is unexpectedly powered off and you want the virtual machines running on that server to automatically restart on another server without user intervention.
Resource balancing
A system does not have enough resources for the workload while another system does
Server consolidation
Allows applications to be moved from individual, stand-alone servers to consolidated servers
New system deployment
A workload running on an existing system must be migrated to a new, more powerful one.
Availability requirements
When a system requires maintenance, its hosted applications must not be stopped and can be migrated to another system.
Iterative pre-copy: during the first iteration, all pages are transferred from A to B. Subsequent iterations copy only those pages dirtied during the previous transfer phase.
Stop and copy: suspend the running OS instance at A and redirect its network traffic to B. As described earlier, CPU state and any remaining inconsistent memory pages are then transferred. At the end of this stage there is a consistent suspended copy of the VM at both A and B. The copy at A is still considered to be primary and is resumed in case of failure.
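To make the two phases concrete, here is a minimal simulated sketch in Python; the names (SimulatedVM, send_pages, live_migrate) and the thresholds are illustrative stand-ins, not any hypervisor's real API:

```python
# A minimal simulated sketch of iterative pre-copy followed by
# stop-and-copy. All names here are illustrative, not a real API.

class SimulatedVM:
    def __init__(self, num_pages):
        self.pages = set(range(num_pages))
        self.dirty = set(self.pages)          # every page starts dirty
        self.running = True

def send_pages(vm, pages, target):
    target.update(pages)                      # stand-in for the network copy
    vm.dirty -= pages

def live_migrate(vm, target, workload=None, max_rounds=30, stop_threshold=4):
    # Iterative pre-copy: round 1 sends everything; later rounds resend
    # only the pages dirtied during the previous round.
    for _ in range(max_rounds):
        send_pages(vm, set(vm.dirty), target)
        if workload:
            workload(vm)                      # guest keeps writing meanwhile
        if len(vm.dirty) <= stop_threshold:
            break
    # Stop-and-copy: suspend so nothing can be dirtied again, then send
    # CPU state (elided here) plus the last inconsistent pages.
    vm.running = False
    send_pages(vm, set(vm.dirty), target)
    # The suspended copy at the source stays primary until the target resumes.

vm, received = SimulatedVM(1024), set()
live_migrate(vm, received, workload=lambda v: v.dirty.update({1, 2, 3}))
assert received == vm.pages                   # target holds a full image
```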
Source validation and destination validation are the steps that make sure the migration can happen successfully. For instance, we have to check what is inside the source machine we want to migrate and what the source configuration is, in terms of hardware all the way up to the software. On the destination, we need to check the compatibility of the destination machine's host OS and hypervisor, as well as the available resources, etc.
Block image copy is the process in which the system copies the VM's disk image from the source to the destination while the file is in use by the running VM at the source. It is worth noting that the data transferred during migration does not need to be the entire file system used by the VM. Some technologies, such as XenoServer, can use a template disk image, which allows transferring only the difference between the template and the customized disk image.
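As an illustration of the template optimization, here is a hedged Python sketch that ships only the blocks where the customized image differs from a template both sides already hold; the block size and file paths are assumptions:

```python
# Sketch of a XenoServer-style template diff: send only blocks that
# differ from a template image the destination already has.

import shutil

BLOCK = 4096  # assumed block granularity

def diff_blocks(template_path, image_path):
    """Yield (offset, data) for blocks where the image differs from the template."""
    with open(template_path, "rb") as t, open(image_path, "rb") as i:
        offset = 0
        while True:
            tb, ib = t.read(BLOCK), i.read(BLOCK)
            if not ib:
                break
            if ib != tb:                 # block was customized after cloning
                yield offset, ib
            offset += BLOCK

def apply_diff(template_path, out_path, blocks):
    """Rebuild the customized image at the destination from template + diff."""
    shutil.copyfile(template_path, out_path)
    with open(out_path, "r+b") as f:
        for offset, data in blocks:
            f.seek(offset)
            f.write(data)
```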
In the past months, we have carefully studied the migration processes of VMware, Xen, and System p. Finally, we summarized the common steps among all these top players, which can serve as a reference to help us develop our own live migration process.
[This slide is a 7-click animation. Please practice with it before presenting.]
Let’s explore how VMotion works in more detail. First there are some important configuration requirements for VMotion:
VMotion is only supported by ESX Server hosts under VirtualCenter management.
A dedicated gigabit Ethernet network segment is needed between ESX Server hosts to accommodate the rapid data transfers performed.
The ESX Server hosts must share storage LUNs on the same SAN and the virtual disk files for the virtual machines to be migrated must be contained in those shared LUNs.
Finally, the processors on the ESX Server hosts must be of the same type. For example, VMotion from a Xeon host to an Opteron host is not supported because the processor architectures are too different.
[Click 1]
We start with virtual machine A running on host ESX01. We want to move VM A to our second host, ESX02, so that we can perform maintenance on host ESX01, but VM A has active user connections and network sessions which we want preserved.
[Click 2]
The VMotion migration is initiated from the VirtualCenter client, a VirtualCenter scheduled task, or an SDK script. The first action is to copy the VM A configuration file to host ESX02 to establish an instance of the VM on the new host. The virtual machine configuration file is simply a small text file listing the virtual machine's properties.
[Click 3]
Next, the memory image of VM A is copied to the target host. The memory image can be quite large, so the dedicated gigabit Ethernet link required by VMotion lets that copy proceed at high speed. Immediately before the VM A memory image copy begins, VMotion redirects new memory write operations on host ESX01 to a memory bitmap which will record all VM A memory updates during the course of the VMotion migration. In that way, the full memory image is read-only and static during the VMotion operation. Because the virtual disk file for VM A is stored on a VMFS-formatted SAN LUN mounted by both ESX01 and ESX02, we don’t need to transfer that potentially very large file. The multiple access feature of the VMFS file system enables this time-saving method.
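The memory-bitmap idea can be sketched as follows; this is a simulation of the bookkeeping only, with illustrative names, not the ESX implementation:

```python
# While the (now read-only) memory image streams to the target, every
# new guest write is recorded in a bitmap so only those pages need
# re-sending later. Simulated bookkeeping, not the ESX implementation.

class MigratingMemory:
    def __init__(self, num_pages):
        self.image = bytearray(num_pages)    # stand-in for guest memory
        self.migrating = False
        self.dirty_bitmap = [False] * num_pages

    def write(self, page, value):
        self.image[page] = value
        if self.migrating:
            self.dirty_bitmap[page] = True   # record the update

    def begin_migration(self):
        self.migrating = True                # full image now treated as static

    def pages_to_resend(self):
        return [p for p, d in enumerate(self.dirty_bitmap) if d]

mem = MigratingMemory(8)
mem.begin_migration()
mem.write(3, 42)                             # guest writes during the copy
assert mem.pages_to_resend() == [3]
```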
[Click 4]
Now, we suspend VM A on the source host and copy the memory bitmap to the target host. Because the bulk of VM A’s memory was copied earlier, the transfer of the memory bitmap proceeds quickly – only taking a second or two. This step is the only one in which activity is interrupted and that interruption is too short to cause connections to be dropped and is barely noticeable to users.
[Click 5]
As soon as the memory bitmap with the changes made to memory finishes copying, we resume VM A on its new home, ESX02. VMotion also sends an Address Resolution Protocol (ARP) ping packet to the production network switch to inform it that the switch port to use for VM A has changed. That preserves all the network connections to VM A. Some modified memory pages may still reside on ESX01 after VM A has resumed. When VM A needs access to those pages, VMotion will “demand page” them or transfer them as needed over to ESX02. This technique minimizes the service interruption when the memory bitmap is copied.
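For illustration, a packet like the one described can be generated with the scapy packet library; the addresses below are placeholders, and VMotion itself crafts this notification inside the VMkernel rather than via a script:

```python
# Hedged illustration of the switch-notification step: a gratuitous ARP
# announcing that VM A's MAC is now reachable via the target host's port.
# Addresses and interface are placeholders.

from scapy.all import ARP, Ether, sendp

VM_IP  = "192.0.2.10"            # documentation-range placeholder
VM_MAC = "00:50:56:aa:bb:cc"

frame = Ether(src=VM_MAC, dst="ff:ff:ff:ff:ff:ff") / ARP(
    op=2,                        # unsolicited "is-at" reply
    psrc=VM_IP, hwsrc=VM_MAC,    # claim: VM_IP is at VM_MAC
    pdst=VM_IP, hwdst="ff:ff:ff:ff:ff:ff",
)
sendp(frame, iface="eth0")       # switch relearns the port for VM_MAC
```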
[Click 6]
VMotion completes the memory image transfer by background paging the remaining memory of VM A over to target host ESX02 and does a final commit of all the modified memory pages to the full VM A memory image. Now VM A is back to using its full memory image in read/write mode.
[Click 7]
The VMotion migration is now complete, and we finish with a cleanup operation that deletes VM A from the source host, ESX01.
Depending on the size of the virtual machine’s memory, it may take several minutes to complete a VMotion migration, but the multi-staged memory transfer method employed by VMotion reduces the period of actual interruption to just a second or two, and there is no noticeable downtime for the virtual machine or its applications.
Armstrong, W. J., et al., "IBM POWER6 partition mobility: Moving virtual servers seamlessly between physical systems"
The hypervisor keeps track of the pages that need to be migrated in a dirty page table. All pages of the partition are marked as dirty at the start of the migration. Pages that have been sent are set to an effective state of read only in the PPT and marked clean. Whenever the partition attempts to write to one of the clean pages, it is intercepted by the hypervisor by means of a VPM interrupt. The hypervisor reverts that page to the dirty state. The hypervisor then makes the page writable again and returns control to the partition at the point of interruption.

The process of sending or resending pages to the destination hypervisor continues until there is sufficient partition memory state on the destination hypervisor so that the processing of the partition can be transferred to processors on the destination server and resume its operation there. The source hypervisor suspends the partition and transfers its internal processor and other necessary state to the destination hypervisor. The source hypervisor also sends the dirty page table to the destination hypervisor. The destination hypervisor receives the dirty page table and uses it to set the state of all dirty pages to an "invalid" access state. The partition is then resumed on the destination hypervisor. The source hypervisor continues sending the remaining partition page frames to the destination hypervisor, which marks them as clean upon their successful arrival.

The destination hypervisor resumes the partition with the virtual processors of the partition in VPM mode. After the partition is resumed, any attempt by the partition to access a page whose state is invalid causes a VPM interrupt, which is handled by the hypervisor. The destination hypervisor blocks the virtual processor and then makes a high-priority "demand paging" request to the source hypervisor for that page. The requested page is sent ahead of other pages that are waiting to be transferred to the destination hypervisor. When the requested page arrives, the hypervisor marks the page as "valid" and resumes the virtual processor at the point of interruption. This process continues transparently to the partition until all remaining partition pages have been transferred from the source to the destination. Once all pages are resident on the destination server, the destination hypervisor takes the virtual processors of the partition out of VPM mode.

During the period of time that the partition is in VPM mode for movement, other storage access interrupts may occur. The source or the destination hypervisor uses the VPT to analyze an interrupt and passes control to the OS interrupt handler if the interrupt is not associated with partition movement.
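The page-state bookkeeping described above can be sketched as a small state machine; the class and method names are illustrative, since the real dirty page table lives inside the POWER hypervisor:

```python
# Simulated sketch of the page states (dirty / clean / invalid / valid)
# and the VPM-interrupt paths on each side. Illustrative names only.

DIRTY, CLEAN, INVALID, VALID = "dirty", "clean", "invalid", "valid"

class SourceTracker:
    def __init__(self, num_pages):
        self.state = {p: DIRTY for p in range(num_pages)}  # all dirty at start

    def on_page_sent(self, page):
        self.state[page] = CLEAN        # sent pages become read-only and clean

    def on_guest_write(self, page):
        # VPM-interrupt path: a write to a clean page reverts it to dirty;
        # the page is then made writable again and the guest resumes.
        if self.state[page] == CLEAN:
            self.state[page] = DIRTY

class DestinationTracker:
    def __init__(self, dirty_page_table):
        # Pages still dirty at suspend time arrive here marked invalid.
        self.state = {p: (INVALID if d else CLEAN)
                      for p, d in dirty_page_table.items()}

    def on_guest_access(self, page, demand_page):
        # VPM-interrupt path: block the vCPU, demand-page from the source,
        # then mark the page valid and resume at the point of interruption.
        if self.state[page] == INVALID:
            demand_page(page)           # high-priority request to the source
            self.state[page] = VALID

src = SourceTracker(4)
src.on_page_sent(0); src.on_guest_write(0)      # page 0 sent, then re-dirtied
dst = DestinationTracker({p: s == DIRTY for p, s in src.state.items()})
dst.on_guest_access(0, demand_page=lambda p: None)
assert dst.state[0] == VALID
```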
We have a source VM and a target machine. The yellow bar indicates that the source machine is alive.
An application is running inside the machine. The green line indicates that the application is running.
An external decision to migrate the application initiates the migration process, which enters the pre-copy state. During the pre-copy phase, pages are sent in an order determined by the source/control point. Pages that have already been sent, or that are queued to be sent, can be written again during this phase, so some mechanism is needed to detect that. When the destination has received sufficient pages, the application at the source enters the stopped state: pages in the queue are still being sent, but no page can be written any more. This guarantees that the queue of pages will drain, whereas in the pre-copy phase there is no such guarantee. When the target is ready to start, a start message is sent to the destination along with a list of invalid pages, informing it which pages have not been sent and which pages were sent but written again afterwards. After receiving the message, the application starts on the destination while the delayed pages keep arriving. Once the application is alive on the destination, it enters the demand-paging phase: any attempt to access a page that is invalid on the target causes the target to issue a high-priority request asking the source to send that page in its most recent state, and these requested pages are sent ahead of the other pages waiting to be transferred.
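The send-queue behavior in the demand-paging phase can be sketched as follows; PageSendQueue and its methods are hypothetical names for illustration:

```python
# Simulated sketch: background pages stream in order, but a page the
# resumed guest faults on is requested at high priority and jumps the
# queue so it is sent ahead of the backlog.

from collections import deque

class PageSendQueue:
    def __init__(self, remaining_pages):
        self.queue = deque(remaining_pages)   # background transfer order

    def demand(self, page):
        # High-priority request from the destination: move the faulted
        # page to the front of the queue.
        if page in self.queue:
            self.queue.remove(page)
            self.queue.appendleft(page)

    def next_to_send(self):
        return self.queue.popleft() if self.queue else None

q = PageSendQueue([1, 2, 3, 4])
q.demand(3)                                   # guest faulted on page 3
assert q.next_to_send() == 3                  # sent ahead of pages 1, 2, 4
```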
Virtualization software support is installed and available on the source.
Virtualization software support is installed and available on the destination.
Source and destination hosts are connected to the same shared storage.
Source and destination hosts share a dedicated gigabit network.
Source and destination hosts share the same virtual machine network.
Check for a successful VM migration.
Source and destination hosts must be:
Part of the same datacenter.
Part of a cluster of physical hosts in a LAN environment.
Connected to the same Gigabit network.
Candidate virtual machines must not be connected to internal networks or local devices.
Connected to the same storage:
Shared storage common to both source and target ESX servers.
Must have compatible CPU models:
Source and target ESX servers must have processors of the same family.
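These prerequisites can be read as a pre-flight validation routine. The sketch below encodes the checklist with hypothetical host and VM attributes (datacenter, gigabit_networks, datastores, cpu_family, and so on), not the actual VirtualCenter object model:

```python
# Hedged pre-flight sketch of the checklist above; attribute names are
# hypothetical fields chosen for illustration.

from types import SimpleNamespace

def validate_vmotion(source, target, vm):
    errors = []
    if source.datacenter != target.datacenter:
        errors.append("hosts are not in the same datacenter")
    if not (source.gigabit_networks & target.gigabit_networks):
        errors.append("no shared Gigabit migration network")
    if vm.datastore not in (source.datastores & target.datastores):
        errors.append("VM disk is not on storage shared by both hosts")
    if source.cpu_family != target.cpu_family:
        errors.append("CPU families differ (e.g. Xeon vs. Opteron)")
    if vm.internal_networks or vm.local_devices:
        errors.append("VM uses an internal network or a local device")
    return errors                        # empty list: migration may proceed

esx01 = SimpleNamespace(datacenter="dc1", gigabit_networks={"vmotion"},
                        datastores={"san-lun1"}, cpu_family="xeon")
esx02 = SimpleNamespace(datacenter="dc1", gigabit_networks={"vmotion"},
                        datastores={"san-lun1"}, cpu_family="xeon")
vm_a  = SimpleNamespace(datastore="san-lun1", internal_networks=[],
                        local_devices=[])
assert validate_vmotion(esx01, esx02, vm_a) == []
```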
Robert Bradford, Evangelos Kotsovinos, Anja Feldmann, Harald Schiöberg (Deutsche Telekom Laboratories), "Live Wide-Area Migration of Virtual Machines Including Local Persistent State"
Ramakrishnan, K. K., et al. (AT&T Labs-Research), "Live Data Center Migration across WANs: A Robust Cooperative Context Aware Approach"
Clark, C., et al., "Live Migration of Virtual Machines"
Network redirect
Disk copy
*Required for state stored in data stores; more relevant for WAN migration
Disk Snapshot
Create disk only snapshot – disk image at a particular time to create child disk
Snapshot consolidation
Consolidation of child and parent disk snapshots
Asynchronous replication
Local and remote storage systems are allowed to diverge
Virtual Machine Disk Files are not external disks; they are files holding the virtual machine's disk contents, and they need to be moved to the other physical machine during live migration.
Disk Snapshot: Create a disk-only snapshot, i.e., the disk image at a particular point in time, to create a child disk.
REDO logs: A log of disk write activity which can then be replayed to restore the remote disk and keep it consistent with the local disk.
Snapshot consolidation: Consolidation of the child and parent disk snapshots after the migration completes.
Asynchronous Replication: Local and remote storage systems are allowed to diverge. The amount of divergence between the local and remote copies is typically bounded by either a certain amount of time or data.
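The REDO-log technique can be sketched as follows; the structures are illustrative stand-ins, not a real storage stack:

```python
# Minimal sketch of the REDO-log idea: record writes locally during the
# bulk copy, then replay them against the remote disk to reconcile it.

class RedoLog:
    def __init__(self):
        self.entries = []                     # (offset, data) write records

    def record(self, offset, data):
        self.entries.append((offset, data))

    def replay(self, remote_disk):
        # Bring the remote copy up to date with writes made while the
        # bulk copy was in flight; the copies may diverge until now.
        for offset, data in self.entries:
            remote_disk[offset] = data
        self.entries.clear()

local, remote = {0: "a", 1: "b"}, {0: "a", 1: "b"}   # disks already synced
log = RedoLog()
local[1] = "c"; log.record(1, "c")                   # write during migration
log.replay(remote)
assert remote == local
```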
Scott Trent: AIX guru (specweb wiki)
SPEC IBM representative: Alan Adamson: SWG
Virtualization efficiency: Chris Floyd and Joe Jakubowski (STG system x)
The measuring mechanism sits inside the VM, so its timekeeping is not trustworthy.
http://blog.scottlowe.org/2007/07/23/live-migration-vs-quick-migration/
Quick Migration simply saves the state of a running virtual machine (memory to disk), moves the storage connectivity from one physical server to another, and then restores the virtual machine (disk to memory). This is quick (seconds), but the time depends on how much memory needs to be written to disk and on the speed of the connectivity to the storage. For reference, a 512 MB virtual machine can be migrated from one server to another in about six seconds using 1 Gb/s iSCSI.
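A quick back-of-the-envelope check of that figure, assuming an ideal 1 Gb/s link with no protocol overhead:

```python
# 512 MB of saved state over a 1 Gb/s iSCSI link: the raw transfer alone
# accounts for most of the quoted ~6 s; suspend/resume and protocol
# overhead plausibly explain the rest.
state_bits = 512 * 2**20 * 8        # 512 MB expressed in bits
link_bps   = 1e9                    # 1 Gb/s
print(f"{state_bits / link_bps:.1f} s")   # ~4.3 s raw transfer time
```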