2. About Aidan Finn: Technical Sales Lead at MicroWarehouse. Working in IT since 1996. MCSE & MVP (Virtual Machine). Experienced with Windows Server/Desktop, System Center, virtualisation, and IT infrastructure. Blog: http://www.aidanfinn.com Twitter: @joe_elway
3. Agenda: I don’t know what new information you’ll get from this, but at least you’ll find out what issues I’m seeing and reading about. A lot of implementation issues are due to a lack of education or documentation.
4. Assessment: “Measure twice – cut once”. How can you do virtualisation without knowing what’s required? Gut feeling is insufficient. MAP is a starting point. I keep encountering people who don’t do assessments, and strangely they have issues later on! A skipped assessment is an indicator that there will be implementation issues later. Assess for as long as possible to size accurately.
5. Design Supervision ... or lack thereof. Typical scenario: the customer divides the virtualisation project among many service providers (servers, storage, network, Hyper-V, VMM, OpsMgr, backup, etc.). The service providers cannot or will not cooperate. No one has design oversight. Things fall apart.
6. Persistent Reservations: Storage goes offline. Number required = Hosts * CSVs * Storage Channels per Host. Check with a storage expert. Beware systems like the HP P4000, where hosts have 2 channels to every node in the storage cluster. Solutions: Is the storage firmware up to date? Check the storage design: are all those CSVs required?
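The reservation arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration: the function names, the example host/CSV/node counts, and the `array_limit` figure are all hypothetical, and the real limit has to come from the storage vendor.

```python
def persistent_reservations(hosts: int, csvs: int, channels_per_host: int) -> int:
    """Number required = Hosts * CSVs * Storage Channels per Host."""
    return hosts * csvs * channels_per_host


def channels_to_every_node(storage_nodes: int, channels_per_node: int = 2) -> int:
    """Channels per host when each host has channels to every storage node."""
    return storage_nodes * channels_per_node


if __name__ == "__main__":
    # Hypothetical example: 8 hosts, 10 CSVs, a 4-node storage cluster.
    channels = channels_to_every_node(storage_nodes=4)
    required = persistent_reservations(hosts=8, csvs=10, channels_per_host=channels)
    array_limit = 512  # hypothetical figure; ask the storage vendor for the real one
    print(f"required={required}, limit={array_limit}, ok={required <= array_limit}")
```

With these made-up numbers the requirement is 8 * 10 * 8 = 640 reservations, which shows how quickly extra CSVs or extra storage nodes multiply the total.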
7. Storage Offline & Host 9e BSOD: Check the times of the BSODs against the backup schedules. If they happen at the same time as a CSV backup, check the VSS provider. If it is a hardware VSS provider: check for the latest version, and check for vendor support of CSV backup. Even with support, a hardware VSS provider can be flaky. You may have to switch to: system VSS provider, serialized backup.
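Comparing BSOD times against backup schedules is easy to script once both are written down. The sketch below assumes you have already collected the crash timestamps and the backup windows by hand; the function name, the five-minute slack, and the sample data are hypothetical.

```python
from datetime import datetime, timedelta


def crashes_during_backups(crash_times, backup_windows, slack=timedelta(minutes=5)):
    """Return (crash, window_start, window_end) for crashes inside a backup window.

    `slack` widens each window slightly to allow for clock drift and job overrun.
    """
    hits = []
    for crash in crash_times:
        for start, end in backup_windows:
            if start - slack <= crash <= end + slack:
                hits.append((crash, start, end))
                break  # one matching window per crash is enough
    return hits


if __name__ == "__main__":
    # Hypothetical data: two crashes, one nightly CSV backup window.
    crashes = [datetime(2011, 6, 1, 1, 17), datetime(2011, 6, 2, 14, 3)]
    windows = [(datetime(2011, 6, 1, 1, 0), datetime(2011, 6, 1, 2, 30))]
    for crash, start, end in crashes_during_backups(crashes, windows):
        print(f"BSOD at {crash} falls in backup window {start} to {end}")
```

If most crashes land inside a CSV backup window, that points at the VSS provider as the next thing to check.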
8. Third Party Backup & Replication: Watch out for 3rd party software storage with a DR replication feature. CSV backup will create a snapshot on the replicated volume, which will cause replication/bandwidth issues. Encountered a 3rd party backup product with “2008 R2 Hyper-V support” that had no concept of cluster & VM placement awareness.
9. Storage is Slow - Backup: Storage is unexpectedly slow: Redirected Mode. Check the CSV backup strategy. Does it really need to be hourly? Are VMs with a common backup strategy on the same CSV? Are VM VHDs placed on many CSVs? Strategy: 1 CSV : 1 backup policy. Infrequent CSV backup (nightly/weekly/monthly). Frequent in-VM data backup (hourly, half day, etc). Remember: the entire CSV goes into redirected mode.
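The "1 CSV : 1 backup policy" check can be sketched as a small audit over an inventory of VMs. The inventory format, the function name, and the VM/CSV/policy names below are all hypothetical; in practice the data would be exported from whatever management tooling is in use.

```python
from collections import defaultdict


def audit_csv_backup_layout(vms):
    """Find CSVs that carry more than one backup policy and VMs that span CSVs.

    `vms` maps a VM name to {"policy": str, "csvs": [names of CSVs holding its VHDs]}.
    """
    policies_by_csv = defaultdict(set)
    spanning_vms = []
    for name, info in vms.items():
        csvs = set(info["csvs"])
        if len(csvs) > 1:
            spanning_vms.append(name)  # a backup of this VM redirects several CSVs
        for csv in csvs:
            policies_by_csv[csv].add(info["policy"])
    mixed = {csv: sorted(p) for csv, p in policies_by_csv.items() if len(p) > 1}
    return mixed, sorted(spanning_vms)


if __name__ == "__main__":
    inventory = {  # hypothetical inventory
        "web01": {"policy": "nightly", "csvs": ["CSV1"]},
        "sql01": {"policy": "hourly", "csvs": ["CSV1", "CSV2"]},
        "file01": {"policy": "hourly", "csvs": ["CSV2"]},
    }
    mixed, spanning = audit_csv_backup_layout(inventory)
    print("CSVs with mixed policies:", mixed)
    print("VMs spread over several CSVs:", spanning)
```

Any CSV reported with mixed policies is backed up as often as its most demanding VM, and the whole CSV goes into redirected mode each time.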
10. Storage is Slow - RAID: I am seeing people go budget on their SAN disk to save money: slower disk at RAID5 for all CSVs. They find VM storage is significantly slower than the pre-P2V physical server storage. This is complicated by advanced storage concepts like disk groups. Implementers are failing to grasp that virtual requirements are the same as physical requirements.
11. Storage is Slow - VHD: Some are still advocating that Dynamic VHD is nearly as fast as Fixed VHD. True in the perfect, small, short-lived lab. Not true in the real world: fragmentation of dynamic VHD (I have been told that some storage controllers don’t deal well with the random nature of fragmented storage); rapid data growth leads to storage latency; Dynamic VHD on CSV can cause redirected I/O to grow if the VM is not on the CSV coordinator.
14. Antivirus: People are not following the guidance: http://support.microsoft.com/kb/961804 They scan CSV, VHDs, config files and processes. Lack of awareness, or the security officer told them to “or else”. VMs are corrupted or disappear: 0x800704C8, 0x80070037 or 0x800703E3. I hate AV on Hyper-V hosts: system, manual, or update errors.
15. Cluster Networking: I’ve seen companies following W2003 or SQL 2008 cluster guidance and wasting money on an extra “cluster communications” network. You really need: Parent, VM, CSV / Cluster Communications, Live Migration *, Storage 1 & Storage 2, and maybe a backup network. Cable/enable network connections one by one. Label each network connection according to its role.
16. Multi-Site Clusters That Aren’t: Scenario: a company has two offices near each other. One will be DR for the other, with a “fast” 10MB+ link. They tell the implementer that it is a single site. Hyper-V and storage clusters are implemented as a single-site cluster, but should be multi-site. Split-brain scenario when that link eventually fails. Follow best practices, e.g. file share witness in a 3rd site. Active-active sites & backup: VMs & CSVs. Redirected I/O across the WAN link!
17. Lack of Patching: Incredible number of installs with no patching, and Hyper-V is blamed: iSCSI memory leaks (pre-SP1); Intel Nehalem/Westmere 1a BSODs (pre-SP1). Still have patching to do since SP1: http://social.technet.microsoft.com/wiki/contents/articles/3150.aspx Clustering for W2008 R2 SP1: http://social.technet.microsoft.com/wiki/contents/articles/list-of-cluster-hotfixes-for-windows-server-2008-r2.aspx
18. SBS as a Guest: Increasingly common. Seeing a growing trend of networking failures. The usual suspect (KB974909) is not the solution. Fix: unknown to me! Discussed with Microsoft PFEs: disable advanced NIC features like TOE in the host and retry.
19. Linux VMs: A dynamic MAC address leads to lost network access after migration. Are integration components being kept up to date? Integration components are not updated automatically by VMM, and updating them is not quite as easy as with Windows guests. No VSS, so Linux VMs need a specialised backup strategy, and consideration when placing them on a CSV.
20. Snapshots: Most products that matter don’t support them: AD, SQL, Exchange. Beware unmerged snapshots: not immediately obvious in the GUI; over time they fill disk, slow storage, and cause app weirdness. People doing silly things: deleting an AVHD, changing a VHD.
21. NIC Teaming & Network Security: We know the official line on support. Beware NIC teaming features and VLANs being used for network security. HP NCU & promiscuous mode: see page 24 of http://h20000.www2.hp.com/bc/docs/support/SupportManual/c02784628/c02784628.pdf which recommends an NCU vNIC and Hyper-V vSwitch for each VLAN for network security.
22. System Center as a VM: Fine in theory. However: something should not monitor itself. Have seen SCVMM and OpsMgr as VMs on a production Hyper-V cluster. How do they do PRO or alert you if the host they are on has a networking issue? Maybe use a dedicated host/cluster for management VMs.
23. Windows Server VM Licensing: HUGELY common problem on clusters. Typical after P2V or on VMware sites. P2V’d OEM: OEM is tied to the original physical server. Licensing VMs with individual purchases of Standard edition: allowed to migrate once every 90 days. Licensing a 2 host cluster, 8 VMs, with 2 * Enterprise: not legal when 5+ VMs are on one host (failover).
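The failover arithmetic behind the "2 * Enterprise" example can be sketched as follows. This is a simplified illustration, not a licensing tool: it assumes each Enterprise licence assigned to a host covers up to 4 running Windows Server VMs on that host (consistent with the slide's "5+ VMs" threshold), and it assumes VMs spread evenly over the surviving hosts after a failure. The function names are hypothetical.

```python
import math

VMS_PER_ENTERPRISE_LICENSE = 4  # assumption: virtual instances covered per licence per host


def worst_case_vms_on_one_host(hosts: int, vms: int, failed_hosts: int = 1) -> int:
    """Most VMs a single host may run after `failed_hosts` hosts fail (even spread)."""
    surviving = hosts - failed_hosts
    if surviving < 1:
        raise ValueError("at least one host must survive")
    return math.ceil(vms / surviving)


def enterprise_licenses_per_host(hosts: int, vms: int, failed_hosts: int = 1) -> int:
    """Enterprise licences each host needs to stay covered in the worst case."""
    worst = worst_case_vms_on_one_host(hosts, vms, failed_hosts)
    return math.ceil(worst / VMS_PER_ENTERPRISE_LICENSE)


if __name__ == "__main__":
    hosts, vms = 2, 8  # the example from the slide
    per_host = enterprise_licenses_per_host(hosts, vms)
    print(f"{per_host} Enterprise licences per host, {per_host * hosts} in total")
```

For the 2 host, 8 VM cluster this gives 2 licences per host (4 in total), because a failover can leave all 8 VMs on one host, which is why 2 licences across the cluster is not enough.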
24. Dynamic Memory: .BIN file matches physical RAM allocation: is there enough room on disk to grow? People getting cute with applications that have configurable memory caching? Let apps work as normal. SQL Server: check for edition support (Enterprise +); set VM memory buffer to 5%. NUMA: is the performance hit caused by NUMA spanning bad enough to disable NUMA spanning? Memory leaking apps will love Dynamic Memory: default maximum = 64 GB RAM.
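The "is there enough room on disk to grow?" question can be roughed out with simple arithmetic, given that the .BIN file tracks the RAM currently allocated to the VM. The sketch below is an assumption-laden estimate: the function names and the sample VM figures are hypothetical, and it only considers .BIN growth, not VHD growth or snapshots.

```python
def bin_growth_gb(vms):
    """Extra disk needed if every VM grows from its current RAM to its maximum RAM.

    `vms` is a list of (current_ram_gb, maximum_ram_gb) pairs for one volume.
    """
    return sum(max(0, maximum - current) for current, maximum in vms)


def bin_headroom_gb(free_gb, vms):
    """Free space left on the volume after worst-case .BIN growth (negative = shortfall)."""
    return free_gb - bin_growth_gb(vms)


if __name__ == "__main__":
    # Hypothetical VMs on one CSV: (current RAM, Dynamic Memory maximum) in GB.
    vms_on_volume = [(4, 16), (8, 8), (2, 12)]
    print("worst-case .BIN growth (GB):", bin_growth_gb(vms_on_volume))
    print("headroom with 20 GB free (GB):", bin_headroom_gb(20, vms_on_volume))
```

A negative headroom means the volume could fill if every VM reached its configured maximum, which is worth checking when the 64 GB default maximum has been left in place.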
25. Snapshots: Maybe supported by the Hyper-V PG but not supported by AD, SQL, Exchange. The required shutdown/merge is not obvious in the GUI. People are finding all sorts of ways to ruin VMs, e.g. deleting a VHD.
26. Thank You! Aidan Finn MicroWarehouse Email - AidanFinn@mhw.ie Web - http://www.mwh.ie Personal Twitter - @joe_elway Blog – http://www.aidanfinn.com