Virtualizing Tier One
           Applications
              October 15, 2012
                  Varrow

                 Andrew Miller
Senior Technical Consultant, vExpert, VCP 3/4/5
     t: @andriven w: www.thinkmeta.net
Housekeeping
• If tweeting, include #varrow and maybe #vbca

• Feel free to send me commentary at @andriven

• Hours of stuff packed into a single hour so…

• No shame about content source.
Agenda
1. Top 10 Myths About Virtualizing Business-
   Critical Applications
2. Best Practices for Virtualizing Mission Critical
   Applications (courtesy of @cxi and VMware)
3. Real-world Tools
  – Confio IgniteVM
  – vCenter Operations

  Note: Varrow is 1 of 10 VBCA Competency Holders.
Top 10 Myths
     About Virtualizing
Business-Critical Applications
Myth 2: Newer applications may be "built for the cloud," but my
legacy business-critical applications are not designed to benefit
from cloud infrastructure.
Truth: Virtualization brings cloud-like benefits to existing legacy
applications by providing dynamic scalability, built-in high
availability, provisioning in minutes and automated disaster
recovery at the infrastructure level.
Myth 4: Virtualization is about cost savings, and I'm not willing to
risk the health of my business applications to save on hardware
costs.
Truth: Virtualization is not just about cost reduction. It also helps
improve application quality of service by enabling applications to
scale up or scale out on demand. The need for fully tested disaster
recovery is one of the key drivers for many organizations to
virtualize their most important applications.
[Diagram: Site A (Primary) and Site B (Recovery), each running VMware vCenter Server, Site Recovery Manager, and VMware vSphere on its own servers]
Myth 9: Virtualization can handle everything except my most I/O-
intensive applications.
Truth: vSphere features like storage and network I/O controls, for
example, allow reservations and priorities to enable policy-based
compute, network and storage resource management for business
applications.
“Oh, and one more thing…”




                                 • Link: http://tinyurl.com/bca-bundle
Best Practices
        for Virtualizing
  Mission Critical Applications
(courtesy of @cxi and VMware)
Virtualizing Tier 1 is Impossible
Why bother?
• Exchange
  –   5x to 10x Consolidation of all Exchange roles
  –   Right-size infrastructure
  –   Ensure service levels with dynamic scalability
  –   Provision Exchange Servers in minutes
  –   Simplify testing and troubleshooting with snapshots and clones
  –   Ensure availability and implement reliable disaster recovery
• SQL
  – Consolidate SQL infrastructure by 20x
  – Cut hardware and software license costs by up to 50%
  – Accelerate database delivery with on-demand provisioning and
    automated release cycles.
  – Ensure availability without the complexity of Microsoft clustering*
Who’s doing it?
• United States Navy/Marine Corps – 750,000
  mailboxes
• University of Plymouth – 40,000 mailboxes
• VMware IT – 9,000 very heavy mailboxes
• University of Texas at Brownsville – 25,000
  mailboxes
• EMC IT – 53,000 mailboxes
Virtual Exchange Start Here
• Refer to Support Policies, Recommendations and
  Best Practice Documents
• Architect for the application, not for the
  virtualization solution
• Pretend like you’re doing it physically… and Just
  do it virtually
• Defaults unless requiring optimization!
Start Simple
• Deploy VMs with similar roles on separate hosts
   – MBX VMs in same DAG should not co-locate
   – Deploy with VMFS
   – Scale up and scale out
   – Spread your CAS around
Licensing Exchange - Virtual!
• One server license is required for each running
  instance of Exchange Server 2010 – whether it is
  installed natively on a physical machine or on a
  virtual machine

• That’s pretty simple!
Configure Storage
• Review the Exchange Calculator to determine your memory, spindle and
  IOPS requirement
• Configure your storage how you would handle it physically, then present
  it to your VMs
• Size your MBX VMDK <2TB
    – Some suggest 2040GB to be on the safe side
• Take advantage of “Optimized for Virtualization” acceleration
  technologies by storage vendors
    – Storage Offloading (VAAI)
    – Per VMDK Locking
• Unlike in the physical world, most data stores host more than one VM
  so account for that IO
• Auto-tiering with small granularity (768k) can result in significant
  storage savings
Exchange Best Practices
• Do not P2V your Exchange Servers
   – Build new servers virtually and move mailboxes
• Split your roles and size their CPU/Mem on a role by
  role basis
• Analyze performance characteristics before and after if
  performing migration
• Fewer physical servers != fewer resources
Exchange Best Practices
• Size Exchange VMs to fit within NUMA nodes for best
  performance
• Do not over commit memory unless absolutely required
• Consider DAG for local site HA, and SRM for site
  resiliency/DR
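A rough sketch of the NUMA sizing check, in Python. The node layout (cores and memory per NUMA node) and the names NumaNode and fits_in_numa_node are illustrative assumptions, not values read from a real host.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NumaNode:
    cores: int       # physical cores in one NUMA node (assumed value)
    memory_gb: int   # memory local to that node (assumed value)


def fits_in_numa_node(vcpus: int, vm_memory_gb: int, node: NumaNode) -> bool:
    """Return True when the VM's vCPUs and memory both fit inside one node."""
    return vcpus <= node.cores and vm_memory_gb <= node.memory_gb


# Hypothetical two-socket host: 8 cores and 64 GB per NUMA node.
node = NumaNode(cores=8, memory_gb=64)
print(fits_in_numa_node(4, 32, node))    # small mailbox VM
print(fits_in_numa_node(12, 32, node))   # more vCPUs than one node has cores
```

A VM that fails this check can still run; it simply spans nodes, which is what the sizing advice tries to avoid.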
Get on the road to Virtual SQL
Virtual SQL Start Here
• Refer to Support Policies, Recommendations and
  Best Practice Documents
• Architect for the application, not for the
  virtualization solution
• Pretend like you’re doing it physically… and Just
  do it virtually
• Defaults unless requiring optimization!
Start Simple
• The average physical SQL Server has 2 CPUs at 6% utilization,
  3 GB of memory at 60% utilization, and ~20 IOPS
• Light workload?
   – Start with 2 vCPUs, 3 GB RAM
• Heavy workload?
   – Start with 4 vCPUs, 8 GB+ RAM
• Really heavy workload?
   – Architect as if physical in the virtual
   – Use a capacity planner tool to assist
• Remember: what’s above is for Tier 1. You can start
  smaller if you want (and it’s a good idea overall).
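The starting points above can be kept as a small lookup table. This is only a sketch: the dictionary layout and the function name starting_size are assumptions, and the numbers are the slide's starting points, not measured requirements.

```python
# Starting points from the slide; "really heavy" has no fixed numbers.
STARTING_SIZES = {
    "light": {"vcpus": 2, "memory_gb": 3},
    "heavy": {"vcpus": 4, "memory_gb": 8},  # 8 GB is a floor ("8 GB+")
}


def starting_size(workload: str) -> dict:
    """Look up an initial VM size for a SQL Server workload class."""
    if workload not in STARTING_SIZES:
        raise ValueError(
            f"No default for {workload!r}: architect as if physical "
            "and use a capacity planning tool"
        )
    # Return a copy so callers cannot modify the shared table.
    return dict(STARTING_SIZES[workload])


print(starting_size("light"))
```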
Licensing SQL - Virtual?!
• Standard, Workgroup, Enterprise per proc
   – You must license SQL for each virtual processor
• Standard, Workgroup per Server/CAL
   – You must license each virtual operating system
• Enterprise per physical proc
   – Licensing each physical processor entitles you to run any
     number of SQL server instances
• 2012 switches to per core licensing!
• Unsure? Contact licensing professionals!
Virtualized SQL is blazing fast!
Configure Storage Correctly
• Database LUN needs enough spindles
• Log LUN needs enough spindles
• Mixing sequential (logs) and random (database) can
  result in random behavior
   – Avoid mixing workloads, refer to storage vendor
• Eager-Zeroed Thick VMDK for your Database and Log
  volumes
Configure Storage Continued
• vMotion is supported with SQL Server
• Try to leverage Array Tiering and Acceleration
  technologies if possible
   – Use Array based caching to improve performance
• In most DBs, even high-IO ones, only ~10-15% of the
  database is hot; the rest is cold
   – Automatic Tiering makes for higher performance and
     higher efficiency while reducing cost
Migrating SQL
•   Analyze your existing environment
•   Perform a virtualization assessment
•   Pay attention to disk spindles not total space
•   Easy Migration: Use converter to clone server
•   Easier mgmt and provisioning: Use Templates
•   In between: Open Migrator → P2V + vRDM →
    Storage vMotion = VM with VMDKs.
    – More complicated but minimizes downtime.
Database Best Practices
•   Follow Microsoft Best Practices for SQL Server
•   Evaluate workloads for SQL-intensive ops
•   Consider Scaling Out for high end deployments
•   Defrag SQL Databases
•   Design back-end to support workload (IOPS)
•   Monitor DB/Logs for Disk r/w, Disk Queues
•   Use Fibre-channel connectivity for storage
Configuring Physical Files
• OS/App, Data, Log and TempDB on separate spindles –
  Separate LUNs on single datastore will not provide IO
  separation
• Use RAID10 or RAID5
   – Refer to your storage vendor’s best practices
• Pre-size data files, do not AUTOGROW
• Pre-size log files, ~10% of DB on average
Configuring TempDB
•   Move TempDB to dedicated LUN
•   # of TempDB files = # of CPU cores
•   All TempDB files should be equal in size
•   Pre-Allocate TempDB space for workload
•   Set file growth increment to minimize expansion events
•   Microsoft recommends FILEGROWTH incr 10%
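As a minimal sketch of the "one file per core, all equal in size" rule, the helper below splits a pre-allocated TempDB budget into equal files. The function name and the example numbers are assumptions for illustration only.

```python
import math
from typing import List


def tempdb_file_sizes_mb(cpu_cores: int, total_mb: int) -> List[int]:
    """Split a TempDB budget into one equally sized file per CPU core."""
    if cpu_cores < 1 or total_mb < 1:
        raise ValueError("cpu_cores and total_mb must be positive")
    # Round up so the files together cover at least the requested total.
    per_file = math.ceil(total_mb / cpu_cores)
    return [per_file] * cpu_cores


# Hypothetical VM with 4 cores and 8 GB set aside for TempDB.
print(tempdb_file_sizes_mb(4, 8192))
```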
FCoTR is the key to the future.
SQL Failover Clustering Best Practices
• Failover clustering is supported with caveats
   – Follow best practices guide for SQL Clustering
   – Use RDMs for DB and Log volumes
   – Use eager-zeroed thick disks
   – Use separate vSCSI controller for OS and Data
   – Use separate vSwitches for Public and Heartbeat
   – Team NICs for network redundancy
SQL Failover Clustering Best Practices
• SQL Database Mirroring (SQL 2008) or AlwaysOn
  Availability Groups (2012) can provide similar
  levels of availability as failover clusters but
  without the strict requirements or vendor
  support issues.
• Most DBs are not clustered and have no failover
  capability. Making them virtual and letting them
  take advantage of vSphere HA adds availability
  not possible with physical servers
Clusters
• Microsoft does not support migration of running virtual
  machines running cluster software.
  – Caveat*
General Best Practices - Memory
• Allocate your memory based upon your application
  workload
• Database memory doesn’t dedupe well
• Do not over subscribe mission critical workloads
• Do NOT OVER SUBSCRIBE MISSION CRITICAL
  WORKLOADS
  – Use memory reservations for mission critical SQL workloads to
    avoid memory contention issues.
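A simple way to audit the two memory rules is sketched below. The Vm record and both function names are hypothetical, and the check ignores hypervisor overhead, so treat it as a first pass only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Vm:
    name: str
    memory_gb: int        # configured memory
    reservation_gb: int   # memory reservation, 0 if none
    mission_critical: bool


def is_oversubscribed(vms: List[Vm], host_memory_gb: int) -> bool:
    """True when configured VM memory exceeds the host's physical memory."""
    return sum(vm.memory_gb for vm in vms) > host_memory_gb


def critical_without_reservation(vms: List[Vm]) -> List[str]:
    """Names of mission-critical VMs that have no memory reservation."""
    return [vm.name for vm in vms
            if vm.mission_critical and vm.reservation_gb == 0]


# Hypothetical inventory on a 64 GB host.
vms = [Vm("sql01", 32, 32, True), Vm("sql02", 24, 0, True), Vm("web01", 16, 0, False)]
print(is_oversubscribed(vms, host_memory_gb=64))
print(critical_without_reservation(vms))
```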
General Best Practices - CPU
• Only allocate vCPUs which are being used
   – Idle vCPUs will compete for system resources
• If workload is unknown, size for fewer vCPUs
   – You can always add more later if reqs demand
• For Performance Critical VMs
   – Try to ensure total number of vCPUs assigned to all
     VMs is <= total number of cores on the host
   – CPU load average of <=1. If greater, add more cpu
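The vCPU rule for performance-critical VMs reduces to one ratio. The sketch below uses made-up VM sizes and a hypothetical 16-core host; the function names are assumptions.

```python
from typing import Sequence


def vcpu_to_core_ratio(vcpus_per_vm: Sequence[int], host_cores: int) -> float:
    """Total vCPUs assigned on a host divided by its physical cores."""
    if host_cores < 1:
        raise ValueError("host_cores must be at least 1")
    return sum(vcpus_per_vm) / host_cores


def within_core_budget(vcpus_per_vm: Sequence[int], host_cores: int) -> bool:
    """True when total vCPUs <= physical cores, as the slide recommends."""
    return sum(vcpus_per_vm) <= host_cores


ratio = vcpu_to_core_ratio([4, 4, 2, 2], host_cores=16)
print(f"vCPU:core ratio = {ratio:.2f}")
```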
General Best Practices - Networking
• Separate vMotion, Logging and console traffic; or use
  VLAN tagging
• Use a paravirtualized vNIC for high performance
  workloads
• Leverage 802.1q using Virtual Switch Tagging (VST) -
  VST is most common configuration
• Follow networking design guidelines
• Do NOT use Jumbo Frames*
   – Let’s chat afterwards if questions.
Alignment
• Ensure your VMs have their disks aligned
   – Boot alignment is auto in 2008, manual in 2003
   – Application LUN is manual, follow application and
     storage vendor best practices




                            Images courtesy of Vaughn Stewart, @vStewed
Links
•   Exchange Links
     –   Microsoft Support Policies and Recommendations for Exchange Servers in Hardware Virtualization Environments
     –   Exchange 2010 on VMware - Best Practices Guide
     –   http://www.vmware.com/pdf/Virtualizing_Exchange2003.pdf
     –   http://www.vmware.com/files/pdf/solutions/08Q4_VM_Exchange_Server_2007_VI3_WP.pdf
     –   http://www.vmware.com/files/pdf/Exchange_2010_on_VMware_-_Best_Practices_Guide.pdf
     –   Microsoft Virtualization Best Practices for Exchange
     –   Policies and Recommendations for Exchange Servers in Virtualization Environments
•   Refer to this great blog series which covers Exchange and VMware
     –   http://www.clearpathsg.com/blogs/2010/07/13/exchange-2010-vsphere-4-best-practices-part-1
     –   http://www.clearpathsg.com/blogs/2010/07/29/exchange-2010-vsphere-4-best-practices-part-2
     –   http://www.clearpathsg.com/blogs/2011/01/13/exchange-2010-vsphere-4-best-practices-part-3
•   Duncan Epping
     –   http://www.yellow-bricks.com/2008/12/17/exchange-2007on-vmware/

•   SQL Links
     –   Best Practices for SQL Server with VMware
     –   Microsoft SQL Server and VMware Virtual Infrastructure Best Practices
     –   Consolidation Guidance for SQL Server
     –   Licensing SQL
     –   Alignment
Database Performance Analysis
      When Virtualized

    (aka Confio IgniteVM)
Monitoring - vSphere
   Get access to vSphere client
    • Need a user account
    • http://<machine> - provides download link
   Why should I use vSphere?
    • Standard O/S Counters may be wrong!
O/S Counter Problem




                       This is what the O/S thinks,
                       but it is based on 6GB.
                       Because of 2GB limit, the
                       correct utilization is 83%
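The arithmetic behind this slide can be sketched as follows. The 6 GB configured size, the 2 GB limit and the 83% result come from the slide; the 1.66 GB "used" figure is an assumption chosen to match that 83%, and the function name is hypothetical.

```python
def utilization_pct(used_gb: float, capacity_gb: float) -> float:
    """Memory utilization as a percentage of the given capacity."""
    return used_gb * 100 / capacity_gb


configured_gb = 6.0   # what the guest O/S believes it has
limit_gb = 2.0        # limit set on the VM
used_gb = 1.66        # assumed value, picked to reproduce the slide's 83%

guest_view = utilization_pct(used_gb, configured_gb)
actual = utilization_pct(used_gb, min(configured_gb, limit_gb))
print(f"guest O/S view: {guest_view:.0f}%, against the limit: {actual:.0f}%")
```

The gap between the two numbers is why the guest's own counters can look healthy while the VM is close to its real ceiling.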
VMware Perfmon Counters




                               Special Perfmon
                               Counters on
                               Windows VMs
Monitoring - Memory

   Primary Metric – Swapping, Ballooning
   Secondary Metrics – VM & Host Memory Utilization, VM
    Memory Reservation, VM Memory Limit
   Rules
    • If Any Swapping is occurring
        – Host needs more memory because it cannot satisfy current demands
        – Lessen demands for memory – lower reservations where possible
    • Excessive Ballooning
        – May be ok for now, but could be a pending issue
    • VM Memory Utilization High
        – May not be a problem now unless Guest O/S swapping is occurring
        – If VM is limited, may want to increase memory this VM can get
    • If Host Memory Utilization High
        – May not be a problem now if no swapping or ballooning
        – Could be a problem soon for all VMs on this host
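The memory rules can be written as a small checklist function. This is a sketch only: the slide gives no numbers for "excessive" ballooning or "high" utilization, so balloon_limit_mb and high_util_pct are assumed defaults, and the function name is hypothetical.

```python
from typing import List


def memory_findings(host_swapping: bool, ballooned_mb: float,
                    vm_util_pct: float, host_util_pct: float,
                    guest_swapping: bool = False,
                    balloon_limit_mb: float = 1024.0,   # assumed threshold
                    high_util_pct: float = 80.0) -> List[str]:  # assumed threshold
    """Apply the slide's memory rules and return the findings that fire."""
    findings = []
    if host_swapping:
        findings.append("swapping: add host memory or lower memory demand")
    if ballooned_mb > balloon_limit_mb:
        findings.append("ballooning: ok for now, possible pending issue")
    if vm_util_pct > high_util_pct:
        findings.append("vm memory high: problem now" if guest_swapping
                        else "vm memory high: watch, consider raising limit")
    if host_util_pct > high_util_pct and not host_swapping and ballooned_mb == 0:
        findings.append("host memory high: could affect all VMs soon")
    return findings


print(memory_findings(False, 0, 85.0, 90.0))
```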
CPU Metrics

   Primary Metric – VM Ready Time
   Secondary Metrics – VM CPU Utilization, Host CPU
    Utilization
   Rules
    • If VM Ready Time > 10-20%
        – If Host CPU Utilization is high => Need more CPU resources on Host
        – If Host CPU Utilization ok => VM is limited, give more CPU resources
    • If VM CPU Utilization high (sustained over 80%)
        – May not be a problem now if no ready time
        – could be a problem soon for this VM
    • If Host CPU Utilization high (sustained over 80%)
        – May not be a problem now if no ready time on any VM
        – Could be a problem soon for all VMs on this host
        – Balance VM resources better
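A sketch of the ready-time rule in code. It assumes the ready value is a millisecond summation over a 20-second real-time sample and uses the low end of the slide's 10-20% range as the default limit; both function names are hypothetical, and multi-vCPU VMs may need the value divided per vCPU.

```python
def ready_pct(ready_ms: float, interval_s: float = 20.0) -> float:
    """Convert a CPU ready summation (ms) into a percentage of the interval."""
    return ready_ms * 100 / (interval_s * 1000)


def cpu_advice(ready_percent: float, host_util_pct: float,
               ready_limit_pct: float = 10.0,
               high_util_pct: float = 80.0) -> str:
    """Apply the slide's ready-time rule and return a one-line recommendation."""
    if ready_percent > ready_limit_pct:
        if host_util_pct > high_util_pct:
            return "host is busy: add CPU resources to the host"
        return "host is ok: VM is limited, give it more CPU resources"
    return "no ready-time problem"


# Hypothetical sample: 4000 ms of ready time in one 20 s interval.
pct = ready_pct(4000)
print(pct, cpu_advice(pct, host_util_pct=90.0))
```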
Monitoring - Storage

   Primary Metrics – Host maxTotalLatency, Host Device
    Latency (by device), VM Disk Commands Aborted, VM
    Command Latency
   Secondary Metrics – Host Disk Read Rate, Host Disk Write
    Rate, VM Disk Usage Rate
   Rules
    • If Host Latency >= 20-30 ms
        –   Review Device Latencies to understand which one has latencies
        –   Review Disk Read / Write rates
        –   If Close to Storage Capacity - Overloaded Storage
        –   Otherwise - Slow Storage
    • If VM Command Latency >= 30ms only for your VM
        – Tune Disk I/O intensive processes on database
        – Are Memory / CPU issues causing I/O problems
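The storage rules form a short decision tree, sketched here. The default limits use the low end of the slide's 20-30 ms range for the host and 30 ms for the VM; the near_capacity flag and the function name are assumptions about how you would feed the check.

```python
def storage_advice(host_latency_ms: float, vm_latency_ms: float,
                   near_capacity: bool,
                   host_limit_ms: float = 20.0,
                   vm_limit_ms: float = 30.0) -> str:
    """Walk the slide's storage rules and return a one-line diagnosis."""
    if host_latency_ms >= host_limit_ms:
        # Host-wide latency: decide between overloaded and simply slow storage.
        return "overloaded storage" if near_capacity else "slow storage"
    if vm_latency_ms >= vm_limit_ms:
        # Only this VM is slow: look inside the database and the guest.
        return "tune I/O-heavy database processes; check memory/CPU pressure"
    return "latency within limits"


print(storage_advice(host_latency_ms=35.0, vm_latency_ms=40.0, near_capacity=True))
```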
Monitoring - Network

   Primary Metrics – Dropped Receive Packets, Dropped Transmit Packets
   Secondary Metrics – Network Rate
   Rules
    • If any packets are being dropped
       – Look for errors on the Host’s NIC
       – See if one NIC is getting all traffic
       – Understand which VM is causing the most traffic and reduce it
    • If Network Rate is getting close to maximum for hardware
       – Understand which VM is causing load
       – May need to get better network hardware
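The network rules can be sketched the same way. The slide does not say how close "close to maximum" is, so near_max_fraction=0.8 is an assumed default, and both function names are hypothetical.

```python
from typing import List, Sequence


def network_findings(dropped_rx: int, dropped_tx: int,
                     rate_mbps: float, link_mbps: float,
                     near_max_fraction: float = 0.8) -> List[str]:  # assumed cutoff
    """Apply the slide's network rules and return the findings that fire."""
    findings = []
    if dropped_rx > 0 or dropped_tx > 0:
        findings.append("packets dropped: check host NIC errors and per-NIC balance")
    if rate_mbps >= near_max_fraction * link_mbps:
        findings.append("near link maximum: find the busiest VM or upgrade hardware")
    return findings


def busiest_nic_share(per_nic_mbps: Sequence[float]) -> float:
    """Fraction of total traffic carried by the busiest NIC (0.0 if idle)."""
    total = sum(per_nic_mbps)
    return max(per_nic_mbps) / total if total else 0.0


print(network_findings(5, 0, rate_mbps=900.0, link_mbps=1000.0))
print(busiest_nic_share([950.0, 50.0]))
```

A share close to 1.0 on a multi-NIC team suggests one NIC is getting all the traffic.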
Layers and Annotations
• This layer shows Database Response Time Metrics
• This layer shows Database Health Metrics
• This layer shows O/S and Virtual Machine Metrics
• This layer shows Metrics for the Physical Host
• This layer shows Metrics for the Storage Layer
Tooltip: Another VM (ProdServerB) moved onto this Physical Host
Confio Software

        Award Winning Performance Tools
        Ignite8 for Oracle, SQL Server, DB2, Sybase
        IgniteVM for Databases on VMware
         • Download at www.confio.com
        Provides Answers for
         • What changed recently that affected end users
         • What layer (VM or DB) is causing the problem
         • Who should fix the problem, and how
                   Download free trial at
                     www.confio.com
vCenter Operations
4 Big Things

1.   Performance Monitoring
2.   Performance Trending
3.   Capacity Planning
4.   Root Cause Analysis
Managing Performance

•   Is it healthy?
      – Every VM & ESX performing well? CPU, RAM, Network, Disk?
      – Are they behaving expectedly?
      – Any fault on any component?
•   Is it enough?
      – Enough CPU, RAM, Network, Disk? Future risk?
      – Time remaining?
      – Capacity remaining?
      – Where are the “Stress points” in time?
•   Is it optimised?
      – Which VMs need adjustment?
      – What are my key ratios?
      – How much can I claim back from “fat” VMs?
      – How many more VMs can I put without impacting performance?
•   Is it healthy = Health
      – Workload
      – Anomalies
      – Faults
•   Is it enough = Risk
      – Time remaining
      – Capacity remaining
      – Stress period
•   Is it optimised = Efficiency
      – What can we reclaim?
      – Density. Key ratios for
          management
Threshold: Shift in Mindset
• vCenter sets “static” thresholds, which can be misleading
    – During peak, it is common for a VM to reach high utilisation.
        • A static threshold will generate alerts when it should not.
        • The vSphere admin quickly learns to ignore them, defeating the purpose of the alert to begin with.
    – During non-peak, it might be abnormal for a VM to reach even 50% utilisation.
        • A static threshold will not generate alerts when it should.
• vCenter only sets a high threshold
    – Do you set a static threshold for when CPU or RAM utilisation drops below 5%?
        • A drop in IOPS across the entire storage array might be a sign of a terrible day ahead.
    – It will not alert when these happen:
        • Utilisation drops from 75% to 1% when it should not.
        • Utilisation changes from 5% to 70% when it should not.
    – We need to plot both the upper range and the lower range
• But each VM differs. And the same VM differs depending on day/time…
    – Intelligence is required to analyse each metric and its expected “normal”
      behaviour.
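To make the idea of upper and lower ranges that vary by time of day concrete, here is a deliberately simple sketch: a per-hour band of mean plus or minus k standard deviations. This is a stand-in technique chosen for illustration, not the algorithm vCenter Operations uses; the function names, the sample history and k=3 are all assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, Iterable, Tuple


def hourly_bands(samples: Iterable[Tuple[int, float]],
                 k: float = 3.0) -> Dict[int, Tuple[float, float]]:
    """Build a (low, high) band per hour of day from (hour, value) history."""
    by_hour = defaultdict(list)
    for hour, value in samples:
        by_hour[hour].append(value)
    bands = {}
    for hour, values in by_hour.items():
        centre, spread = mean(values), pstdev(values)
        bands[hour] = (centre - k * spread, centre + k * spread)
    return bands


def is_abnormal(hour: int, value: float,
                bands: Dict[int, Tuple[float, float]]) -> bool:
    """True when the value falls outside the learned band for that hour."""
    low, high = bands[hour]
    return value < low or value > high


# Hypothetical CPU utilisation history: busy at 09:00, idle at 03:00.
history = [(9, 70), (9, 75), (9, 80), (3, 4), (3, 5), (3, 6)]
bands = hourly_bands(history)
print(is_abnormal(9, 1, bands))    # drop to 1% during the usual peak
print(is_abnormal(3, 70, bands))   # jump to 70% during the usual quiet hour
```

A single static high threshold of, say, 80% would stay silent in both of these cases.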
[Slide formula: the joint density of (P_{1,1}, P_{1,2}, \ldots, P_{m,m}) at (p_{1,1}, p_{1,2}, \ldots, p_{m,m}), written with the gamma function \Gamma(z) = \int_0^\infty t^{z-1} e^{-t}\,dt. The marginal distribution of the i-th row of J is Dirichlet, with one parameter set for i = 1, \ldots, m-1 and another for i = m.]

It is pretty difficult for a human to beat the computer in analysis of the data.
The above is one of the many algorithms applied by vCenter Operations.
Thank goodness I don’t have to explain this.
Heat Maps
Recap
•   Figures out normal – this is huge.
•   500 VMs, 50 ESX Hosts = 10,000+ Counters
•   Setup and walk away for a while.
•   Walkthrough Demo by Clint Kitson
    – http://www.youtube.com/watch?v=Z-DJuTiqKag
• Less technical but much more fun overviews
    – http://www.vmwarecloudmanagement.com/
• Great in-depth training doc up on VMware
  Communities (179 slides with notes).
    – http://communities.vmware.com/docs/DOC-18592
One last thing…

• Do you own vSphere?

• You own vCenter Operations Foundation now.
  – This covers the Health & Risk items shown.
  – VMware just redid all the vCOPs levels & pricing.
Questions?

(I’m hanging around.)

Disaster Recovery - Business & Technology Disaster Recovery - Business & Technology
Disaster Recovery - Business & Technology
 

Recently uploaded

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Recently uploaded (20)

presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 

Varrow Q4 Lunch & Learn Presentation - Virtualizing Business Critical Applications

  • 1. Virtualizing Tier One Applications October 15, 2012 Varrow Andrew Miller Senior Technical Consultant, vExpert, VCP 3/4/5 t: @andriven w: www.thinkmeta.net
  • 2. Housekeeping • If tweeting, include #varrow and maybe #vbca • Feel free to send me commentary at @andriven • Hours of stuff packed into a single hour so… • No shame about content source.
  • 3. Agenda 1. Top 10 Myths About Virtualizing Business-Critical Applications 2. Best Practices for Virtualizing Mission Critical Applications (courtesy of @cxi and VMware) 3. Real-world Tools – Confio IgniteVM – vCenter Operations Note: Varrow is 1 of 10 VBCA Competency Holders.
  • 4. Top 10 Myths About Virtualizing Business-Critical Applications
  • 5.
  • 6.
  • 7. Myth 2: Newer applications may be "built for the cloud," but my legacy business-critical applications are not designed to benefit from cloud infrastructure. Truth: Virtualization brings cloud-like benefits to existing legacy applications by providing dynamic scalability, built-in high availability, provisioning in minutes and automated disaster recovery at the infrastructure level.
  • 8.
  • 9. Myth 4: Virtualization is about cost savings, and I'm not willing to risk the health of my business applications to save on hardware costs. Truth: Virtualization is not just about cost reduction. It also helps improve application quality of service by enabling applications to scale up or scale out on demand. The need for fully tested disaster recovery is one of the key drivers for many organizations to virtualize their most important applications. Site A (Primary) Site B (Recovery) VMware Site Recovery VMware Site Recovery vCenter Server Manager vCenter Server Manager VMware vSphere VMware vSphere Servers Servers
  • 10.
  • 11.
  • 12.
  • 13.
  • 14. Myth 9: Virtualization can handle everything except my most I/O-intensive applications. Truth: vSphere features like storage and network I/O controls, for example, allow reservations and priorities to enable policy-based compute, network and storage resource management for business applications.
  • 15.
  • 16. “Oh, and one more thing…” • Link: http://tinyurl.com/bca-bundle
  • 17. Best Practices for Virtualizing Mission Critical Applications (courtesy of @cxi and VMware)
  • 18. Virtualizing Tier 1 is Impossible
  • 19. Why bother? • Exchange – 5x to 10x Consolidation of all Exchange roles – Right-size infrastructure – Ensure service levels with dynamic scalability – Provision Exchange Servers in minutes – Simplify testing and troubleshooting with snapshots and clones – Ensure availability and implement reliable disaster recovery • SQL – Consolidate SQL infrastructure by 20x – Cut hardware and software license costs by up to 50% – Accelerate database delivery with on-demand provisioning and automated release cycles. – Ensure availability without the complexity of Microsoft clustering*
  • 20.
  • 21. Who’s doing it? • United States Navy/Marine Corps – 750,000 mailboxes • University of Plymouth – 40,000 mailboxes • VMware IT – 9,000 very heavy mailboxes • University of Texas at Brownsville – 25,000 mailboxes • EMC IT – 53,000 mailboxes
  • 22. Virtual Exchange Start Here • Refer to Support Policies, Recommendations and Best Practice Documents • Architect for the application, not for the virtualization solution • Pretend like you’re doing it physically… and Just do it virtually • Defaults unless requiring optimization!
  • 23. Start Simple • Deploy VMs with similar roles on separate hosts – MBX VMs in same DAG should not co-locate – Deploy with VMFS – Scale up and scale out – Spread your CAS around
  • 24. Licensing Exchange - Virtual! • One server license is required for each running instance of Exchange Server 2010 – whether it is installed natively on a physical machine or on a virtual machine • That’s pretty simple!
  • 25. Configure Storage • Review the Exchange Calculator to determine your memory, spindle and IOPS requirement • Configure your storage how you would handle it physically, then present it to your VMs • Size your MBX VMDK <2TB – Some suggest 2040GB to be on the safe side • Take advantage of “Optimized for Virtualization” acceleration technologies by storage vendors – Storage Offloading (VAAI) – Per VMDK Locking • Unlike in the physical world, most data stores host more than one VM so account for that IO • Auto-tiering with small granularity (768k) can result in significant storage savings
  • 26. Exchange Best Practices • Do not P2V your Exchange Servers – Build new servers virtually and move mailboxes • Split your roles and size their CPU/Mem on a role by role basis • Analyze performance characteristics before and after if performing migration • Fewer physical servers != fewer resources
  • 27. Exchange Best Practices • Size Exchange VMs to fit within NUMA nodes for best performance • Do not over commit memory unless absolutely required • Consider DAG for local site HA, and SRM for site resiliency/DR
  • 28. Get on the road to Virtual SQL
  • 29. Virtual SQL Start Here • Refer to Support Policies, Recommendations and Best Practice Documents • Architect for the application, not for the virtualization solution • Pretend like you’re doing it physically… and Just do it virtually • Defaults unless requiring optimization!
  • 30. Start Simple • The average physical SQL Server uses 2 CPUs at 6% utilization, 3 GB of memory at 60% utilization, and ~20 IOPS • Light workload? – Start with 2 vCPUs, 3 GB RAM • Heavy workload? – Start with 4 vCPUs, 8 GB+ RAM • Really heavy workload? – Architect as if physical in the virtual – Use a capacity planner tool to assist • Remember: what’s above is for Tier 1. You can start smaller if you want (and it’s a good idea overall).
  • 31. Licensing SQL - Virtual?! • Standard, Workgroup, Enterprise per proc – You must license SQL for each virtual processor • Standard, Workgroup per Server/CAL – You must license each virtual operating system • Enterprise per physical proc – Licensing each physical processor entitles you to run any number of SQL server instances • 2012 switches to per core licensing! • Unsure? Contact licensing professionals!
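The two per-processor options on slide 31 come down to simple counting. The sketch below only illustrates the slide's rules for the pre-2012 model; the function name, VM names and counts are hypothetical, it ignores Server/CAL and the 2012 per-core model, and it is no substitute for a licensing professional.

```python
def per_processor_license_counts(sql_vm_vcpus, host_physical_procs):
    """Compare the two per-processor options described on the slide.

    sql_vm_vcpus: dict of hypothetical VM name -> vCPU count for SQL Server VMs
    host_physical_procs: physical processors (sockets) in the host
    """
    # Option 1: one license for each virtual processor running SQL Server.
    per_virtual = sum(sql_vm_vcpus.values())
    # Option 2 (Enterprise): license every physical processor in the host,
    # then run any number of SQL Server instances on it.
    per_physical = host_physical_procs
    fewer = "enterprise_per_physical_proc" if per_physical < per_virtual else "per_virtual_proc"
    return {
        "per_virtual_proc": per_virtual,
        "enterprise_per_physical_proc": per_physical,
        "fewer_licenses": fewer,
    }


# Three illustrative SQL VMs consolidated onto one two-socket host.
print(per_processor_license_counts({"sql01": 2, "sql02": 4, "sql03": 4}, 2))
```

License counts are not prices; the editions differ in cost, so the option with fewer licenses is not automatically the cheaper one.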
  • 32. Virtualized SQL is blazing fast!
  • 33. Configure Storage Correctly • Database LUN needs enough spindles • Log LUN needs enough spindles • Mixing sequential (logs) and random (database) can result in random behavior – Avoid mixing workloads, refer to storage vendor • Eager-Zeroed Thick VMDK for your Database and Log volumes
  • 34. Configure Storage Continued • vMotion is supported with SQL Server • Try to leverage Array Tiering and Acceleration technologies if possible – Use Array based caching to improve performance • Most DBs, even High IO ones are hot ~10-15% of the database, the rest is cold IO – Automatic Tiering makes for higher performance and higher efficiency while reducing cost
  • 35. Migrating SQL • Analyze your existing environment • Perform a virtualization assessment • Pay attention to disk spindles not total space • Easy Migration: Use converter to clone server • Easier mgmt and provisioning: Use Templates • In between: Open Migrator → P2V + vRDM → Storage vMotion = VM with vmdk’s. – More complicated but minimizes downtime.
  • 36. Database Best Practices • Follow Microsoft Best Practices for SQL Server • Evaluate workloads for SQL-intensive ops • Consider Scaling Out for high end deployments • Defrag SQL Databases • Design back-end to support workload (IOPS) • Monitor DB/Logs for Disk r/w, Disk Queues • Use Fibre-channel connectivity for storage
  • 37. Configuring Physical Files • OS/App, Data, Log and TempDB on separate spindles – Separate LUNs on single datastore will not provide IO separation • Use RAID10 or RAID5 – Refer to your storage vendors best practices • Pre-size data files, do not AUTOGROW • Pre-size log files, ~10% of DB on average
  • 38. Configuring TempDB • Move TempDB to dedicated LUN • # of TempDB files = # of CPU cores • All TempDB files should be equal in size • Pre-Allocate TempDB space for workload • Set file growth increment to minimize expand • Microsoft recommends FILEGROWTH incr 10%
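The first two TempDB rules on slide 38 (one file per CPU core, all files equal in size) can be sketched as a small planning helper. This is a minimal illustration, not a tuning tool: the function name and the file names after the first are hypothetical, and "cores" is assumed to mean the cores visible to SQL Server inside the VM.

```python
def plan_tempdb_files(cpu_cores, total_tempdb_mb):
    """Split a pre-allocated TempDB budget into one equally sized file per core."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    size_mb = total_tempdb_mb // cpu_cores  # equal size for every file
    files = []
    for i in range(cpu_cores):
        # Illustrative logical names; only the sizes and the count matter here.
        name = "tempdev" if i == 0 else "tempdev%d" % (i + 1)
        files.append({"name": name, "size_mb": size_mb})
    return files


# A 4-vCPU VM with 8 GB pre-allocated for TempDB.
for f in plan_tempdb_files(4, 8192):
    print(f["name"], f["size_mb"], "MB")
```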
  • 39. FCoTR is the key to the future.
  • 40. SQL Failover Clustering Best Practices • Failover clustering is supported with caveats – Follow best practices guide for SQL Clustering – Use RDMs for DB and Log volumes – Use eagerzeroedthick disks – Use separate vSCSI controller for OS and Data – Use separate vSwitches for Public and Heartbeat – Team NICs for network redundancy
  • 41. SQL Failover Clustering Best Practices • SQL Database Mirroring (SQL 2008) or AlwaysOn Availability Groups (2012) can provide similar levels of availability as failover clusters but without the strict requirements or vendor support issues. • Most DBs are not clustered and so have no failover capability. Making them virtual and letting them take advantage of vSphere HA adds availability not possible with physical servers
  • 42. Clusters • Microsoft does not support migration of running virtual machines running cluster software. – Caveat*
  • 43. General Best Practices - Memory • Allocate your memory based upon your application workload • Database memory doesn’t dedupe well • Do not over subscribe mission critical workloads • Do NOT OVER SUBSCRIBE MISSION CRITICAL WORKLOADS – Use memory reservations for mission critical SQL workloads to avoid memory contention issues.
  • 44. General Best Practices - CPU • Only allocate vCPUs which are being used – Idle vCPUs will compete for system resources • If workload is unknown, size for fewer vCPUs – You can always add more later if reqs demand • For Performance Critical VMs – Try to ensure total number of vCPUs assigned to all VMs is <= total number of cores on the host – CPU load average of <=1. If greater, add more cpu
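The guideline on slide 44 for performance-critical VMs (total vCPUs across all VMs no greater than the host's core count) is easy to check mechanically. A minimal sketch, with a hypothetical function name and made-up VM names:

```python
def vcpu_check(vm_vcpus, host_cores):
    """Compare the total vCPUs assigned on a host with its physical core count."""
    total = sum(vm_vcpus.values())
    return {
        "total_vcpus": total,
        "host_cores": host_cores,
        "vcpus_per_core": total / host_cores,
        # Slide guideline for performance-critical VMs: total vCPUs <= cores.
        "within_guideline": total <= host_cores,
    }


print(vcpu_check({"sql01": 4, "exch-mbx01": 4, "exch-cas01": 2}, host_cores=12))
```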
  • 45. General Best Practices - Networking • Separate vMotion, Logging and console traffic; or use VLAN tagging • Use a paravirtualized vNIC for high performance workloads • Leverage 802.1q using Virtual Switch Tagging (VST) - VST is most common configuration • Follow networking design guidelines • Do NOT use Jumbo Frames* – Let’s chat afterwards if questions.
  • 46. Alignment • Ensure your VMs have their disks aligned – Boot alignment is auto in 2008, manual in 2003 – Application LUN is manual, follow application and storage vendor best practices Images courtesy of Vaughn Stewart, @vStewed
  • 47. Links • Exchange Links – Microsoft Support Policies and Recommendations for Exchange Servers in Hardware Virtualization Environments – Exchange 2010 on VMware - Best Practices Guide – http://www.vmware.com/pdf/Virtualizing_Exchange2003.pdf – http://www.vmware.com/files/pdf/solutions/08Q4_VM_Exchange_Server_2007_VI3_WP.pdf – http://www.vmware.com/files/pdf/Exchange_2010_on_VMware_-_Best_Practices_Guide.pdf – Microsoft Virtualization Best Practices for Exchange – Policies and Recommendations for Exchange Servers in Virtualization Environments • Refer to this great blog series which covers Exchange and VMware – http://www.clearpathsg.com/blogs/2010/07/13/exchange-2010-vsphere-4-best-practices-part-1 – http://www.clearpathsg.com/blogs/2010/07/29/exchange-2010-vsphere-4-best-practices-part-2 – http://www.clearpathsg.com/blogs/2011/01/13/exchange-2010-vsphere-4-best-practices-part-3 • Duncan Epping – http://www.yellow-bricks.com/2008/12/17/exchange-2007on-vmware/ • SQL Links – Best Practices for SQL Server with VMware – Microsoft SQL Server and VMware Virtual Infrastructure Best Practices – Consolidation Guidance for SQL Server – Licensing SQL – Alignment
  • 48. Database Performance Analysis When Virtualized (aka Confio IgniteVM)
  • 49. Monitoring - vSphere • Get access to vSphere client • Need a user account • http://<machine> - provides download link • Why should I use vSphere? • Standard O/S Counters may be wrong!
  • 50. O/S Counter Problem This is what the O/S thinks, but it is based on 6GB. Because of 2GB limit, the correct utilization is 83%
  • 51. VMware Perfmon Counters Special Perfmon Counters on Windows VMs
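The arithmetic behind slide 50 can be reproduced in a few lines. The slide gives the 6GB configured size, the 2GB limit and the 83% result; the 1.66 GB "in use" figure below is an assumption chosen to match that 83%, and the function name is hypothetical.

```python
def memory_utilization(used_gb, configured_gb, limit_gb=None):
    """Return utilization as the guest O/S sees it and as the hypervisor limit makes it."""
    # The guest divides by what it believes it has (configured memory).
    guest_view = 100 * used_gb / configured_gb
    # The memory actually available is capped by the VM's limit, if one is set.
    effective_gb = configured_gb if limit_gb is None else min(configured_gb, limit_gb)
    effective = 100 * used_gb / effective_gb
    return {"guest_view_pct": round(guest_view, 1), "effective_pct": round(effective, 1)}


# Assumed 1.66 GB in use on a VM configured with 6 GB but limited to 2 GB.
print(memory_utilization(used_gb=1.66, configured_gb=6, limit_gb=2))
```

With these numbers the guest reports roughly 28% while the VM is really at 83% of what it can get, which is why the in-guest counter misleads.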
  • 52. Monitoring - Memory • Primary Metric – Swapping, Ballooning • Secondary Metrics – VM & Host Memory Utilization, VM Memory Reservation, VM Memory Limit • Rules • If Any Swapping is occurring – Host needs more memory because it cannot satisfy current demands – Lessen demands for memory – lower reservations where possible • Excessive Ballooning – May be ok for now, but could be a pending issue • VM Memory Utilization High – May not be a problem now unless Guest O/S swapping is occurring – If VM is limited, may want to increase memory this VM can get • If Host Memory Utilization High – May not be a problem now if no swapping or ballooning – Could be a problem soon for all VMs on this host
  • 53. CPU Metrics • Primary Metric – VM Ready Time • Secondary Metrics – VM CPU Utilization, Host CPU Utilization • Rules • If VM Ready Time > 10-20% – If Host CPU Utilization is high => Need more CPU resources on Host – If Host CPU Utilization ok => VM is limited, give more CPU resources • If VM CPU Utilization high (sustained over 80%) – May not be a problem now if no ready time – could be a problem soon for this VM • If Host CPU Utilization high (sustained over 80%) – May not be a problem now if no ready time on any VM – Could be a problem soon for all VMs on this host – Balance VM resources better
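The CPU rules on slide 53 read naturally as a small decision function. The sketch below is one possible encoding, not Confio's or VMware's logic: the function name and message strings are hypothetical, and the 10% ready-time default is just the low end of the slide's 10-20% range.

```python
def cpu_findings(ready_pct, vm_util_pct, host_util_pct,
                 ready_threshold_pct=10.0, high_util_pct=80.0):
    """Apply the slide's CPU rules to sustained readings for one VM and its host."""
    findings = []
    host_busy = host_util_pct > high_util_pct
    if ready_pct > ready_threshold_pct:
        # Ready time is the primary metric: the VM is waiting for physical CPU.
        if host_busy:
            findings.append("host needs more CPU resources")
        else:
            findings.append("VM is limited: give it more CPU resources")
    else:
        # No significant ready time: high utilization is a warning, not a fault.
        if vm_util_pct > high_util_pct:
            findings.append("watch: VM CPU is high, could become a problem for this VM")
        if host_busy:
            findings.append("watch: host CPU is high, consider rebalancing VMs")
    return findings


print(cpu_findings(ready_pct=15, vm_util_pct=50, host_util_pct=90))
```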
  • 54. Monitoring - Storage • Primary Metrics – Host maxTotalLatency, Host Device Latency (by device), VM Disk Commands Aborted, VM Command Latency • Secondary Metrics – Host Disk Read Rate, Host Disk Write Rate, VM Disk Usage Rate • Rules • If Host Latency >= 20-30 ms – Review Device Latencies to understand which one has latencies – Review Disk Read / Write rates – If Close to Storage Capacity - Overloaded Storage – Otherwise - Slow Storage • If VM Command Latency >= 30ms only for your VM – Tune Disk I/O intensive processes on database – Are Memory / CPU issues causing I/O problems
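The storage rules on slide 54 can be sketched the same way. Everything beyond the slide's latency numbers is an assumption: the function name, the use of throughput versus a rated maximum as the meaning of "close to storage capacity", and the 90% cut-off for "close".

```python
def storage_findings(host_latency_ms, vm_latency_ms, throughput_mbps, capacity_mbps,
                     host_threshold_ms=20.0, vm_threshold_ms=30.0, near_capacity=0.9):
    """Apply the slide's storage rules to one host/VM pair of readings."""
    findings = []
    if host_latency_ms >= host_threshold_ms:
        findings.append("review per-device latencies and disk read/write rates")
        # High latency near the throughput ceiling suggests overload;
        # high latency well below it suggests the storage is simply slow.
        if throughput_mbps >= near_capacity * capacity_mbps:
            findings.append("overloaded storage")
        else:
            findings.append("slow storage")
    elif vm_latency_ms >= vm_threshold_ms:
        # Latency is high only for this VM, so look inside the guest.
        findings.append("tune disk-I/O-intensive database processes; check memory/CPU")
    return findings


print(storage_findings(host_latency_ms=35, vm_latency_ms=35,
                       throughput_mbps=380, capacity_mbps=400))
```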
  • 55. Monitoring - Network • Primary Metric – Dropped Receive Packets, Dropped Transmit Packets • Secondary Metrics – Network Rate • Rules • If any packets are being dropped – Look for errors on the Host’s NIC – See if one NIC is getting all traffic – Understand which VM is causing the most traffic and reduce it • If Network Rate is getting close to maximum for hardware – Understand which VM is causing load – May need to get better network hardware
  • 57. This Layer shows Database Response Time Metrics This Layer shows Database Health Metrics This Layer shows O/S and Virtual Machine Metrics This Layer shows Metrics for the Physical Host This Layer shows Metrics for the Storage Layer
  • 58.
  • 59.
  • 60. Tooltip: Another VM (ProdServerB) moved onto this Physical Host
  • 61.
  • 62.
  • 63. Confio Software • Award Winning Performance Tools • Ignite8 for Oracle, SQL Server, DB2, Sybase • IgniteVM for Databases on VMware • Download at www.confio.com • Provides Answers for • What changed recently that affected end users • What layer (VM or DB) is causing the problem • Who and How should we fix the problem Download free trial at www.confio.com
  • 65. 4 Big Things 1. Performance Monitoring 2. Performance Trending 3. Capacity Planning 4. Root Cause Analysis
  • 66. Managing Performance Is it healthy? • Every VM & ESX performing well? CPU, RAM, Network, Disk? • Are they behaving expectedly? • Any fault on any component? Is it enough? • Enough CPU, RAM, Network, Disk? Future risk? • Time remaining? • Capacity remaining? • Where are the “Stress points” in time? Is it optimised? • Which VMs need adjustment? • What are my key ratios? • How much can I claim back from “fat” VMs? • How many more VMs can I put without impacting performance?
  • 67. • Is it healthy = Health – Workload – Anomalies – Faults • Is it enough = Risk – Time remaining – Capacity remaining – Stress period • Is it optimised = Efficiency – What can we reclaim? – Density. Key ratios for management
  • 68. Threshold: Shift in Mindset • vCenter sets “static” threshold, which can be misleading – During peak, it is common for VM to reach high utilisation. • Static threshold will generate alerts when they should not. • vSphere admin quickly learns to ignore them, defeating the purpose of alert to begin with. – During non-peak, it might be abnormal for VM to reach even 50% utilisation. • Static threshold will not generate alerts when they should have. • vCenter only sets high threshold – Do you set static threshold when CPU or RAM utilisation drops below 5%? • A drop in entire array storage IOPS might be a sign of terrible day ahead. – Will not alert when these happen: • Utilisation drops from 75% to 1% when it should not. • Utilisation changes from 5% to 70% when it should not. – We need to plot both upper range and lower range • But each VM differs. And the same VM differs depending on day/time… – Intelligence required to analyse each metric and its expected “normal” behaviour.
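The idea on slide 68 (an upper and a lower bound that depend on the day and time) can be sketched with a per-slot baseline. This is a deliberately naive mean plus or minus k standard deviations per (weekday, hour) slot; it is not the algorithm vCenter Operations uses, and all names and sample values are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_baseline(samples, k=3.0):
    """samples: iterable of (weekday, hour, value). Returns {(weekday, hour): (low, high)}."""
    buckets = defaultdict(list)
    for weekday, hour, value in samples:
        buckets[(weekday, hour)].append(value)
    baseline = {}
    for slot, values in buckets.items():
        mu = mean(values)
        sigma = pstdev(values)
        # Both a lower and an upper bound, so unexpected drops alert too.
        baseline[slot] = (mu - k * sigma, mu + k * sigma)
    return baseline


def is_anomalous(baseline, weekday, hour, value):
    low, high = baseline[(weekday, hour)]
    return value < low or value > high


# CPU % history: a Tuesday 22:00 batch window and a quiet Monday 10:00.
history = [("Tue", 22, v) for v in (98, 100, 99, 100)]
history += [("Mon", 10, v) for v in (5, 6, 4, 5)]
baseline = build_baseline(history)

print(is_anomalous(baseline, "Tue", 22, 99))  # busy batch night, as expected
print(is_anomalous(baseline, "Tue", 22, 40))  # batch did not run at full load
print(is_anomalous(baseline, "Mon", 10, 70))  # spike during a quiet hour
```

A single static threshold of, say, 90% would alert on the first reading and stay silent on the other two, which is the opposite of what the slide asks for.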
  • 69. [Formula slide: the joint density of (P_{1,1}, P_{1,2}, ..., P_{m,m}), written with Gamma functions where \Gamma(z) = \int_0^\infty t^{z-1} e^{-t}\,dt, followed by the marginal distribution of the i-th row of J, which is Dirichlet.] It is pretty difficult for a human to beat the computer in analysis of the data. The above is one of the many algorithms applied by vCenter Operations. Thank goodness I don’t have to explain this.
  • 71. Recap • Figures out normal – this is huge. • 500 VMs, 50 ESX Hosts = 10,000+ Counters • Setup and walk away for a while. • Walkthrough Demo by Clint Kitson – http://www.youtube.com/watch?v=Z-DJuTiqKag • Less technical but much more fun overviews – http://www.vmwarecloudmanagement.com/ • Great in-depth training doc up on VMware Communities (179 slides with notes). – http://communities.vmware.com/docs/DOC-18592
  • 72. One last thing… • Do you own vSphere? • You own vCenter Operations Foundation now. – This covers the Health & Risk items shown. – VMware just redid all the vCOPs levels & pricing.

Editor's Notes

  1. Pacing is really important – intentionally too much content.
  2. Show of hands = how many people hands-on with VMware/storage/etc. each day, how many manage those who are hands-on? Tell story of coworker who virtualized application.
  3. 15 minutes on this section. Biz Critical = if it costs money when down or you can’t accept money when it’s down. SAP, Oracle, MS SQL, MS SharePoint, MS Exchange, custom Java apps. Have heard all these myths… in some cases has taken years to overcome but have been overcome. Tier 1 App Virt = what stops your business when it’s down. Measured in $$/minute or hour. SQL, Exchange, Oracle, SAP, etc. Example of retail stores that stop running. Banks - can’t process. Survey who’s doing what apps. Survey how familiar with server virtualization.
  4. Brief mention of this – think we all know that’s changed. Can tell story of coworker who virtualized app, didn’t tell his boss, hid VMware tools icon… boss didn’t know for a couple days (was during a window where were doing testing… nothing critical and no data to retain). Don’t recommend this but true story…
  5. Skip. Myth 2 - two people who can make you stand up from your desk and go home… CEO and the Exchange admin.
  6. Park here for a bit. Myth 3 - 10 VMs and up as standard consolidation ratio. vMotion – game changer – even on physical boxes. What if for your most mission critical application you dedicated a physical box? Why not? Easier to recover, replicate, etc.
  7. Short - not just about cost savings….about new capabilities….for instance DR testing.
  8. Minimal – suffice it to say we can isolate workloads in multiple ways.
  9. Licensing – the tide has turned here….true in the past but now? Not so much.
  10. Short – followup to slide 6. Even Oracle has changed their stance….plus VMware guarantees to own resolution.
  11. Talk about this Exchange Server Profile Analyzer – for many apps potentially higher performance than today – can increase dynamically.
  12. Park here for a while – ask who has an application that needs 1M IOPS? Or 1 TB memory? Or 32 CPUs? Survey on who might hit vSphere 4 maximums, much less vSphere 5. Do we need this much? No, not really… just blowing the doors off to remove concerns. Gartner - says enterprise email should be virtualized by default. Network World - says same thing - “time to virtualize MS Exchange by default”. And by the way, vSphere 5.1 goes to 64 vCPUs, also 16Gb FC support.
  13. Myth 10. Sue - systems engineer at Raymond James. Dirk - simplifying maintenance, high availability, disaster recovery.
  14. Even some software packaging/discounting around this….we’ll discuss vCenter Operations more later. Ask your sales rep if interested.
  15. 10-15 minutes for Exchange, 10-15 for SQL
  16. Once upon a time, breaking the 4 minute mile was considered impossible. Then one fateful day, May 6th, 1954, Roger Bannister did the impossible, breaking the 4 minute mile barrier. But what does this have to do with virtualization?
  17. Call out a couple of these – make fun it’s a marketing slide but some legit stuff in there. Last point – not as applicable for SQL 2012 but will still apply to those who can’t go to 2012 for a while.
  18. So you don’t think you can virtualize Tier 1? Well, I have a flow chart to prove you wrong!
  19. Serious deployments – no question.
  20. Park on talking about defaults – that segues into the next slide about KISS.
  21. Call out 768k as being a unique VMAX characteristic.
  22. Ask if people know what NUMA is? Take a minute on it if not. Overcommitting memory? Again, this is Tier 1 – we’re not skimping here.
  23. Q. How do I license SQL Server 2008 for my virtual environments? A. For Standard, Workgroup, and Enterprise, if you decide to license on a per processor basis, you must buy a SQL Server license for each virtual processor. For Enterprise Edition, you can also choose to license all physical processors in a box. This gives you rights to run SQL Server on any number of virtual processors running on the same physical server. If you use Server/CAL based licensing, for Standard and Workgroup editions, you must obtain SQL Server licenses for each Virtual Operating System Environment on which you run instances of SQL Server. However, for the Enterprise edition, if you have a Server license for the physical Server, you may run any number of SQL Server instances in any Virtual Operating System Environment that you run on that same physical server. If you are using hardware partitioning on a multi-processor server, you can use any number of virtualized instances for SQL Server Enterprise Edition as long as all processors in that hardware partition are licensed. For example, if you have a partition of 10 physical processors on a 32-processor server, purchasing 10 processor licenses of SQL Server 2008 gives you the rights to run any number of SQL Server instances on physical or virtual environments on that partition. http://www.microsoft.com/sqlserver/2008/en/us/licensing-faq.aspx The other big selling point is the consolidation of SQL and the potentially very large cost reduction that organizations can see. If organizations consolidate SQL and save even two CPUs of SQL 2008 R2 Enterprise licensing that could be over a $50,000 savings right there. There is a large potential for cost savings even with the move to per-core licensing in SQL 2012. http://www.microsoft.com/sqlserver/en/us/get-sql-server/licensing.aspx http://blogs.vmware.com/apps/2012/03/virtualizing-sql-server-on-vsphere-licensing.html
  24. See? It’s really fast….or if you’re feeling contrarian, this is what happens when you virtualize SQL…so you should never, ever do it.
  25. Build up to this slide before you show it….discussion about storage protocol wars and how we’ve all been wrong. We’ve all taken a wrong turn for last 15 years but it’s not too late to go back….no more FCoE, iSCSI, or NFS – time to standardize on FCoTR.
  26. Mention how SQL 2012 makes this *much* better.
  27. 5 minutes on Confio – no more. Discuss how we’re collapsing infrastructures now that used to be very isolated. Virtualization also allows us to run many more VMs than we used to… means we need different/better tools than before. This is NOT that virtualization is bad… rather it removes bottlenecks that then expose other bottlenecks.
  28. What happens to DBA analytics when you virtualize? Have to give access to vCenter… b/c their regular OS counters could be wrong… or rather different.
  29. Goes back to the ways we can limit/reserve resources in VMware…and just some of what the hypervisor does. Windows doesn’t control the physical hardware anymore after all.
  30. We do add some VMware type counters inside Windows…but this is more stuff to look at…and guess what? There’s a lot more stuff you can look at…. (next slide)
  31. Just mention how there are a bunch of metrics that can be monitored – call out one of the Rules (i.e. how you respond if a certain metric goes crazy).
  32. You may be realizing we have a lot of layers here? They’re all legitimate and all have value…but when it comes to troubleshooting it makes it…challenging. IgniteVM helps correlate all the layers of the stack – storage to ESX to Windows to SQL in one pane…
  33. ….as this slide shows. So let’s do a sample walkthrough…. (sorry no live demo but the time format just didn’t allow it)
  34. Looking at this period of time we see a very noticeable spike in SQL response time…so what happened? (especially if it’s a critical app where “it just got better” isn’t an acceptable answer….we’re thinking Tier 1 here after all)
  35. A spike in the ESX host CPU usage….
  36. Hmmm…looks like that’s because another VM was moved onto the physical ESX box.
  37. This then drove much higher VM CPU waits (that’s the “ready time”).
  38. Even some definitions of this….DBAs don’t know all this by default….helps them know what’s happening in the environment.
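For anyone who wants the "ready time" number in a more familiar form: vCenter performance charts report CPU ready as a summation in milliseconds per sample interval, and the usual conversion to a percentage is ready milliseconds divided by the interval length. A minimal sketch follows, assuming the 20-second real-time chart interval; the function name is made up, and dividing by the vCPU count to get a per-vCPU average is an assumption about how the VM-level value is aggregated.

```python
def cpu_ready_percent(ready_ms, interval_s=20, vcpus=1):
    """Convert a CPU ready summation (ms) into a percentage of the interval.

    ready_ms   -- ready summation reported for one sample interval
    interval_s -- sample interval in seconds (20 s assumed, as in real-time charts)
    vcpus      -- assumed divisor to average a VM-level value across its vCPUs
    """
    return ready_ms * 100 / (interval_s * 1000 * vcpus)


# 2,000 ms of ready time in a 20-second sample means the vCPU spent
# a tenth of that interval waiting to be scheduled on a physical core.
print(cpu_ready_percent(2000))
```

This is the kind of translation a DBA needs before a jump in VM CPU waits means anything next to a SQL response time chart.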
  39. About 10 minutes here.
  40. What’s critical right now? What’s going to be critical? How can I optimize/save money? When something does go wrong, how can I correlate a ton of data points to figure it out?
  41. This is the real interface – a screenshot yes but the real main dashboard (had someone wondering that at a previous presentation).
  42. We pour all the info in our brain into monitoring software….question of whether it’s worth it or not (talk about Nagios somewhat….incredibly capable but have to configure TONS of stuff). Tell story about batch processing – a certain server might be “normal” at 100% every Tuesday night. What’s “not normal” is if it’s not 100%….
  43. So you really want to know what vCenter Operations does? Behold! Anyone (in the audience) able to explain this? Because I sure can’t….
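The Tuesday night story is the whole idea behind learned baselines versus static thresholds. Here is a toy sketch of it: keep a separate "normal" per weekday and hour, then flag anything that strays too far from that slot's history. This is NOT how vCenter Operations actually computes its thresholds; the function names, the 3-sigma band, and the 5-point minimum band are all assumptions chosen for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_baseline(samples):
    """Learn (mean, stdev) of CPU % for each (weekday, hour) slot.

    samples -- iterable of (weekday, hour, cpu_percent) tuples
    """
    buckets = defaultdict(list)
    for weekday, hour, cpu in samples:
        buckets[(weekday, hour)].append(cpu)
    return {slot: (mean(vals), pstdev(vals)) for slot, vals in buckets.items()}


def is_abnormal(baseline, weekday, hour, cpu, k=3.0, min_band=5.0):
    """True if cpu falls outside the learned band for that slot."""
    if (weekday, hour) not in baseline:
        return False  # no history for this slot, so nothing to compare against
    avg, sd = baseline[(weekday, hour)]
    band = max(k * sd, min_band)  # min_band avoids a zero-width band
    return abs(cpu - avg) > band


# Weekday 1 = Tuesday, hour 23: the batch job normally pegs the CPU.
history = [(1, 23, 100.0), (1, 23, 99.0), (1, 23, 100.0), (1, 23, 98.0),
           (2, 23, 10.0), (2, 23, 12.0), (2, 23, 9.0)]
baseline = build_baseline(history)
print(is_abnormal(baseline, 1, 23, 100.0))  # pegged on Tuesday night
print(is_abnormal(baseline, 1, 23, 40.0))   # batch job apparently not running
```

A static "alert above 90%" rule would fire every Tuesday and stay silent the one night the batch job fails to run, which is the opposite of what you want.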
  44. Love heat maps…great way to view lots of information quickly.
  45. Pretty sure no one wants to ask questions but I’ll hang around for discussion.