1. Virtual SAN Hardware Guidance & Best Practices
STO4276
Rakesh Radhakrishnan, VMware
Rafael Kabesa, VMware
3. Agenda
• Use Cases – Enabling Business Critical Apps with All-Flash Virtual SAN
• How to Build a Virtual SAN
• New Hardware Platforms for Virtual SAN 6.0
• Hardware Guidance & Best Practices
• Virtual SAN Sizing & TCO
• Reference Architectures – All-Flash VSAN powered by SanDisk & Micron
5. Virtual SAN for VM Storage: Top Use Cases
• VDI
• Virtual Infrastructure
• Test/Dev
• Disaster Recovery (DR)
• Business Critical Apps
The best storage for VMs: built for virtual infrastructure, enterprise-class, and ready for business-critical apps.
6. VMware Virtual SAN 6.0: Hybrid
Radically simple hypervisor-converged storage software (vSphere + Virtual SAN)
• Software-defined storage built into vSphere
• Runs on any standard x86 server
• Pools flash-based devices and hard disks across hosts into a shared Virtual SAN datastore
• Managed through per-VM storage policies
• Delivers high performance through flash acceleration
  – 2x more IOPS & 2x VMs with VSAN Hybrid
  – Up to 40K IOPS/host & 200 VMs/host
• Highly resilient – zero data loss in the event of hardware failures
• Deeply integrated with the VMware stack
[Diagram: SSDs and hard disks on each host pooled into a single Virtual SAN datastore]
7. VMware Virtual SAN 6.0: All-Flash (NEW in 6.0)
Extremely high performance with predictability (vSphere + Virtual SAN, all-flash datastore)
• Flash-based devices used for caching as well as persistence
• Ideal for business critical applications
• Cost-effective all-flash two-tier model:
  – Cache is 100% write: uses write-intensive, higher-grade flash-based devices
  – Persistent storage: can leverage lower-cost, read-intensive flash-based devices
• Very high IOPS: up to 100K(1) IOPS/host
• Consistent performance with sub-millisecond latencies
[Diagram: SSDs on each host pooled into a single Virtual SAN All-Flash datastore]
(1) All performance numbers are subject to final benchmarking results. Please refer to guidance published at GA.
8. Enterprise-Class Scale and Performance (Enhancements in 6.0)

                 Virtual SAN 5.5   Virtual SAN 6.0 Hybrid   Virtual SAN 6.0 All-Flash
Hosts / Cluster  32                64                       64
IOPS / Host      20K               40K                      90K
VMs / Host       100               200                      200
VMs / Cluster    3200              6000                     6000

Note: All performance numbers are subject to final benchmarking results. Please refer to guidance published at GA.
9. Enabling Business Critical Apps with All-Flash Virtual SAN
[Diagram: replication between Site A and Site B]

Use Case                        VSAN 5.5        VSAN 6.0 Standard   VSAN 6.0 All-Flash
VDI                             100 VMs/host    200 VMs/host        200 VMs/host with sub-millisecond latencies
Tier 2 Production               20K IOPS/host   40K IOPS/host       100K IOPS/host
Disaster Recovery Target        20K IOPS/host   40K IOPS/host       100K IOPS/host
Staging & Test/Dev              20K IOPS/host   40K IOPS/host       100K IOPS/host
Tier 1 Workloads & Business     N/A             40K IOPS/host       100K IOPS/host (sub-millisecond latencies
Critical Apps (e.g. OLTP)                                           & consistently high performance)
10. Virtual SAN Flash Caching Architectures

Hybrid (per disk group)
• Cache Tier: 10% of projected used capacity
  – High-endurance devices: 2 to 3 TBW per day
  – 70% read cache, 30% write buffer
• Capacity Tier: sized for the remainder of capacity
  – Magnetic devices
  – Price on best $/GB

All-Flash (per disk group)
• Cache Tier: 10% of projected used capacity
  – High-endurance devices: 2 to 3 TBW per day
  – 100% write buffer
• Capacity Tier: sized for the remainder of capacity
  – Lower required endurance: 0.2 TBW per day is sufficient
  – Price on best $/GB
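The 70/30 split above is simple to apply when evaluating a hybrid cache device. A minimal illustrative sketch in Python (the function name and the 400 GB example device are assumptions, not from the deck):

```python
def hybrid_cache_split_gb(cache_device_gb):
    """Split a hybrid disk group's cache device per the 70/30 rule above."""
    return {"read_cache": cache_device_gb * 0.70,
            "write_buffer": cache_device_gb * 0.30}

print(hybrid_cache_split_gb(400))  # {'read_cache': 280.0, 'write_buffer': 120.0}
```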
11. All-Flash Cache Tier Sizing
• The cache tier should be sized at 10% of the anticipated consumed storage capacity
• Cache is entirely a write buffer in the all-flash architecture
• Cache devices should be high write-endurance models: choose 2+ TBW/day (3650+ TBW over 5 years)
• For highly write-intensive workloads, choose 4+ TBW/day (7300+ TBW over 5 years)
• The total cache capacity percentage should be based on use-case requirements:
  – For general recommendations, visit the VMware Compatibility Guide.
  – For write-intensive workloads, configure a higher amount.
  – Increase cache size if expecting heavy use of snapshots.

Measurement Requirements                 Values
Projected VM space usage                 20 GB
Projected number of VMs                  1000
Total projected space consumption        20 GB x 1000 = 20,000 GB = 20 TB
Target flash cache capacity percentage   10%
Total flash cache capacity required      20 TB x 0.10 = 2 TB
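The worked example above is easy to script. A minimal Python sketch of the same calculation; the 10% ratio and the example inputs come from the slide, while the function name and structure are purely illustrative:

```python
def all_flash_cache_size_tb(vm_space_gb, vm_count, cache_ratio=0.10):
    """Size the all-flash cache tier as a fraction of projected consumed capacity.

    cache_ratio defaults to the 10% general recommendation from this deck;
    raise it for write-intensive or snapshot-heavy workloads.
    """
    consumed_tb = vm_space_gb * vm_count / 1000.0  # GB -> TB (decimal, as on the slide)
    return consumed_tb * cache_ratio

# Worked example from the slide: 1000 VMs x 20 GB -> 20 TB consumed -> 2 TB cache
print(all_flash_cache_size_tb(vm_space_gb=20, vm_count=1000))  # 2.0
```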
13. Software-Defined Data Center: One Destination, Three Approaches

Component Based (maximum flexibility)
• Choose from over 100 HDDs, 150 SSDs, 80 controllers, and more using the VMware Virtual SAN Compatibility Guide (VCG)(1)

Virtual SAN Ready Node
• Pick one of 50+ OEM-validated server configurations(2)

VMware EVO:RAIL – Hyper-Converged Infrastructure (software + hardware; maximum ease of use)
• A Hyper-Converged Infrastructure Appliance (HCIA) for the SDDC
• Each EVO:RAIL HCIA is pre-built on a qualified and optimized 2U/4-node server platform
• Sold via a single SKU by Qualified EVO:RAIL Partners (QEPs)(3)

(1) Components must be chosen from the Virtual SAN HCL; using any other components is unsupported – see the Virtual SAN VMware Compatibility Guide
(2) VMware continues to update the list of available Ready Nodes; please refer to the Virtual SAN VMware Compatibility Guide for the latest list
(3) EVO:RAIL availability in 2H 2014. Exact dates will vary depending on the specific EVO:RAIL partner
14. Why Virtual SAN Ready Node?
A turnkey solution for accelerating Virtual SAN deployment

Validated server configurations jointly recommended by VMware and server OEMs
• Complete with the prescribed memory, solid-state drives (SSD), hard disk drives (HDD), controller, and networking
• Pre-loaded with vSphere and VSAN(1)

Easy to order and faster time to market
• Single orderable "SKU" per Ready Node(2)
• Can be quoted/ordered as-is
• Can be customized (always use certified VSAN components from the Compatibility Guide)

Benefit of choices
• Work with your server OEM of choice
• Choose the Ready Node profile based on your workload
• New license sales, or for customers with an ELA

Notes: (1) VMware software is pre-loaded on most Ready Nodes
       (2) Find the latest list of VSAN Ready Nodes & OEM SKUs on the Virtual SAN VMware Compatibility Guide page
16. VSAN 6.0 – New 12G Platforms

#  OEM          Controller    Type  Server Platform                Certification Status/ETA
1  Dell         PERC H730     12G   13G Servers – Dell FX2, R730   Complete
2  IBM/Lenovo   5110          6G    Flex SEN-x240                  Complete
3  HP           P440ar        12G   Gen 9                          End of Q1 2015
4  Cisco        LSI 12G SAS   12G   UCS M4                         End of Q1 2015
5  IBM/Lenovo   5210          12G   3650HD                         End of Q1 2015
6  SuperMicro   LSI 3008      12G   12G Haswell                    End of Q1 2015
7  Fujitsu      LSI 12G SAS   12G   12G Haswell                    End of Q1 2015
17. VMware Virtual SAN – New Hardware Platforms
High Density Direct Attached Storage Blades (compute blade servers + direct attached storage blades, running vSphere + Virtual SAN)
– Manage disks in enclosures – helps enable blade environments
– Provides the ability to enable Virtual SAN by adding more storage to blade servers with few or no local disks
– Flash acceleration provided on the server or in the subsystem
– Supported on BOTH VSAN 5.5 and 6.0
– Examples:
  – IBM Flex SEN with x240 Blade Series
  – Dell FX2 with 12G controllers
18. Virtual SAN Ready Nodes on IBM Flex x240 + SEN
Virtual SAN Blade Ready Node on IBM Flex x240 + SEN (software-defined storage: VMware VSAN on x240 + SEN + hypervisor)
• 7 server blades & 7 direct-attached storage blades in 10U (14 nodes in 10U)
• Supports up to 210 VMs, 50.4 TB raw capacity, 84K IOPS
Flex System Chassis
• 4 scalable switch bays
• 10U chassis, 14 bays
• Standard and full-width node support
• Up to 6 x 2500W power supplies
• Up to 8 cooling fans (scalable)
• Integrated chassis management through CME
Emerging technology: the ability to scale dedicated pools of storage into a single virtual storage pool
19. Virtual SAN Ready Nodes on Dell FX2 – High Density 12G Platform
4 servers in 2U, up to 64 TB capacity; logically, EACH server looks like a standard 1U rackmount
DAS storage per server:
– Local: 2 x 1.8" SATA SSD via chipset SATA
– Storage sled: 8 x 2.5" SAS/SATA/NL-SAS HDD/SSD (16 TB/server) behind a PERC9 H730P controller (hardware RAID or SAS pass-through mode)
– Redundant SD cards per server for the embedded hypervisor
Per-server platform:
– 8 x DDR4 DIMMs (256 GB max)
– 2 x 1Gb or 2 x 10Gb LOM, plus 2 x x8 PCIe (HH/HL) slots
20. Checksums & Encryption Support for Virtual SAN

Ready Nodes with Hardware Checksum – Q2 2015
• Enables end-to-end data integrity with T10 Protection Information (PI) DIF
• Checksum generated by the controller and stored on the disk drive
• Enables detection of silent bit corruption
• Limited HCL support, with checksum-enabled Ready Nodes from OEMs

Ready Nodes with Hardware Encryption – Q2 2015
• FIPS 140-2 (NIST-certified) encryption enabled by the RAID controller
• Supports both centralized & local secure key management
• Supports replacing & migrating drives and volumes
• Limited HCL support, with encryption-enabled Ready Nodes from OEMs

Software Checksums and Encryption – 2016
• Checksums and encryption will be fully supported in software, without requiring any specialized hardware
22. Build Your Own Virtual SAN Node
VSAN-certified components with ultimate solution design flexibility
• Any server on the VMware Compatibility Guide
• Boot device: 4GB to 16GB USB/SD card, or HDD
• 1Gb/10Gb NIC
• SAS/SATA controllers (RAID controllers must work in "pass-through" or "RAID0" mode)
• At least one each of: SAS/SATA/PCIe SSD and SAS/NL-SAS/SATA HDD
• SSDs, HDDs, and storage controllers must be listed on the VMware Compatibility Guide for VSAN: http://www.vmware.com/resources/compatibility/search.php?deviceCategory=vsan
• The hardware combination selected must be supported by the server vendor
23. VMware Compatibility Guide
• Certified flash devices, HDDs, storage controllers, and Ready Nodes are listed on the VMware Compatibility Guide for VSAN: http://www.vmware.com/resources/compatibility/search.php?deviceCategory=vsan
25. Design Overview Considerations
• Ensure that all the hardware used in your design is supported by checking the VMware Compatibility Guide (VCG)
• Ensure that all software, driver, and firmware versions used in your design are supported by checking the VCG
• Ensure you have a "balanced" configuration across all hosts in the cluster
• Design for availability. Consider designing with more than three hosts and additional capacity that enables the cluster to automatically remediate in the event of a failure
• Design for growth. Consider an initial deployment with capacity in the cluster for future virtual machine deployments, as well as enough flash cache to accommodate future capacity growth
26. Boot Devices
Which installation device to use depends on the amount of host memory:
– Up to 512 GB
  – Use SD/USB devices (4GB or greater) as the installation media
  – SATA DOM is supported for VSAN 6.0 (specs coming soon)
– 512 GB or greater
  – Use a separate magnetic disk or solid-state disk as the installation device
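This memory threshold is easy to encode in provisioning tooling. A trivial sketch of the decision rule above (the function name is illustrative):

```python
def boot_device_recommendation(host_memory_gb):
    """Pick an ESXi installation device per the memory-based guidance above."""
    if host_memory_gb < 512:
        return "SD/USB device (4GB or greater)"
    return "separate magnetic or solid-state disk"

print(boot_device_recommendation(256))  # SD/USB device (4GB or greater)
print(boot_device_recommendation(768))  # separate magnetic or solid-state disk
```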
27. Flash-Based Devices
In Virtual SAN, ALL read and write operations always go directly to the flash tier.
Flash-based devices serve two purposes in hybrid Virtual SAN:
1. Non-volatile write buffer (30%)
   – Writes are acknowledged when they enter the prepare stage on the SSD
   – Reduces latency for writes
2. Read cache (70%)
   – Cache hits reduce read latency
   – Cache misses retrieve data from HDD
Flash-based devices are used only as a write cache in all-flash Virtual SAN:
• Reads are performed directly from the capacity layer.
Choice of hardware is the #1 performance differentiator between Virtual SAN configurations.
28. Flash-Based Devices
VMware SSD performance classes:
– Class A: 2,500-5,000 writes per second
– Class B: 5,000-10,000 writes per second
– Class C: 10,000-20,000 writes per second
– Class D: 20,000-30,000 writes per second
– Class E: 30,000+ writes per second
Examples:
– Intel DC S3700 SSD, ~36,000 writes per second -> Class E
– Toshiba SAS SSD MK2001GRZB, ~16,000 writes per second -> Class C
Workload definition for these ratings:
– Queue depth: 16 or less
– Transfer length: 4KB
– Operations: write
– Pattern: 100% random
– Latency: less than 5 ms
Endurance:
– 10 Drive Writes per Day (DWPD), and
– Random write endurance up to 3.5 PB on 8KB transfer size per NAND module, or 2.5 PB on 4KB transfer size per NAND module
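To illustrate the class boundaries, here is a small Python sketch that maps a measured 4KB random-write rate to its VMware performance class. The thresholds are from the list above; the function itself is an assumption for illustration only:

```python
def ssd_performance_class(writes_per_sec):
    """Map a 4KB, 100% random write rate (queue depth <= 16, < 5 ms latency)
    to the VMware SSD performance classes listed above."""
    thresholds = [  # (minimum writes/sec, class)
        (30_000, "E"),
        (20_000, "D"),
        (10_000, "C"),
        (5_000, "B"),
        (2_500, "A"),
    ]
    for minimum, cls in thresholds:
        if writes_per_sec >= minimum:
            return cls
    return "unclassified"  # below Class A's 2,500 writes/sec floor

print(ssd_performance_class(36_000))  # "E", e.g. Intel DC S3700
print(ssd_performance_class(16_000))  # "C", e.g. Toshiba MK2001GRZB
```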
29. Flash Capacity Sizing
The general recommendation for sizing Virtual SAN's flash capacity is 10% of the anticipated consumed storage capacity, before the Number of Failures To Tolerate (FTT) is considered.
The total flash capacity percentage should be based on use case, capacity, and performance requirements:
– 10% is a general recommendation; it could be too much, or it may not be enough.

Measurement Requirements                 Values
Projected VM space usage                 20 GB
Projected number of VMs                  1000
Total projected space consumption        20 GB x 1000 = 20,000 GB = 20 TB
Target flash capacity percentage         10%
Total flash capacity required            20 TB x 0.10 = 2 TB
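To make the "before FTT is considered" caveat concrete: replicas created by the FTT policy multiply the raw capacity needed, but the 10% flash rule applies to consumed capacity before replication. A minimal sketch, assuming the default mirroring behavior where FTT = n keeps n + 1 copies (the function name and structure are illustrative):

```python
def vsan_sizing_tb(vm_space_gb, vm_count, ftt=1, flash_ratio=0.10):
    """Return (flash_capacity_tb, raw_capacity_tb) for a Virtual SAN cluster.

    Flash is sized at flash_ratio of consumed capacity BEFORE FTT, per the
    slide's guidance; raw capacity must hold ftt + 1 replicas of the data.
    """
    consumed_tb = vm_space_gb * vm_count / 1000.0
    flash_tb = consumed_tb * flash_ratio   # 10% rule, pre-replication
    raw_tb = consumed_tb * (ftt + 1)       # replicas consume real capacity
    return flash_tb, raw_tb

# Slide example with the default FTT=1 policy:
print(vsan_sizing_tb(20, 1000))  # (2.0, 40.0) -> 2 TB flash, 40 TB raw
```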
30. Endurance Guidance for VSAN Hybrid & All-Flash

SSD Endurance Class  SSD Tier                                      TB Writes Per Day  TB Writes in 5 Years
A                    VSAN All-Flash – Capacity                     0.2                365
B                    VSAN Hybrid – Caching                         1                  1825
C                    VSAN All-Flash – Caching (medium workloads)   2                  3650
D                    VSAN All-Flash – Caching (high workloads)     4                  7300
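The 5-year column is just the daily figure multiplied out (e.g. 2 TBW/day x 365 x 5 = 3650 TBW). A small sketch of that conversion, plus a check of whether a given device meets a class; the thresholds come from the table, the helper names are illustrative:

```python
ENDURANCE_CLASSES_TBW_PER_DAY = {"A": 0.2, "B": 1, "C": 2, "D": 4}

def tbw_over_years(tbw_per_day, years=5):
    """Convert a TB-writes-per-day rating to total TB written over N years."""
    return tbw_per_day * 365 * years

def meets_endurance_class(device_tbw_per_day, required_class):
    """True if the device sustains at least the class's TB writes per day."""
    return device_tbw_per_day >= ENDURANCE_CLASSES_TBW_PER_DAY[required_class]

print(tbw_over_years(2))                # 3650 -> Class C over 5 years
print(meets_endurance_class(2.5, "C"))  # True
```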
31. All-Flash Hardware Guidance

                        VDI – Linked Clones    VDI – Full Clones      Server High            Server Medium
VMs per node            Up to 200*             Up to 200*             Up to 120*             Up to 60*
IOPS per node           N/A                    N/A                    Up to 80K              Up to 60K
Raw capacity per node   1.6 TB                 9.6 TB                 12 TB                  8 TB
CPU                     2 x 10-core            2 x 10-core            2 x 12-core            2 x 10-core
Memory                  256 GB                 256 GB                 384 GB                 256 GB
Capacity tier flash     4 x 400 GB SSD,        12 x 800 GB SSD,       12 x 1 TB SSD,         8 x 1 TB SSD,
                        Endurance Class A+     Endurance Class A+     Endurance Class A+     Endurance Class A+
Caching tier flash      1 x 400 GB SSD,        2 x 400 GB SSD,        2 x 400 GB SSD,        2 x 200 GB SSD,
                        Perf. Class E,         Perf. Class E,         Perf. Class E,         Perf. Class D+,
                        Endurance Class C+     Endurance Class C+     Endurance Class D+     Endurance Class C+
I/O controller          Queue depth >= 256     Queue depth >= 256     Queue depth >= 512     Queue depth >= 256
NIC                     10GbE (jumbo frames)   10GbE (jumbo frames)   10GbE                  10GbE

* Final consolidation ratios TBD based on performance runs
32. Magnetic Disks (HDD)
• SAS/NL-SAS/SATA HDDs supported
  – 7200 RPM for capacity
  – 10,000 RPM for performance
  – 15,000 RPM for additional performance
• NL-SAS provides higher HDD controller queue depth at the same drive rotational speed and a similar price point
  – NL-SAS is recommended if choosing between SATA and NL-SAS
• Differentiate performance between clusters with SSD selection and the SSD:HDD ratio. The rule-of-thumb guideline is 10% of anticipated capacity usage
33. Storage Controllers
• SAS/SATA storage controllers
  – Pass-through or "RAID0" mode supported
  – Pass-through recommended for simplicity and performance
• Performance using RAID0 mode is controller dependent
  – Check with your vendor for SSD performance behind a RAID controller
• Storage controller queue depth matters
  – A higher storage controller queue depth will increase performance
  – A queue depth greater than 256 is recommended for all but entry-level solution profiles
• Validate the number of drives supported by each controller
34. Storage Controllers – RAID0 Mode
• Configure all disks in RAID0 mode
  – Flash-based devices (SSD)
  – Magnetic disks (HDD)
• Disable the storage controller cache
  – Allows better performance, as caching is controlled by Virtual SAN
• Disk device cache support
  – Flash-based devices leverage write-through caching
• In RAID0 mode, ESXi may not be able to differentiate flash-based devices from magnetic devices
  – Use ESXCLI to manually flag the devices as SSD
35. Network
• 1Gb / 10Gb supported for hybrid
  – 10Gb shared with NIOC for QoS will support most environments
  – If 1Gb, then dedicated links for Virtual SAN are recommended
• 10Gb required for all-flash
• Jumbo frames will provide a nominal performance increase
  – Enable for greenfield deployments
• Consider NIC teaming for availability/redundancy
• Multicast must be configured and functional between all hosts
• Virtual SAN supports both VSS & VDS
  – NIOC requires VDS
• Network bandwidth has more impact on host evacuation and rebuild times than on workload performance
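To see why bandwidth matters most for evacuation and rebuild, a back-of-envelope estimate helps. The sketch below is illustrative only (the efficiency factor and data sizes are assumptions, not VMware guidance): it estimates how long re-replicating a host's data takes at a given network speed.

```python
def rebuild_time_hours(data_tb, link_gbps, efficiency=0.7):
    """Rough time to re-replicate data_tb of VSAN components over the network.

    efficiency is an assumed fraction of line rate usable for resync
    traffic; all numbers here are illustrative, not VMware guidance.
    """
    bytes_total = data_tb * 1e12
    bytes_per_sec = link_gbps * 1e9 / 8 * efficiency
    return bytes_total / bytes_per_sec / 3600

# Evacuating ~10 TB over 10GbE vs 1GbE (illustrative):
print(round(rebuild_time_hours(10, 10), 1))  # ~3.2 hours
print(round(rebuild_time_hours(10, 1), 1))   # ~31.7 hours
```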
36. Firewalls
• Virtual SAN Vendor Provider (VSANVP)
  – Inbound and outbound: TCP 8080
• Cluster Monitoring, Membership, and Directory Services (CMMDS)
  – Inbound and outbound: UDP 12345 and 23451
• Reliable Datagram Transport (RDT)
  – Inbound and outbound: TCP 2233
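A quick way to sanity-check the TCP ports above from another machine; a minimal sketch, assuming a placeholder hostname (the UDP and multicast traffic for CMMDS cannot be verified with a simple connect like this):

```python
import socket

# Virtual SAN TCP ports from the slide; UDP 12345/23451 (CMMDS) needs
# different tooling to verify.
VSAN_TCP_PORTS = {8080: "VSANVP", 2233: "RDT"}

def check_tcp_port(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for port, service in VSAN_TCP_PORTS.items():
    ok = check_tcp_port("esxi-host-01.example.com", port)  # placeholder host
    print(f"{service} (TCP {port}): {'reachable' if ok else 'blocked'}")
```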
39. Why Should You Care?
1. You want to bring added value to your customers
2. You don't want to over- or under-size
3. You want your customer to understand why:
   • The sizing recommendation
   • The VSAN value
40. Sizing & TCO Made Easy (or at least easier...)
VSANTCO.VMware.com