2. Who is Duncan Epping?
• Writer @ Yellow-Bricks.com
• Author of Essential Virtual SAN
• Author of Clustering Deepdive
• Job: Chief Technologist @ VMware
• VMware: VCDX
• Social: @DuncanYB (twitter)
7. It is all about the app
App
VMs
Compute
Storage
Network
Clusters
8. What are the things we need to think about?
Consistency is the key to success
• Compute
– DNS / NTP / TPS
• Storage
– Protocol / Limits / Resiliency
• Networking
– vMotion / Management / Storage / VMs
• vSphere HA and DRS
9. Brief intro to vSphere Clusters
vSphere HA Basics
• Configured through vCenter Server
• Each host has an agent (FDM) for monitoring state
• HA restarts VMs when a failure impacts those VMs
10. Brief intro to vSphere Clusters
vSphere HA Specifics
• One of the hosts is elected as master
• Heartbeats via network and storage
– Management network (or)
– VSAN network (if VSAN is enabled)
• It can reserve resources for restarts (Admission Control)
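A rough worked example of the arithmetic behind reserving restart capacity. This is a simplified sketch, not how vSphere HA itself computes admission control: it assumes all hosts are the same size, and the function name `reserved_percentage` is made up for illustration.

```python
def reserved_percentage(total_hosts: int, failures_to_tolerate: int) -> int:
    """Share of cluster capacity (in percent, rounded up) to keep free
    so that the VMs of `failures_to_tolerate` hosts can be restarted.

    Assumption: every host contributes the same amount of capacity.
    """
    if total_hosts <= 0:
        raise ValueError("total_hosts must be positive")
    if not 0 <= failures_to_tolerate < total_hosts:
        raise ValueError("failures_to_tolerate must be between 0 and total_hosts - 1")
    # Integer ceiling division avoids floating point rounding surprises.
    return (100 * failures_to_tolerate + total_hosts - 1) // total_hosts


four_host_cluster = reserved_percentage(4, 1)
eight_host_cluster = reserved_percentage(8, 1)
print(four_host_cluster, eight_host_cluster)
```

The point of the sketch is that the reserved share shrinks as the cluster grows, which is one reason cluster size is a design decision and not just a sizing exercise.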
11. Brief intro to vSphere Clusters
vSphere DRS Basics
• DRS provides load balancing and initial placement
– To keep VMs happy and maximize cluster utilization
• DRS is the broker of resources between producers and consumers
• The goal of DRS is to provide the resources the virtual machine demands
13. And then there is compute
Many things to think about during install / config
• Gateway / DNS
• NTP
• NUMA
• Syslog + Scratch Partition
• TPS enabled or disabled?
– If enabled, how?
• Security?
– Lockdown mode enabled?
14. Storage, you got an hour or two?
iSCSI, FC, FCoE or maybe VSAN
• Many different storage systems
• Many different design considerations
– And also implications for, for instance, vSphere HA
– PDL / APD
– Stretched? Replication? Sync / Async?
• Resignature? Mount? Orchestration of DR?
• Number of Paths, Number of LUNs
• Performance aspects – RAID Types – Flash vs Hybrid
15. It is always the network
Yes, we usually do blame others… Reality is, many issues arise from inconsistency...
• Distributed Switch vs standard vSwitch?
• Consistency in configuration of network segments
– VLANs / Portgroups
– MTU (end to end)
• Load Balancing
– Load based teaming
– Virtual Port ID
– IP Hash / LACP
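One way to make "consistency" concrete is to check a single setting, such as MTU, across every host and flag the odd ones out. The sketch below is illustrative only: the host names and MTU values are invented, the inventory is a hard-coded dictionary instead of data pulled from vCenter, and "most common value wins" is an assumption that may not match the intended design.

```python
from collections import Counter


def find_mtu_outliers(mtu_by_host: dict[str, int]) -> dict[str, int]:
    """Return the hosts whose MTU differs from the most common MTU."""
    if not mtu_by_host:
        return {}
    # The most frequent value is treated as the expected one (an assumption).
    expected, _count = Counter(mtu_by_host.values()).most_common(1)[0]
    return {host: mtu for host, mtu in mtu_by_host.items() if mtu != expected}


# Hypothetical inventory: three hosts on jumbo frames, one left at the default.
inventory = {"esx01": 9000, "esx02": 9000, "esx03": 1500, "esx04": 9000}
outliers = find_mtu_outliers(inventory)
print(outliers)
```

A check like this only covers the hosts. End to end MTU also depends on the physical switches in the path, which this sketch does not look at.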
30. But I already have a vSphere environment!
• How do I pull the config out of it?
• Leverage PowerCLI as a starting point …
– DRS Rules (affinity, anti-affinity, vm-to-host)
– VDS and Port Group configs
– Resource pools
– Generic cluster configs
– VSAN & SPBM policies
38. Gathering Objects with PowerCLI
• Get information on the cluster
– HA, NTP, SSH, DRS, DNS, and so forth
• Compare with declarative configuration
• Inspect results
– Validate always
– Remediate optional
• Report metrics
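The gather, compare, inspect, report loop above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the setting names and values are hypothetical, the "actual" dictionary stands in for whatever was gathered from the cluster (for example with PowerCLI), and `validate`, `remediate`, and `report` are made-up helper names, not part of any tool.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    setting: str
    declared: object
    actual: object

    @property
    def compliant(self) -> bool:
        return self.declared == self.actual


def validate(declared: dict, actual: dict) -> list[Finding]:
    """Always run: compare every declared setting with the gathered value."""
    return [Finding(name, value, actual.get(name)) for name, value in declared.items()]


def remediate(findings: list[Finding], actual: dict) -> dict:
    """Optional: return a copy of `actual` with drifted settings corrected."""
    fixed = dict(actual)
    for finding in findings:
        if not finding.compliant:
            fixed[finding.setting] = finding.declared
    return fixed


def report(findings: list[Finding]) -> dict:
    """Summarize the validation run as simple counters."""
    compliant = sum(1 for finding in findings if finding.compliant)
    return {"checked": len(findings), "compliant": compliant,
            "drifted": len(findings) - compliant}


# Hypothetical declared configuration and gathered cluster state.
declared = {"ha_enabled": True, "drs_enabled": True,
            "ntp_server": "ntp.example.test", "ssh_enabled": False}
actual = {"ha_enabled": True, "drs_enabled": False,
          "ntp_server": "ntp.example.test", "ssh_enabled": True}

findings = validate(declared, actual)
summary = report(findings)
after_fix = remediate(findings, actual)
print(summary)
```

Keeping `remediate` separate from `validate` mirrors the slide: validation can run on every pass, while remediation stays a deliberate, optional step.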
Everyone knows HA can respond to a Host Failure
Most people know HA can respond to an isolation but…
Did you know HA can respond to a Guest OS failure?
Did you know HA can respond to an Application failure?
Did you know HA can respond when a VM process fails?
Did you know HA can respond to a Storage failure?
Moving forward, one of the key drivers is to build a data center that can be declared as an end state. This is in opposition to hand crafting a data center as individuals.
Imperative models have long ruled the data center. This is a process in which Operations configures each device to do specific things, typically one at a time, without a real focus on the data center holistically. Declarative models imply that you craft the intent of your resources and allow the lower level system to determine the best way to execute your desires.
Take Uber’s ridesharing app as an example. You instruct the app with your destination and the class of service (UberX, UberBLACK, etc.) you want. It then handles all of the low level details by finding a driver, supplying a route, and processing payment. Do you dwell on how this is done, or do you simply wish to remove friction and consume the service?
How does this sort of value translate into the world of declarative data centers? And how can this be used for the design process?
First, using any sort of configuration management model will largely eliminate the legacy mindset of building by hand as individuals. Instead, statements can be created as a team, in real time, and become actionable because they not only define how a data center should look but can also be fed into a management tool to make change happen.
Once this has been done, change is predictable and repeatable. Because a configuration value has been set by the team, it becomes reality when fed into a configuration management tool. Drift (change) is remediated on a schedule. Otherwise, configuration values are often changed both randomly (on specific servers) and inconsistently (different values based on who made the change and what they believe the value should be).
If you consider this, then, you have now created what is known as a Force Multiplier. The entire team is now empowered to view, create, and enforce consistency within the data center. There is no “one guru person” that knows how things are done, or at least – there shouldn’t be!
Because declarative configurations are also enforced within the data center, they become a living set of documentation. Almost all config tools allow for comments and verbose descriptions. Rather than keeping documentation separate from action, why not couple them? After all, documentation is stale the moment it is created, because change is a constant.
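As a small illustration of coupling documentation with action, each declared setting can carry its own explanation, and the human readable document can be generated from the same data that gets enforced. The `Setting` type, the `render_docs` helper, and all names and values below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Setting:
    name: str
    value: object
    reason: str  # the "why", stored next to the value it explains


def render_docs(settings: list[Setting]) -> str:
    """Render one documentation line per declared setting."""
    return "\n".join(f"{s.name} = {s.value!r}  # {s.reason}" for s in settings)


# Hypothetical declared settings with the team's reasoning attached.
SETTINGS = [
    Setting("ntp_server", "ntp.example.test", "All hosts share one time source"),
    Setting("ssh_enabled", False, "Shell access is opened only for troubleshooting"),
]

docs = render_docs(SETTINGS)
print(docs)
```

Because the document is rendered from the declared values, changing a setting and changing its documentation happen in the same place.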
It’s important to separate the living state of a system from the declarative configuration of that system.