2. Agenda
• Understanding Windows Clustering
• Working with SQL Clustering
• Monitoring Clustering
• Troubleshooting Clustering
www.optimizesql.com/blog SQLDBA
3. HA Overview
SQL Server High Availability
The goal of high availability (HA) is to keep systems, applications, email, databases, and similar services always running.
4. Importance of HA
Server downtime is unavoidable, but we have to keep the business running and competitive.
A server may go offline due to:
• Maintenance
  • Upgrades (software or hardware)
  • Updates (hot fixes, security patches)
• Accidents
  • Power outages
  • Disasters
5. Introduction to Clustering
A cluster is a group of two or more servers (nodes) that work together and present themselves as a single server (virtual server) on the network.
A server cluster is a collection of servers, called nodes, that communicate with each other to make a set of services highly available to clients.
Server clusters are designed for applications that have long-running in-memory state or frequently updated data.
6. Introduction
• A Microsoft SQL Server Cluster is simply a collection of two or
more physical servers.
• These Servers are called Nodes.
• All nodes have the same access to shared storage, which provides the resources required to store the database files.
• The nodes talk to one another over a network.
• If one node stops communicating with the other, the other node takes ownership of the SQL Server service. This process is called failover.
• A failover can occur either automatically (a server stops communicating for some reason) or manually.
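The failover behaviour described above can be sketched as a small model. This is an illustration only, not how the Windows Cluster Service is implemented; the `Node` and `Cluster` classes and their method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    responding: bool = True  # False once the node stops communicating


@dataclass
class Cluster:
    nodes: list
    owner: str  # name of the node that currently owns the SQL Server service

    def fail_over(self, target=None):
        """Manual failover: move ownership to `target`, or to any healthy non-owner node."""
        candidates = [n for n in self.nodes if n.responding and n.name != self.owner]
        if target is not None:
            candidates = [n for n in candidates if n.name == target]
        if not candidates:
            raise RuntimeError("no healthy node available to take ownership")
        self.owner = candidates[0].name
        return self.owner

    def check(self):
        """Automatic failover: if the owner has stopped responding, move the service."""
        current = next(n for n in self.nodes if n.name == self.owner)
        if not current.responding:
            return self.fail_over()
        return self.owner


cluster = Cluster(nodes=[Node("ServerA"), Node("ServerB")], owner="ServerA")
cluster.nodes[0].responding = False   # ServerA stops communicating
print(cluster.check())                # ownership moves to ServerB
```

The same `fail_over` method covers the manual case: an administrator names the target node instead of waiting for a communication failure.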
7. Basic Architecture
[Figure: two-node cluster diagram. Labels: Client PCs; two hubs; Server A and Server B, each with local drives C and D; heartbeat; cluster management; SQL Server virtual server; shared disk array with drives E, F, and G. Folder labels: Binn, Install, Upgrade (shown twice); Backup, Data, FTData, Job, Log, repldata.]
8. Basic Architecture
[Figure: two-node cluster diagram. Labels: Client PCs; two hubs; Server A and Server B, each with local drives C and D; heartbeat; cluster management; SQL Server virtual server; shared disk array with drives E, F, and G.]
9. Importance of HA
Feature              | Database Mirroring        | Failover Clustering | Log Shipping
Data Loss            | No data loss option       | No data loss        | Maybe
Failover             | Automatic failover option | Automatic failover  | No
Failover time        | Seconds                   | ~20+ seconds        | Manual
Special Hardware     | No                        | Certified hardware  | No
Redundancy           | Complete redundancy       | Disks are shared    | Complete redundancy
Multiple Secondaries | No                        | No                  | Yes
Standby Read Access  | Yes, through snapshot     | No                  | Yes, WITH STANDBY option
Granularity          | Database                  | Instance            | Database
Conn String          | Two                       | One                 | Two
10. Advantages
• High Availability
• Protection from failures
• Server level – hardware failures, software failures, service failures, etc.
• Site level – fires, earthquakes, etc.
• Online Administration
• Software and hardware upgrades, patches, and restarts with minimal downtime.
• Increased Scalability
• In some cases, clustering can be used to increase the
scalability of an application. For example, if a current cluster is
getting too busy, another server could be added to the cluster to
expand the resources and help boost the performance of the
application.
• Clustering is transparent to the calling application.
11. Advantages
• Manageability
• Enables managing resources across the entire cluster as if we were managing a single computer.
• Instance level redundancy and automatic failover for
SQL Server
12. Advantages
• Reduces downtime.
• Allows for an automatic response to a hardware or software failure.
• Allows you to perform upgrades without forcing users off the
system for extended periods of time.
• Clustering doesn’t require any servers to be renamed. So
when failover occurs, it is relatively transparent to end-users.
• Failing back is relatively quick, and can be done whenever the
primary server is fixed and put back on-line.
13. Disadvantages
• A failover cluster is NOT designed to:
• Protect data
• Protect against failure of the shared disk array
• Load balance
• Protect servers from potential data disasters
• Requires more on-going maintenance than other alternatives.
• Requires more experienced DBAs and network administrators.
14. Disadvantages
• Clustering can be expensive.
• Requires more set up time than other alternatives.
15. What SQL Server services can we cluster?
• Clusterable
• SQL Server
• SQL Server Agent
• Analysis Services
• Non-clusterable
• SQL Server Integration Services
• SQL Server Reporting Services
• SQL Browser
• SQL Writer
• Full-text search (FTS) service?
• From SQL Server 2008 onward, the FTS service is integrated into the SQL Server engine.
16. Active and Passive
• SQL Server offers single-instance clusters and multi-instance clusters.
Single Instance
Only one SQL Server instance runs on the cluster at any given time, on either the first or the second node (Active-Passive).
Multi Instance
Two nodes run two or even four instances of SQL Server. Alternatively, with three nodes and two instances of SQL Server (Active-Active-Passive), the third node serves as a standby, ready to take ownership if Node1 or Node2 fails.
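The Active-Active-Passive layout can be illustrated with a short sketch. The function `fail_node` and the instance names are hypothetical, and the rule "move to the surviving node with the fewest instances" is an assumption made for illustration; a real cluster picks the target from its configured owners.

```python
def fail_node(placement, failed, nodes):
    """Return a new instance -> node mapping after `failed` goes down.

    Each instance on the failed node moves to the surviving node that
    currently hosts the fewest instances (an illustrative rule only).
    """
    survivors = [n for n in nodes if n != failed]
    if not survivors:
        raise ValueError("no surviving node to take ownership")
    new_placement = dict(placement)
    for instance, node in placement.items():
        if node != failed:
            continue
        # Count how many instances each surviving node hosts right now.
        load = {n: 0 for n in survivors}
        for owner in new_placement.values():
            if owner in load:
                load[owner] += 1
        new_placement[instance] = min(survivors, key=lambda n: load[n])
    return new_placement


nodes = ["Node1", "Node2", "Node3"]                      # Node3 is the standby
placement = {"SQLINST1": "Node1", "SQLINST2": "Node2"}   # two active instances
print(fail_node(placement, "Node1", nodes))
```

With Node3 idle, the instance from Node1 lands on Node3 and the instance on Node2 is untouched, which matches the standby role described on the slide.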
19. Basic Components
• A minimum of two identical servers
• Two NICs per server: private and public
• Storage (optional)
• Shared disk storage (SAN)
  • Quorum (maintains cluster metadata): 256 MB
  • MSDTC (replication, distributed transactions)
  • SQL Server (Backup, FTData, Data, repldata, Log, Job)
  • Tempdb
  • Data and transaction log files
• Distributed Transaction Coordinator (DTC)
• Operating system, service, or application
• Domain controller
20. Failover Clustering Terminology
• SQL Server virtual server
• A cluster-configured resource group that contains all resources necessary for SQL Server to operate on the cluster. This includes:
• The NetBIOS name of the virtual server
• The TCP/IP address of the virtual server
• All disk drives
• The SQL Server services
21. Failover Clustering Terminology
• Heartbeat
• A single User Datagram Protocol (UDP) packet is sent every 500 milliseconds between nodes in the cluster across the internal private network.
• This packet relays health information about the cluster nodes as well as health information about the clustered application.
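A minimal sketch of how missed heartbeats could be counted, assuming the 500 ms interval stated above. The threshold of five missed heartbeats and the function names are assumptions for illustration, not values taken from the slides or from the Cluster Service.

```python
HEARTBEAT_INTERVAL_MS = 500  # interval stated on the slide
MISSED_LIMIT = 5             # assumed threshold, chosen only for this example


def missed_heartbeats(last_heartbeat_ms, now_ms, interval_ms=HEARTBEAT_INTERVAL_MS):
    """Number of whole heartbeat intervals that have passed without a packet."""
    return max(0, (now_ms - last_heartbeat_ms) // interval_ms)


def is_node_down(last_heartbeat_ms, now_ms,
                 interval_ms=HEARTBEAT_INTERVAL_MS, missed_limit=MISSED_LIMIT):
    """Treat the node as down once `missed_limit` heartbeats in a row are missing."""
    return missed_heartbeats(last_heartbeat_ms, now_ms, interval_ms) >= missed_limit


# Last packet arrived at t=0 ms; check again at 2000 ms and at 2500 ms.
print(is_node_down(0, 2000))  # four intervals missed, still below the limit
print(is_node_down(0, 2500))  # five intervals missed, node treated as down
```

Requiring several consecutive misses rather than one avoids triggering a failover on a single dropped UDP packet.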
22. Failover Clustering Terminology
• Failover
• Failover is the process in which one node in the cluster changes state from offline to online.
• As a result, that node takes over responsibility for the SQL Server virtual server.
• The Cluster Service fails over a group when a node becomes unavailable or one of the resources in the group fails.
23. Failover Clustering Terminology
• Failback
• Failback is the process of moving a SQL Server virtual
server that failed over in the cluster back to the original
online node.
24. Failover Clustering Terminology
• Quorum Resource
• The quorum resource is also referred to as the witness disk in Windows Server 2008.
• It is the shared disk that holds the cluster server’s configuration information.
• All servers must be able to contact the quorum resource to become part of a SQL Server 2008 cluster.
25. Failover Clustering Terminology
• Resource Group
• A collection of cluster resources such as the SQL Server
NetBIOS name, TCP/IP address, and the services belonging
to the SQL Server cluster.
• A resource group also defines the items that fail over to
surviving nodes during failover.
• A resource group is owned by only one node in the cluster at a time.
26. Failover Clustering Terminology
• LUNs
• A LUN (logical unit number) identifies a disk or disk volume that the shared storage device presents to one or more host servers.
27. Preparing Windows Clustering
• Pre-installation checklist
• Ensure that all nodes are working and configured properly.
• Confirm that each node can access the shared array or SAN drives.
• Verify that none of the nodes has been configured as a domain controller.
• Verify that all drives are NTFS and are not compressed.
• Ensure that the private and public networks are properly configured.
• Verify that you have disabled NetBIOS for all private network cards.
• Verify that the Windows Task Scheduler service is running on each node.
• Use a domain admin account to configure the Windows cluster.
• Use a separate account for the cluster service.
• Add the cluster service account to the Local Administrators group on all nodes in the cluster.
• Decide on the Windows cluster virtual name and virtual IP.
28. Preparing Windows Clustering
• Pre-installation checklist
• Ensure that shared drives are available for the following:
• Quorum
• MSDTC
• TempDB
• User-defined database data files
• User-defined database transaction log files
• Backups
29. Preparing Windows Clustering
• IP Address Requirements
Name of Resource                           | Number of IP Addresses
Private Network – heartbeat (one per node) | 2
Public Network (one per node)              | 2
MSDTC                                      | 1
Windows Cluster Name                       | 1
SQL Cluster Name                           | 1
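The counts in the table can be added up with a small helper. The function name `required_ip_addresses` is hypothetical, and scaling the SQL cluster name by the number of SQL Server instances is an assumption; the table itself only covers two nodes and one SQL cluster name.

```python
def required_ip_addresses(node_count=2, sql_instances=1):
    """IP addresses needed per resource, following the table above."""
    return {
        "Private network (heartbeat)": node_count,  # one per node
        "Public network": node_count,               # one per node
        "MSDTC": 1,
        "Windows cluster name": 1,
        "SQL cluster name": sql_instances,          # assumption: one per instance
    }


plan = required_ip_addresses()
for resource, count in plan.items():
    print(f"{resource}: {count}")
print("Total:", sum(plan.values()))  # 7 for two nodes and one SQL cluster name
```

For the two-node case in the table the total is 7 addresses, which is worth reserving before installation starts.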