MICROSOFT SQL SERVER
HIGH AVAILABILITY
AND DISASTER RECOVERY
Michael Poremba // October 2008
Database HA & DR
Experience…
 Work with business to determine HA or DR
requirements for applications and data?
 Design HA or DR solutions?
 Administer HA or DR process?
 Still learning MS SQL Server HA & DR
capabilities?
2
Scope of this Presentation
Presentation Focus
 Data Availability
 Data recovery
 High availability
 Disaster recovery
 Technology Focus
 MS SQL Server
 Physical servers
 SANs
Beyond Scope of Presentation
 In-depth how-to (available elsewhere)
 Partitioned views (federated)
 Advanced DBA techniques
 Custom application logic
 3rd-party software solutions
 Alternate DBMS engines (e.g. Oracle; DB2)
 HA on virtual machines
 Complex scenarios & solutions
 Load balancing
3
So, you need to make your
production database bulletproof…
Introduction to Data
Availability
4
Data Availability Continuum
Degrees of protection for information systems:
Degree of Protection | Business Risk | Solution
Data Recovery | Data loss | Redundant data
High Availability | Downtime of database service | Redundant system components
Disaster Recovery | Downtime of business operations | Redundant systems and facilities
5
Business Case for Availability
High Availability
 Keep business-critical applications available
 Secondary:
 Server maintenance
Disaster Recovery
 Protect against loss of data center
 Secondary:
 Application upgrades
 Infrastructure upgrades
6
Service Level Agreement (SLA)
 Permitted downtime (planned vs. unplanned?)
 Acceptable data/transaction loss
 Application response times
 Mean time to recovery
Note: Database uptime is not equivalent to application availability
 Failures of other application services
 Network outages
Uptime SLA | Downtime per Year | Downtime per Month
99.9% | 8.76 hours | 43.8 minutes
99.99% | 52.6 minutes | 4.38 minutes
99.999% | 5.26 minutes | 0.438 minutes
7
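The downtime figures in the SLA table follow from simple arithmetic on the uptime percentage. A minimal sketch, assuming a 365-day year and equal-length months (the function name `downtime_budget` is illustrative, not from the presentation):

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years are ignored


def downtime_budget(uptime_percent: float) -> tuple[float, float]:
    """Return permitted downtime as (minutes per year, minutes per month)."""
    unavailable_fraction = 1 - uptime_percent / 100
    minutes_per_year = unavailable_fraction * HOURS_PER_YEAR * 60
    return minutes_per_year, minutes_per_year / 12


for sla in (99.9, 99.99, 99.999):
    per_year, per_month = downtime_budget(sla)
    print(f"{sla}% uptime: {per_year:.1f} min/year, {per_month:.2f} min/month")
```

The same function can be used to check a proposed SLA against the expected length of a maintenance window or a measured failover time.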
Protect What?
 Application data stores
 Databases
 Files
 Other data repositories
 Database services
 DBMS availability for applications
 Application services
 Application availability for users and external systems
Databases are the heart of most information
systems;
they deserve the highest affordable protection.
8
Database Failure Scenarios
Physical Infrastructure Failures
 Storage subsystem
 Disk
 Controller
 Network
 Server
 Power
Logical Data Failures
 Operator errors
 DBMS interruption
 Drops / deletes
 Application defects
 DBMS defects
 Data corruption
9
Service Recovery Strategies
Standby Mode | Failover Behavior | SQL Server Feature
Cold standby | Manual intervention required to restore offline data copy | Backup and restore
Warm standby | Data copy online and ready; manual failover required | Transaction log shipping; database mirroring
Hot standby | Automatic failover | Database mirroring; failover clustering
10
Data Recovery: Terminology
Terminology varies for source vs. copy
High Availability Strategy | Data Source | Data Copy
Backup and Restore | Database | Backup
Log Shipping | Primary | Secondary; Standby
Database Mirroring | Principal | Mirror
Failover Clustering | Primary; Active | Secondary; Passive; Standby; Inactive
11
[Briefly…]
Data Recovery
12
Database Backups
 Traditional backup types
 Full backup
 Differential backup
 Transaction log backup
 Disk is better than tape
 First backup to disk (separate physical disk volume)
 Detect exceptions encountered during backup
 Verify backup files
 Copy backup files to tape or remote disk
 Data retention policy for backup files
13
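The disk-first backup practices above (back up to a separate disk volume, detect exceptions, verify the file) can be sketched as a pair of T-SQL statements generated from Python. This is an illustration under assumptions: the database name and path are hypothetical, and the helper `backup_statements` is not part of any library. `WITH CHECKSUM` asks SQL Server to validate page checksums while writing and again while verifying.

```python
def backup_statements(database: str, backup_file: str) -> list[str]:
    """Build T-SQL that backs up a database to disk, then verifies the file."""
    db = database.replace("]", "]]")       # escape for a bracketed identifier
    path = backup_file.replace("'", "''")  # escape for a string literal
    return [
        # Full backup to a disk file; INIT overwrites existing backup sets.
        f"BACKUP DATABASE [{db}] TO DISK = N'{path}' WITH CHECKSUM, INIT;",
        # Confirm the backup file is complete and readable without restoring it.
        f"RESTORE VERIFYONLY FROM DISK = N'{path}' WITH CHECKSUM;",
    ]


# Hypothetical database name and backup location.
for statement in backup_statements("Sales", r"E:\Backups\Sales_full.bak"):
    print(statement)
```

A verified file still needs to be copied to tape or a remote disk; verification does not replace a periodic test restore.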
Database Backup Strategy
Backup of user databases not sufficient for
recovery
 System database
 Master database
 MSDB database
 Model database
 External data stores…
14
Synch with External Data
Stores
Synchronize recovered database with external
data stores:
 Identity column seeds
 Full-text indexes
(SQL Server 2000)
 LDAP entries
 File system objects
 Other databases
15
Backup Retention Policy
 Location of backup files
 Duration of retention
 Protection of sensitive data
 Sarbanes/Oxley (SOX)
 HIPAA
 Internal policies for data management and
protection
 Access to backups from offsite data storage
16
Data Recovery Process
 Backup file sets
 Full baseline, differential,
and transaction logs
 Retrieving backup files
 Offsite storage
 Tape
 Network copy
 Dependency on multiple
people to get access to
backup files
 Recovery strategy
depends on failure
scenario
 Create comprehensive
failure matrix
 Devise recovery strategy
for each scenario
 Does worst-case recovery
scenario fit within SLA
parameters?
 Recovery time; SLA
 Include future data growth
in recovery plan
 Fully test recovery
strategies—practice is
essential
17
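To make the backup file set concrete, the sketch below picks which files to restore, and in what order, from a catalog of full, differential, and transaction log backups. It is a simplified model: the `Backup` record and its fields are hypothetical, times are plain integers, each differential is assumed to be based on the most recent full backup, and recovery stops at the last backup completed at or before the target time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    kind: str      # "full", "diff", or "log"
    finished: int  # completion time, e.g. minutes since midnight
    name: str


def restore_chain(backups: list[Backup], target: int) -> list[Backup]:
    """Return the backups to restore, in order, to recover up to `target`."""
    usable = sorted(
        (b for b in backups if b.finished <= target), key=lambda b: b.finished
    )
    fulls = [b for b in usable if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup available at or before the target time")
    base = fulls[-1]  # latest full baseline
    diffs = [b for b in usable if b.kind == "diff" and b.finished > base.finished]
    start = diffs[-1] if diffs else base  # only the latest differential is needed
    logs = [b for b in usable if b.kind == "log" and b.finished > start.finished]
    return [base] + diffs[-1:] + logs


catalog = [
    Backup("full", 0, "full_0000"),
    Backup("log", 10, "log_0010"),
    Backup("diff", 20, "diff_0020"),
    Backup("log", 30, "log_0030"),
    Backup("log", 40, "log_0040"),
]
print([b.name for b in restore_chain(catalog, target=45)])
```

Every file in the returned chain has to be retrievable within the recovery time allowed by the SLA, which is why the location of each file matters as much as its existence.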
High Availability
18
High Availability
 Minimize or avoid service downtime
 Whether planned or unplanned
 When components fail,
service interruption is brief or non-existent
 Automatic failover
 Eliminate single points of failure (as affordable)
 Redundant components
 Fault-tolerant servers
19
Redundant Components
Objective: Avoid single points of failure (where affordable)
Approach: Use redundant components for database service
 Database server nodes
 Server components
 ECC RAM; failure-tolerant HW & OS
 DBMS instance
 User databases
 Storage devices
 Storage unit components
 MPIO: Interfaces; paths; switches; controllers
 RAID: Disks
 Networking
 MPIO: Interfaces; paths; switches
 Data copies
 E.g. Recovering torn page from mirror in SQL Server 2008
20
Transaction Log Shipping
 Warm standby solution
 Duplicate user database
 Copy transaction logs to standby server & restore
 Database available for read-only access
 Users must disconnect for logs to be applied
 Two database licenses required if querying
standby
 Manual application failover
 Supported on standard hardware
 Possible data loss (unapplied transactions)
21
Database Mirroring
 Redundancy at user database level
 Duplicate copy of user database
 Independent storage devices
 Multiple copies of instance databases
 Mirrored over private network channel
 Mirror always redoing transactions from principal
 Negligible impact on transaction throughput
 Multiple mirroring modes:
 High-availability: commit @ log on mirror; automatic
failover
 High-protection: commit @ log on mirror; manual failover
 High-performance: commit when logged on principal
 Very fast automatic failover—seconds
 Requires witness server
 Mirror-aware application client connection
 Provided by client library
 Database connection string must specify both servers
 Mirror may be available for read-only access
(snapshots)
 Works with standard hardware
[Diagram: node A and node B, each with local storage holding its own local system DBs; one node holds the source user DB and the other holds the mirror user DB; optional witness]
22
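The requirement that the connection string name both servers can be illustrated with an ADO.NET-style string, where `Failover Partner` identifies the mirror. A minimal sketch, assuming Windows authentication; the server and database names are hypothetical, and the helper `mirrored_connection_string` is illustrative only:

```python
def mirrored_connection_string(principal: str, mirror: str, database: str) -> str:
    """Build an ADO.NET-style connection string that names both partners."""
    parts = {
        "Data Source": principal,     # server tried first
        "Failover Partner": mirror,   # server tried if the principal is unavailable
        "Initial Catalog": database,  # required so the client knows which mirrored DB
        "Integrated Security": "True",
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


# Hypothetical server and database names.
print(mirrored_connection_string("SQLNODEA", "SQLNODEB", "Sales"))
```

Other client libraries spell the keyword differently, so check the driver documentation before copying the string.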
Mirror Witness
 With mirroring, more than one server is required
to decide on failover
 Witness automates failover from principal to mirror
 Watches database availability
 Reports observations back to principal and mirror
 Runs in separate SQL Server instance (Express is
OK)
 Prevents “split brain” scenario
 Very low resource consumption
 Can be witness for multiple databases
 Not a single point of failure
23
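The reason a third server is needed can be shown with a small quorum model. This is a deliberately simplified sketch, not SQL Server's actual protocol: it assumes three participants (principal, mirror, witness), symmetric network links, and the rule that a partner may serve the database only while it can reach at least one other participant.

```python
from itertools import product


def principal_keeps_serving(link_pm: bool, link_pw: bool) -> bool:
    """The principal needs contact with the mirror or the witness."""
    return link_pm or link_pw


def mirror_takes_over(link_pm: bool, link_pw: bool, link_mw: bool) -> bool:
    """The mirror promotes itself only if it has lost the principal,
    can reach the witness, and the witness has also lost the principal."""
    return (not link_pm) and link_mw and (not link_pw)


# Check every combination of link states: both partners never serve at once.
for link_pm, link_pw, link_mw in product([True, False], repeat=3):
    both = principal_keeps_serving(link_pm, link_pw) and mirror_takes_over(
        link_pm, link_pw, link_mw
    )
    assert not both, "split brain"
print("no split-brain state found in this model")
```

Without the witness, a broken link between the two partners would leave each unable to tell a dead peer from a dead network, which is exactly the split-brain case.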
SQL Server Failover Clustering
[Diagram: node A and node B attached to shared storage holding the system DBs, user DBs, and quorum]
 Two clustered nodes
 Active/Passive config
 MS SQL services
 Running on virtual
server
 Shared storage device
 User databases
 System databases
 Quorum drive
 Redundant internal
components
24
Active/Passive Failover
Clustering
 Redundancy at database instance
level
 All databases fail over together
 Shared copy of system databases
 Single data copy on shared storage
device
 No I/O overhead reducing throughput
 Storage unit is single point of failure
for cluster
 All database services are clustered
 SQL Agent; Analysis Services; Full-Text engine; MS DTC
 Automatic failover (up to minutes)
 DBMS accessed over virtual IP
 Database not available from inactive
node for DB client connections
 Storage is controlled by one cluster
node at a time
 Requires hardware certified by
Microsoft for Microsoft Cluster
Service
[Diagram: node A and node B attached to shared storage holding the system DBs, user DBs, and quorum]
25
HA Comparison
Database Mirroring
 Scope: user DB
 Standard hardware
 One SQL license (unless querying snapshots on mirror)
 Very fast failover (seconds)
 OS flexible (e.g. 32/64)
 Independent storage
 Independent services
 Reporting on mirror
 Geographic separation OK
Failover Clustering
 Scope: DBMS instance
 Certified hardware
 One SQL license (only one node can access database)
 Automatic failover (up to minutes)
 Enterprise OS
 Shared storage
 Clustered services
 Standby not available
 Servers are usually co-located
26
Considerations for HA
 HA complements backup and recovery strategy
 Does not replace data recovery plan
 Application service availability is often determined
by a network of interdependent services
 Availability can be difficult to define (e.g. partial
failures)
 Failure probability difficult to measure or compute
 Increased system complexity could lead to lower
service availability!
 Operator error a leading cause of availability issues
 Increased number/types of system components
 More complex to configure and administer
27
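The point that application availability depends on a network of interdependent services can be made numerically. A minimal sketch, assuming component failures are independent (which real systems rarely satisfy exactly); the function names and the example figures are illustrative:

```python
from math import prod


def serial_availability(components: list[float]) -> float:
    """Availability of a chain in which every component must be up."""
    return prod(components)


def redundant_availability(component: float, copies: int) -> float:
    """Availability of identical redundant copies where any one suffices."""
    return 1 - (1 - component) ** copies


# Hypothetical figures: database, network, and application tier.
chain = [0.9999, 0.999, 0.999]
print(f"end to end: {serial_availability(chain):.4%}")
print(f"two redundant 99% nodes: {redundant_availability(0.99, 2):.4%}")
```

A chain is always less available than its weakest link, so adding components to an HA design only helps when the redundancy gained outweighs the extra points of failure and operator error.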
Data Recovery Requirements
Requirements | Backup and Recovery | Log Shipping | DB Mirroring (High-Performance) | DB Mirroring (High-Protection) | DB Mirroring (High-Availability) | Failover Clustering
Cost | Low | Low/Med | Medium | Medium | Medium | High
Relative complexity | Low | Low | Medium | Medium | High | High
Data loss | Possible | Latest log | Possible | None | None | None
Scope of duplication | Database | Database | Database | Database | Database | DBMS
Failover | Downtime | Downtime | Manual | Manual | Seconds | Up to minutes
Client redirect | Manual | Manual | Automatic | Automatic | Automatic | Automatic
Rolling upgrades & maint. | No | No | OS & DB | OS & DB | OS & DB | OS
Access data on | Restore | Read-only | Snapshot | Snapshot | Snapshot | No
28
Disaster Recovery
29
Disaster Recovery
 Minimize downtime of business operations
 Redundant systems and facilities
 SQL Server features:
 Transaction log shipping
 Database mirroring
 Failover clustering
 Other technologies
 Storage-based mirroring
30
Disaster Recovery Planning
 Data security requirements
 Clarify SLA, data loss allowance
 Evaluate system cost vs. data protection
 Failure analysis
 System redundancy
 Process validation
 Training for personnel
 Prevention practices
 Executing disaster recovery and business continuity
 Practice, practice, practice
31
Business Continuity Facility
 System redundancy
 Systems: Web servers; app servers; database, etc.
 Data: Databases; data files on OS; security info,
etc.
 Networking: Domain, routing, subnet, VIPs, etc.
 Alternate facilities
 Network bandwidth
 Physical or network access by operations staff
 Failover
 Often a deliberate decision, using manual failover
32
Data Redundancy
 Synchronous redundancy
 Network bandwidth cost
 Network latency and application performance
 Network reliability
 Asynchronous redundancy
 Risk of data loss
 More cost-effective
 Resilient to network latency issues
 Candidate Technologies
 SQL Server database mirroring
 Failover clustering with SAN-based mirroring
33
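The trade-off between synchronous and asynchronous redundancy can be estimated with a back-of-the-envelope model. The sketch below assumes each synchronous commit waits for the local log write plus one network round trip to the remote copy, and that an asynchronous commit waits only for the local log write; the remote write time is ignored and the timings are hypothetical.

```python
def max_serial_commits_per_second(
    log_write_ms: float, round_trip_ms: float = 0.0
) -> float:
    """Upper bound on back-to-back commits issued by a single session."""
    return 1000.0 / (log_write_ms + round_trip_ms)


local_only = max_serial_commits_per_second(log_write_ms=1.0)
synchronous = max_serial_commits_per_second(log_write_ms=1.0, round_trip_ms=20.0)
print(f"asynchronous: up to {local_only:.0f} commits/s per session")
print(f"synchronous over a 20 ms link: up to {synchronous:.1f} commits/s per session")
```

Concurrent sessions overlap their waits, so total throughput falls less sharply than this per-session bound suggests, but chatty single-threaded batch jobs feel the full effect.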
DR Using Database Mirroring
 Two sites: Primary and DR location
 Separate failover clusters at each site
 SQL Server database mirroring between sites
[Diagram: failover cluster at site A (nodes A1 and A2; shared storage A with local sys DBs, local quorum, and source user DB) linked by database mirroring to failover cluster at site B (nodes B1 and B2; shared storage B with local sys DBs, local quorum, and mirror user DB); optional witness]
34
DR Using SAN-Based Mirroring
 Two sites: Primary and DR location
 Four-node failover cluster; one virtual IP
address
 SAN-based mirroring between sites
 Manual cluster failover
[Diagram: failover cluster nodes at site A (nodes A1 and A2; shared storage A with system DBs, quorum, and user DBs) linked by storage-based mirroring to failover cluster nodes at site B (nodes B1 and B2; shared storage B with system DBs, quorum, and user DBs)]
35
[Skip if time is running short.]
Complementary Technologies
36
SAN-Based Data Mirroring
 Data blocks duplicated at storage level
 Similar to transaction log shipping
 Copy performed in sequence and coordinated
with database checkpoint
 Ensures consistency of mirrored data files
 Synchronous or asynchronous mirroring
 Co-located or geographically dispersed—both are
OK
 SAN link bandwidth must support database I/O rate
 May require extra feature support from SAN
vendor
 Could rely on Failover Clustering for HA
37
SQL Server Database
Snapshots
 Read-only point-in-time database snapshot
 No data is copied—instantaneous
 Historical snapshot pages tracked separately
from changing pages
 Snapshots can be maintained indefinitely
 Limited only by available storage
 Snapshot copy can be used for reporting
 Read-only, so no locking issues
38
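Why a snapshot is instantaneous and copies no data up front can be shown with a copy-on-write model. This is a conceptual sketch, not SQL Server's implementation: pages are plain dictionary entries, and the class and method names are hypothetical.

```python
from typing import Optional


class Snapshot:
    def __init__(self, source: "Database"):
        self.source = source
        # Starts empty: creating the snapshot copies no pages.
        self.saved: dict[int, Optional[str]] = {}

    def preserve(self, page_id: int) -> None:
        """Save the original page the first time it is about to change."""
        if page_id not in self.saved:
            # None records that the page did not exist at snapshot time.
            self.saved[page_id] = self.source.pages.get(page_id)

    def read(self, page_id: int) -> Optional[str]:
        """Changed pages come from the snapshot store, the rest from the source."""
        if page_id in self.saved:
            return self.saved[page_id]
        return self.source.pages.get(page_id)


class Database:
    def __init__(self, pages: dict[int, str]):
        self.pages = dict(pages)
        self.snapshots: list[Snapshot] = []

    def create_snapshot(self) -> Snapshot:
        snapshot = Snapshot(self)
        self.snapshots.append(snapshot)
        return snapshot

    def write(self, page_id: int, data: str) -> None:
        for snapshot in self.snapshots:
            snapshot.preserve(page_id)  # copy-on-write before the change
        self.pages[page_id] = data


db = Database({1: "orders v1", 2: "customers v1"})
report_view = db.create_snapshot()
db.write(1, "orders v2")
print(report_view.read(1), "|", db.pages[1])
```

The snapshot store grows with the number of distinct pages changed since creation, which is why retention is limited by available storage rather than by the size of the database at creation time.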
SQL Server Replication
 Transactional
replication
 High transaction volume
 Low data latency
required
 Mixed technologies:
Integrates with other
DBMS
 Merge replication
 Bi-directional data
changes
 Typically server-to-client
 Snapshot replication
 Large, infrequent data
changes
 Data change latency OK
 Subscriber databases
available for reporting
 Replicate data subsets
 Some data loss is
possible
 Periodically validate
replicated data
39
App Development and Admin
40
Considerations for App
Developers
 App services tolerant to database service interruptions
 Application transactions must be handled in code—data
consistency
 Exception handling for transaction retry, connection recovery
 Requires coding standards, code reviews, and testing
 Bulk data operations
 Transaction volume impacts rollback time during failover
 Batch jobs must be run on alternate nodes
 Don’t bypass transaction logging
 Synchronization with external data sources?
 Be aware of database recovery model
 Mirroring uses FailoverPartner in connection string
 Use TCP/IP as client protocol
41
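The advice on exception handling for transaction retry and connection recovery can be sketched as a small wrapper. It is an illustration under assumptions: `TransientDatabaseError` is a hypothetical stand-in for whatever error a real driver raises when the connection drops during failover, and the `transaction` callable is expected to open a fresh connection and run the whole unit of work each time it is called.

```python
import time


class TransientDatabaseError(Exception):
    """Hypothetical stand-in for a driver's 'connection lost' error."""


def run_with_retry(transaction, attempts: int = 3, delay_seconds: float = 1.0):
    """Run `transaction()`; on a transient failure wait, then run it again."""
    for attempt in range(1, attempts + 1):
        try:
            return transaction()
        except TransientDatabaseError:
            if attempt == attempts:
                raise  # out of attempts: surface the failure to the caller
            time.sleep(delay_seconds * attempt)  # wait longer after each failure
```

Retrying is only safe when the whole transaction is re-executed from the start, because work in flight at the moment of failover is rolled back on the new active server.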
Considerations for Admins
 Use identical server hardware, when possible
 Design network redundancies, when feasible
 Consider network latency for geographic separation
 Always manage through virtual cluster, not individual cluster nodes
 Retest failover/failback after HA maintenance
 Diagnose after failover
 Repair alternate node
 Resynchronize data, as necessary
 Be aware of primary/secondary locations
 Ensure application services are connected and functioning properly
 Keep server node configurations synchronized:
 Service pack and patch levels
 Duplicate non-redundant resources
 Jobs; logins and permissions; OS & sys objects
42
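Keeping node configurations synchronized is easier to enforce when the comparison is automated. A minimal sketch, assuming each node's settings have already been collected into a dictionary; the keys and values shown are hypothetical and the collection step is out of scope:

```python
from typing import Optional


def config_drift(
    node_a: dict[str, str], node_b: dict[str, str]
) -> dict[str, tuple[Optional[str], Optional[str]]]:
    """Return the settings that differ, as {name: (value on A, value on B)}."""
    names = node_a.keys() | node_b.keys()
    return {
        name: (node_a.get(name), node_b.get(name))
        for name in sorted(names)
        if node_a.get(name) != node_b.get(name)
    }


# Hypothetical inventories for two server nodes.
node_a = {"service_pack": "SP2", "job:nightly_backup": "enabled", "login:app_user": "present"}
node_b = {"service_pack": "SP3", "job:nightly_backup": "enabled"}
for name, (on_a, on_b) in config_drift(node_a, node_b).items():
    print(f"{name}: A={on_a} B={on_b}")
```

Running a check like this after every patch or maintenance window catches missing jobs and logins before a failover exposes them.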
HA Risks
 System performance degradation
 HA system complexity leads to availability
issues
 Some system failures not planned for
 Backup and recovery planning incomplete
 Administrators not fully trained or informed
 User databases not synchronized with other
data sources
43
Common Admin Use Cases
 Maintain HA nodes
 Hardware maintenance
 Rolling upgrades and software patches
 Resynchronize the redundant copy
 Re-synch mirror
 Restart log shipping
 Diagnose and repair
 Diagnose cause of failover
 Repair failed node and restore failover
capabilities
 Test failover and failback
44
Common Admin Actions
Train administrators, and have them practice, to:
 Initiate a database mirror
 Manually fail over a mirror database or cluster node
 Add/remove a passive node to/from a mirror or cluster
 Upgrade/patch server nodes
 Restart or redirect application services
45
More Information
46
References: Books
High Availability
 Microsoft SQL Server 2008 High Availability with Clustering & Database Mirroring, by Michael Otey, 2009.
 Microsoft SQL Server High Availability, by Paul Bertucci, 2004.
 Pro SQL Server 2005 High Availability, by Allan Hirt, 2007.
Related Topics
 Pro SQL Server 2005 Replication, by Sujoy Paul, 2006.
 Pro SQL Server 2005 Service Broker, by Klaus Aschenbrenner, 2007.
 The Rational Guide to SQL Server 2005 Service Broker, by Roger Wolter, 2006.
47
References: Presentations
 Microsoft Load Balancing and Clustering
http://ce.sharif.edu/courses/84-85/2/ce317/resources/root/lecture%20slides/14.%20Microsoft%20Load%20Balancing%20and%20Clustering.ppt
 SQL Server 2005 High Availability
http://www.atlantamdf.com/Presentations/AtlantaMDF_111207HA.ppt
 High Availability Technologies In SQL Server 2000 And SQL Server 2005
http://202.181.238.2/hk/teched2004/ppt/Day_2_Rm407/DAT431(1330-1445).ppt
 Meeting the Availability Challenge
http://download.microsoft.com/download/E/D/C/EDCF54DB-19CD-4882-9FC4-4F7D46FCEAA6/HighAvailability.ppt
 Disaster Recovery Mistakes
http://www.sqlsig.org/Oct%2011%20DASSUG%20-%20Jason%20Hall%2010-11-07%20MM.ppt
 SQL Server 2005 High Availability
http://blogs.msdn.com/sql2005event/attachment/564303.ashx
 Effective Usage of SQL Server 2005 Database Mirroring
http://www.sqlserver-qa.net/SSQA-Effective%20Usage%20of%20SQL%20Server%202005%20Database%20Mirroring_show.ppt
48
References: Articles
 Achieve High Availability for SQL Server
http://technet.microsoft.com/en-us/magazine/cc162477.aspx
 Geographically Dispersed Clusters in Windows Server 2003
http://www.microsoft.com/windowsserver2003/techinfo/overview/clustergeo.mspx
 Restoring file and filegroup backups
http://support.microsoft.com/kb/281122/en-us
 Restoring specific tables or rows from backups
http://support.microsoft.com/kb/321836/en-us
 Maintaining Availability During Upgrades
http://msdn.microsoft.com/en-us/library/ms191449.aspx
49

Easter Eggs From Star Wars and in cars 1 and 217djon017
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...ssuserf63bd7
 

Último (20)

RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
Vision, Mission, Goals and Objectives ppt..pptx
Vision, Mission, Goals and Objectives ppt..pptxVision, Mission, Goals and Objectives ppt..pptx
Vision, Mission, Goals and Objectives ppt..pptx
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhThiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
 

SQL Server High Availability and Disaster Recovery

  • 1. MICROSOFT SQL SERVER HIGH AVAILABILITY AND DISASTER RECOVERY Michael Poremba // October 2008
  • 2. Database HA & DR Experience…
      - Work with business to determine HA or DR requirements for applications and data?
      - Design HA or DR solutions?
      - Administer HA or DR process?
      - Still learning MS SQL Server HA & DR capabilities?
  • 3. Scope of this Presentation
    Presentation Focus
      - Data Availability
          - Data recovery
          - High availability
          - Disaster recovery
      - Technology Focus
          - MS SQL Server
          - Physical servers
          - SANs
    Beyond Scope of Presentation
      - In-depth how-to (available elsewhere)
      - Partitioned views (federated)
      - Advanced DBA techniques
      - Custom application logic
      - 3rd-party software solutions
      - Alternate DBMS engines (e.g. Oracle; DB2)
      - HA on virtual machines
      - Complex scenarios & solutions
      - Load balancing
  • 4. Introduction to Data Availability
    So, you need to make your production database bulletproof…
  • 5. Data Availability Continuum
    Degrees of protection for information systems:

    Degree            | Business Risk                   | Solution
    Data Recovery     | Data loss                       | Redundant data
    High Availability | Downtime of database service    | Redundant system components
    Disaster Recovery | Downtime of business operations | Redundant systems and facilities
  • 6. Business Case for Availability
    High Availability
      - Keep business-critical applications available
      - Secondary:
          - Server maintenance
    Disaster Recovery
      - Protect against loss of data center
      - Secondary:
          - Application upgrades
          - Infrastructure upgrades
  • 7. Service Level Agreement (SLA)
      - Permitted downtime (planned vs. unplanned?)
      - Acceptable data/transaction loss
      - Application response times
      - Mean time to recovery
    Note: Database uptime is not equivalent to application availability
      - Failures of other application services
      - Network outages

    Uptime SLA | Downtime per Year | Downtime per Month
    99.9%      | 8.76 hours        | 43.8 minutes
    99.99%     | 52.6 minutes      | 4.38 minutes
    99.999%    | 5.26 minutes      | 0.438 minutes
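The downtime figures in the SLA table follow from simple arithmetic on the uptime percentage. A minimal sketch, assuming a 365-day year and twelve equal months (the function name `downtime_minutes` is illustrative, not from the presentation):

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years are ignored


def downtime_minutes(uptime_percent: float) -> tuple[float, float]:
    """Return the permitted downtime in minutes as (per_year, per_month)."""
    unavailable_fraction = 1.0 - uptime_percent / 100.0
    per_year = unavailable_fraction * HOURS_PER_YEAR * 60
    # Assumption: a "month" is one twelfth of a year.
    return per_year, per_year / 12


if __name__ == "__main__":
    for sla in (99.9, 99.99, 99.999):
        year, month = downtime_minutes(sla)
        print(f"{sla}% uptime: {year / 60:.2f} h/year, {month:.3f} min/month")
```

Each additional "nine" cuts the permitted downtime by a factor of ten, which is why the jump from 99.9% to 99.999% usually calls for a different class of solution rather than faster manual recovery.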
  • 8. Protect What?
      - Application data stores
          - Databases
          - Files
          - Other data repositories
      - Database services
          - DBMS availability for applications
      - Application services
          - Application availability for users and external systems
    Databases are the heart of most information systems; they deserve the highest affordable protection.
  • 9. Database Failure Scenarios
    Physical Infrastructure Failures
      - Storage subsystem
      - Disk
      - Controller
      - Network
      - Server
      - Power
    Logical Data Failures
      - Operator errors
      - DBMS interruption
      - Drops / deletes
      - Application defects
      - DBMS defects
      - Data corruption
  • 10. Service Recovery Strategies

    Standby Mode | Failover Behavior                                         | SQL Server Feature
    Cold standby | Manual intervention required to restore offline data copy | Backup and restore
    Warm standby | Data copy online and ready; manual failover required      | Transaction log shipping; database mirroring
    Hot standby  | Automatic failover                                        | Database mirroring; failover clustering
  • 11. Data Recovery: Terminology
    Terminology varies for source vs. copy

    High Availability Strategy | Data Source     | Data Copy
    Backup and Restore         | Database        | Backup
    Log Shipping               | Primary         | Secondary; Standby
    Database Mirroring         | Principal       | Mirror
    Failover Clustering        | Primary; Active | Secondary; Passive; Standby; Inactive
  • 13. Database Backups
      - Traditional backup types
          - Full backup
          - Differential backup
          - Transaction log backup
      - Disk is better than tape
          - First back up to disk (separate physical disk volume)
          - Detect exceptions encountered during backup
          - Verify backup files
          - Copy backup files to tape or remote disk
      - Data retention policy for backup files
  • 14. Database Backup Strategy
    Backing up user databases alone is not sufficient for recovery
      - System databases
          - Master database
          - MSDB database
          - Model database
      - External data stores…
  • 15. Synch with External Data Stores
    Synchronize recovered database with external data stores:
      - Identity column seeds
      - Full-text indexes (SQL Server 2000)
      - LDAP entries
      - File system objects
      - Other databases
  • 16. Backup Retention Policy
      - Location of backup files
      - Duration of retention
      - Protection of sensitive data
          - Sarbanes-Oxley (SOX)
          - HIPAA
          - Internal policies for data management and protection
      - Access to backups from offsite data storage
  • 17. Data Recovery Process
      - Backup file sets
          - Full baseline, differential, and transaction logs
      - Retrieving backup files
          - Offsite storage
          - Tape
          - Network copy
          - Dependency on multiple people to get access to backup files
      - Recovery strategy depends on failure scenario
          - Create comprehensive failure matrix
          - Devise recovery strategy for each scenario
      - Does worst-case recovery scenario fit within SLA parameters?
          - Recovery time; SLA
          - Include future data growth in recovery plan
      - Fully test recovery strategies: practice is essential
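The "full baseline, differential, and transaction logs" file set implies a restore order: the most recent full backup, then the most recent differential taken after it, then every later log backup in sequence. A minimal sketch of that selection, assuming each backup is described only by a kind and a completion time (the `Backup` class and `restore_chain` function are hypothetical; a real server tracks log sequence numbers and will reject a broken log chain):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    kind: str      # "full", "diff", or "log"
    finished: int  # completion time; any increasing number works


def restore_chain(backups: list[Backup]) -> list[Backup]:
    """Pick the backups to restore, in order, to reach the latest point."""
    ordered = sorted(backups, key=lambda b: b.finished)
    fulls = [b for b in ordered if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup available; nothing to restore from")
    base = fulls[-1]
    # Only a differential taken after the chosen full backup is usable.
    diffs = [b for b in ordered if b.kind == "diff" and b.finished > base.finished]
    start = diffs[-1] if diffs else base
    # Every log backup completed after the starting point must be applied in order.
    logs = [b for b in ordered if b.kind == "log" and b.finished > start.finished]
    return [base] + ([start] if diffs else []) + logs
```

Counting the files in the returned chain, and timing how long each takes to retrieve and restore, gives a first estimate of whether the worst-case recovery fits within the SLA.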
  • 19. High Availability
      - Minimize or avoid service downtime
          - Whether planned or unplanned
      - When components fail, service interruption is brief or non-existent
          - Automatic failover
      - Eliminate single points of failure (as affordable)
          - Redundant components
          - Fault-tolerant servers
  • 20. Redundant Components
    Objective: Avoid single points of failure (where affordable)
    Approach: Use redundant components for database service
      - Database server nodes
          - Server components
          - ECC RAM; failure-tolerant HW & OS
          - DBMS instance
          - User databases
      - Storage devices
          - Storage unit components
          - MPIO: Interfaces; paths; switches; controllers
          - RAID: Disks
      - Networking
          - MPIO: Interfaces; paths; switches
      - Data copies
          - E.g. recovering torn page from mirror in SQL Server 2008
  • 21. Transaction Log Shipping
      - Warm standby solution
      - Duplicate user database
      - Copy transaction logs to standby server & restore
      - Database available for read-only access
          - Users must disconnect for logs to be applied
          - Two database licenses required if querying standby
      - Manual application failover
      - Supported on standard hardware
      - Possible data loss (unapplied transactions)
  • 22. Database Mirroring
      - Redundancy at user database level
          - Duplicate copy of user database
          - Independent storage devices
          - Multiple copies of instance databases
      - Mirrored over private network channel
          - Mirror always redoing transactions from principal
          - Negligible impact on transaction throughput
      - Multiple mirroring modes:
          - High-availability: commit when logged on mirror; automatic failover
          - High-protection: commit when logged on mirror; manual failover
          - High-performance: commit when logged on principal
      - Very fast automatic failover (seconds)
          - Requires witness server
      - Mirror-aware application client connection
          - Provided by client library
          - Database connection string must specify both servers
      - Mirror may be available for read-only access (snapshots)
      - Works with standard hardware
    [Diagram: node A and node B, each with local storage holding its own local system DBs; one node holds the source user DB and the other the mirror user DB; witness (optional).]
  • 23. Mirror Witness
      - With mirroring, more than one server is required to decide on failover
      - Witness automates failover from principal to mirror
          - Watches database availability
          - Reports observations back to principal and mirror
      - Runs in separate SQL Server instance (Express is OK)
      - Prevents “split brain” scenario
      - Very low resource consumption
      - Can be witness for multiple databases
      - Not a single point of failure
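The witness prevents "split brain" because a server may only own the database while it can see at least one of the other two participants. A simplified model of that rule follows; the function names and boolean inputs are illustrative assumptions, and the real decision logic inside SQL Server handles more states than this:

```python
def principal_keeps_serving(sees_mirror: bool, sees_witness: bool) -> bool:
    """A principal that is cut off from both partners must stop serving,
    because it cannot rule out that the mirror has taken over."""
    return sees_mirror or sees_witness


def mirror_should_take_over(
    sees_principal: bool,
    sees_witness: bool,
    witness_sees_principal: bool,
    synchronized: bool,
) -> bool:
    """Automatic failover needs the witness to agree that the principal is gone,
    and a mirror that was fully synchronized before the failure."""
    return (
        synchronized
        and not sees_principal
        and sees_witness
        and not witness_sees_principal
    )
```

For example, if only the link between principal and mirror fails while both still reach the witness, `mirror_should_take_over` returns False and the principal keeps serving, so two servers never act as principal at once. This also shows why the witness is not a single point of failure: losing only the witness leaves the principal and mirror connected and the database online.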
  • 24. SQL Server Failover Clustering
      - Two clustered nodes
          - Active/Passive config
      - MS SQL services
          - Running on virtual server
      - Shared storage device
          - User databases
          - System databases
          - Quorum drive
      - Redundant internal components
    [Diagram: node A and node B attached to shared storage holding system DBs, user DBs, and quorum.]
  • 25. Active/Passive Failover Clustering
      - Redundancy at database instance level
          - All databases fail over together
          - Shared copy of system databases
      - Single data copy on shared storage device
          - No I/O overhead reducing throughput
          - Storage unit is single point of failure for cluster
      - All database services are clustered
          - SQL Agent; Analysis Services; Full-Text engine; MS DTC
      - Automatic failover (up to minutes)
      - DBMS accessed over virtual IP
      - Database not available from inactive node for DB client connections
          - Storage is controlled by one cluster node at a time
      - Requires hardware certified by Microsoft for Microsoft Cluster Service
    [Diagram: node A and node B attached to shared storage holding system DBs, user DBs, and quorum.]
  • 26. HA Comparison

    Aspect           | Database Mirroring                                    | Failover Clustering
    Scope            | User DB                                               | DBMS instance
    Hardware         | Standard hardware                                     | Certified hardware
    Licensing        | One SQL license (unless querying snapshots on mirror) | One SQL license (only one node can access database)
    Failover         | Very fast failover (seconds)                          | Automatic failover (up to minutes)
    Operating system | OS flexible (e.g. 32/64)                              | Enterprise OS
    Storage          | Independent storage                                   | Shared storage
    Services         | Independent services                                  | Clustered services
    Standby access   | Reporting on mirror                                   | Standby not available
    Location         | Geographic separation OK                              | Servers are usually co-located
  • 27. Considerations for HA
      - HA complements backup and recovery strategy
          - Does not replace data recovery plan
      - Application service availability is often determined by a network of interdependent services
          - Availability can be difficult to define (e.g. partial failures)
          - Failure probability difficult to measure or compute
      - Increased system complexity could lead to lower service availability!
          - Operator error a leading cause of availability issues
          - Increased number/types of system components
          - More complex to configure and administer
  • 28. Data Recovery Requirements

    Requirements | Backup and Recovery | Log Shipping | DB Mirroring: High-Performance | DB Mirroring: High-Protection | DB Mirroring: High-Availability | Failover Clustering
    Cost | Low | Low/Med | Medium | Medium | Medium | High
    Relative complexity | Low | Low | Medium | Medium | High | High
    Data loss | Possible | Latest log | Possible | None | None | None
    Scope of duplication | Database | Database | Database | Database | Database | DBMS
    Failover | Downtime | Downtime | Manual | Manual | Seconds | Up to minutes
    Client redirect | Manual | Manual | Automatic | Automatic | Automatic | Automatic
    Rolling upgrades & maint. | No | No | OS & DB | OS & DB | OS & DB | OS
    Access data on | Restore | Read- | Snapshot | Snapshot | Snapshot | No
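One way to use the comparison on slide 28 is to encode it and filter by the requirements agreed in the SLA. The sketch below copies three of the rows (cost, data loss, failover) from the slide; the dictionary layout, the `candidates` function, and the reading of "Seconds" and "Up to minutes" as automatic failover are assumptions made for illustration:

```python
# Values transcribed from the slide's comparison table.
OPTIONS = {
    "Backup and recovery": {"cost": "Low", "data_loss": "Possible", "failover": "Downtime"},
    "Log shipping": {"cost": "Low/Med", "data_loss": "Latest log", "failover": "Downtime"},
    "DB mirroring, high-performance": {"cost": "Medium", "data_loss": "Possible", "failover": "Manual"},
    "DB mirroring, high-protection": {"cost": "Medium", "data_loss": "None", "failover": "Manual"},
    "DB mirroring, high-availability": {"cost": "Medium", "data_loss": "None", "failover": "Seconds"},
    "Failover clustering": {"cost": "High", "data_loss": "None", "failover": "Up to minutes"},
}

# Assumption: a failover time given as a duration means failover is automatic.
AUTOMATIC_FAILOVER = {"Seconds", "Up to minutes"}


def candidates(no_data_loss: bool = False, automatic_failover: bool = False) -> list[str]:
    """Return the options from the table that satisfy the given requirements."""
    result = []
    for name, traits in OPTIONS.items():
        if no_data_loss and traits["data_loss"] != "None":
            continue
        if automatic_failover and traits["failover"] not in AUTOMATIC_FAILOVER:
            continue
        result.append(name)
    return result
```

Calling `candidates(no_data_loss=True, automatic_failover=True)` narrows the table to the high-availability mirroring mode and failover clustering, after which cost and complexity decide between them.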
  • 30. Disaster Recovery
      - Minimize downtime of business operations
          - Redundant systems and facilities
      - SQL Server features:
          - Transaction log shipping
          - Database mirroring
          - Failover clustering
      - Other technologies
          - Storage-based mirroring
  • 31. Disaster Recovery Planning
      - Data security requirements
      - Clarify SLA, data loss allowance
      - Evaluate system cost vs. data protection
      - Failure analysis
      - System redundancy
      - Process validation
      - Training for personnel
      - Prevention practices
      - Executing disaster recovery and business continuity
      - Practice, practice, practice
  • 32. Business Continuity Facility
      - System redundancy
          - Systems: Web servers; app servers; database, etc.
          - Data: Databases; data files on OS; security info, etc.
          - Networking: Domain, routing, subnet, VIPs, etc.
      - Alternate facilities
          - Network bandwidth
          - Physical or network access by operations staff
      - Failover
          - Often a deliberate decision, using manual failover
  • 33. Data Redundancy
      - Synchronous redundancy
          - Network bandwidth cost
          - Network latency and application performance
          - Network reliability
      - Asynchronous redundancy
          - Risk of data loss
          - More cost-effective
          - Resilient to network latency issues
      - Candidate technologies
          - SQL Server database mirroring
          - Failover clustering with SAN-based mirroring
  • 34. DR Using Database Mirroring
      - Two sites: Primary and DR location
      - Separate failover clusters at each site
      - SQL Server database mirroring between sites
    [Diagram: failover cluster at site A (nodes A1 and A2; shared storage A with local sys DBs, local quorum, source user DB) and failover cluster at site B (nodes B1 and B2; shared storage B with local sys DBs, local quorum, mirror user DB), linked by database mirroring; witness (optional).]
  • 35. DR Using SAN-Based Mirroring
      - Two sites: Primary and DR location
      - Four-node failover cluster; one virtual IP address
      - SAN-based mirroring between sites
      - Manual cluster failover
    [Diagram: failover cluster nodes A1 and A2 at site A with shared storage A, and nodes B1 and B2 at site B with shared storage B; each storage unit holds system DBs, quorum, and user DBs, linked by storage-based mirroring.]
  • 36. Complementary Technologies [Skip if time is running short.]
  • 37. SAN-Based Data Mirroring
      - Data blocks duplicated at storage level
      - Similar to transaction log shipping
          - Copy performed in sequence and coordinated with database checkpoint
          - Ensures consistency of mirrored data files
      - Synchronous or asynchronous mirroring
      - Co-located or geographically dispersed (both are OK)
      - SAN link bandwidth must support database I/O rate
      - May require extra feature support from SAN vendor
      - Could rely on Failover Clustering for HA
  • 38. SQL Server Database Snapshots
      - Read-only point-in-time database snapshot
      - No data is copied, so creation is instantaneous
      - Historical snapshot pages tracked separately from changing pages
      - Snapshots can be maintained indefinitely
          - Limited only by available storage
      - Snapshot copy can be used for reporting
          - Read-only, so no locking issues
  • 39. SQL Server Replication
      - Transactional replication
          - High transaction volume
          - Low data latency required
          - Mixed technologies: Integrates with other DBMS
      - Merge replication
          - Bi-directional data changes
          - Typically server-to-client
      - Snapshot replication
          - Large, infrequent data changes
          - Data change latency OK
      - Subscriber databases available for reporting
      - Replicate data subsets
      - Some data loss is possible
      - Periodically validate replicated data
  • 41. Considerations for App Developers
      - App services tolerant to database service interruptions
          - Application transactions must be handled in code (data consistency)
          - Exception handling for transaction retry, connection recovery
          - Requires coding standards, code reviews, and testing
      - Bulk data operations
          - Transaction volume impacts rollback time during failover
          - Batch jobs must be run on alternate nodes
          - Don’t bypass transaction logging
          - Synchronization with external data sources?
      - Be aware of database recovery model
      - Mirroring uses FailoverPartner in connection string
      - Use TCP/IP as client protocol
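The "exception handling for transaction retry, connection recovery" point can be sketched as a small retry wrapper. This is a generic illustration, not code from the presentation: `run_with_retry`, `build_connection_string`, the server names, and the exception types treated as retriable are all assumptions, and the exact spelling of the failover-partner keyword depends on the client library in use.

```python
import time


def build_connection_string(principal: str, mirror: str, database: str) -> str:
    # Assumption: the "Failover Partner" spelling is illustrative; check the
    # client library's documentation for the keyword it actually accepts.
    return f"Server={principal};Failover Partner={mirror};Database={database};"


def run_with_retry(operation, attempts=5, base_delay=0.5, sleep=time.sleep,
                   retriable=(ConnectionError, TimeoutError)):
    """Run operation(), retrying with exponential backoff on connection errors.

    operation must open its own connection and run the whole transaction,
    so that a retry after failover starts from a clean state.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retriable:
            if attempt == attempts:
                raise  # out of attempts: surface the failure to the caller
            sleep(base_delay * 2 ** (attempt - 1))
```

Retrying the whole transaction, rather than the single failed statement, matters because work that was uncommitted at the moment of failover is rolled back on the server that takes over.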
  • 42. Considerations for Admins
      - Use identical server hardware, when possible
      - Design network redundancies, when feasible
      - Consider network latency for geographic separation
      - Always manage through virtual cluster, not individual cluster nodes
      - Retest failover/failback after HA maintenance
      - Diagnose after failover
          - Repair alternate node
          - Resynchronize data, as necessary
          - Be aware of primary/secondary locations
          - Ensure application services are connected and functioning properly
      - Keep server node configurations synchronized:
          - Service pack and patch levels
          - Duplicate non-redundant resources
          - Jobs; logins and permissions; OS & sys objects
  • 43. HA Risks
      - System performance degradation
      - HA system complexity leads to availability issues
      - Some system failures not planned for
      - Backup and recovery planning incomplete
      - Administrators not fully trained or informed
      - User databases not synchronized with other data sources
  • 44. Common Admin Use Cases
      - Maintain HA nodes
          - Hardware maintenance
          - Rolling upgrades and software patches
      - Resynchronize the redundant copy
          - Re-synch mirror
          - Restart log shipping
      - Diagnose and repair
          - Diagnose cause of failover
          - Repair failed node and restore failover capabilities
      - Test failover and failback
  • 45. Common Admin Actions
    Train administrators, and have them practice, to:
      - Initiate a database mirror
      - Manually fail over a mirror database or cluster node
      - Add/remove passive node from mirror or cluster
      - Upgrade/patch server nodes
      - Restart or redirect application services
  • 47. References: Books
    High Availability
      - Microsoft SQL Server 2008 High Availability with Clustering & Database Mirroring by Michael Otey, 2009.
      - Microsoft SQL Server High Availability by Paul Bertucci, 2004.
      - Pro SQL Server 2005 High Availability by Allan Hirt, 2007.
    Related Topics
      - Pro SQL Server 2005 Replication by Sujoy Paul, 2006.
      - Pro SQL Server 2005 Service Broker by Klaus Aschenbrenner, 2007.
      - The Rational Guide to SQL Server 2005 Service Broker by Roger Wolter, 2006.
  • 48. References: Presentations
      - Microsoft Load Balancing and Clustering
        http://ce.sharif.edu/courses/84-85/2/ce317/resources/root/lecture%20slides/14.%20Microsoft%20Load%20Balancing%20and%20Clustering.ppt
      - SQL Server 2005 High Availability
        http://www.atlantamdf.com/Presentations/AtlantaMDF_111207HA.ppt
      - High Availability Technologies In SQL Server 2000 And SQL Server 2005
        http://202.181.238.2/hk/teched2004/ppt/Day_2_Rm407/DAT431(1330-1445).ppt
      - Meeting the Availability Challenge
        http://download.microsoft.com/download/E/D/C/EDCF54DB-19CD-4882-9FC4-4F7D46FCEAA6/HighAvailability.ppt
      - Disaster Recovery Mistakes
        http://www.sqlsig.org/Oct%2011%20DASSUG%20-%20Jason%20Hall%2010-11-07%20MM.ppt
      - SQL Server 2005 High Availability
        http://blogs.msdn.com/sql2005event/attachment/564303.ashx
      - Effective Usage of SQL Server 2005 Database Mirroring
        http://www.sqlserver-qa.net/SSQA-Effective%20Usage%20of%20SQL%20Server%202005%20Database%20Mirroring_show.ppt
  • 49. References: Articles
      - Achieve High Availability for SQL Server
        http://technet.microsoft.com/en-us/magazine/cc162477.aspx
      - Geographically Dispersed Clusters in Windows Server 2003
        http://www.microsoft.com/windowsserver2003/techinfo/overview/clustergeo.mspx
      - Restoring file and filegroup backups
        http://support.microsoft.com/kb/281122/en-us
      - Restoring specific tables or rows from backups
        http://support.microsoft.com/kb/321836/en-us
      - Maintaining Availability During Upgrades
        http://msdn.microsoft.com/en-us/library/ms191449.aspx