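2. A Little About Me
Who am I - DBA
• Married + 3
• More than 10 years in the field
• Certified PRO in SQL Server and Oracle technologies
• Formerly CTO and practice lead at John Bryce Training
• CEO of DBCS, a company working in this field
• Consultant to Bezeq International, the Central Bureau of Statistics (CBS), Pontis, Traffilog, storenext, galcomm, and others.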
3. • Introduction to High Availability in SQL Server:
Hardware and software solutions
• Features and techniques comparison
– Log Shipping
– Database Mirroring
– Replication
– Database Snapshots
– Backup improvements
– Online operations
• HADR deep dive: How to implement the next
generation of high availability and disaster
recovery solution with SQL Server
4. Introduction to High Availability
and Disaster Recovery
• Definitions
– Introduce key terms and concepts
• Business Continuity Planning
– Overview of the BCP process
• SQL Server High Availability Planning
– How does BCP apply to SQL Server availability?
5. High Availability and Disaster Recovery: Definition
High Availability
• High availability is a system design protocol and associated implementation that ensures a certain absolute degree of operational continuity during a given measurement period
• Availability defined in terms of service level agreements (SLA)
– Recovery Time
– Data loss during unplanned downtime
• A highly available application should be accessible by users x% of the time
Disaster Recovery
• Processes and procedures designed to restore business operations due to a natural or human-induced disaster
– Typically involves providing redundancy spanning multiple sites or across geographic regions
6. Defining x and SLA
Availability Class   Acceptable Downtime (hrs/yr) OR RTO   Acceptable Data Loss (time of last copy) OR RPO
Tier 1               >99.99% (1 hr or less)                5 min or less
Tier 2               99.9% - 99.99% (1 - 8.5 hrs)          5 mins to 8.5 hrs
Tier 3               <99.9% (Hours to days)                Hours to days
• Recovery Time Objective (RTO) guided by availability requirements
– How much downtime can you tolerate?
• Recovery Point Objective (RPO) guided by criticality of application data
– How much data can you lose?
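As a rough check of the tier boundaries, an availability percentage can be converted into yearly downtime. This is a minimal sketch that assumes a year of 8,760 hours; the column aliases are illustrative only.

```sql
-- Convert an availability percentage into the maximum downtime per year.
-- Assumes a non-leap year of 8,760 hours.
SELECT v.availability_pct,
       CAST((100.0 - v.availability_pct) / 100.0 * 8760 AS decimal(10, 3))
           AS max_downtime_hours_per_year
FROM (VALUES (99.99), (99.9), (99.0)) AS v(availability_pct);
```

99.99% works out to a little under one hour per year and 99.9% to a little under nine hours, in line with the rounded figures in the table.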
7. Protection Levels
Local HA
• Protection against resource failures
– Machine
– Database Corruption
– Disk
• Location Redundancy
– Building
– < 10 miles
Regional DR
• Protection against
– Network Outages
– Site Failures
• Location Redundancy
– City, County
– < 100-200 miles
Geographic DR
• Protection against
– Natural Disasters
• Location Redundancy
– State, Country
– > 100-200 miles
8. Business Continuity Planning
• Impact Analysis
– Critical Functions
– Threat Identification
– Recovery Objectives
• Solution Design
– Achieve recovery objectives for relevant threats within specified constraints like budget, human resources etc
– Cost/benefit analysis of solutions
• Implementation
– Deploy the recommended solution
• Testing
– Test to see if the solution meets the recovery requirements
• Maintenance
– Yearly testing and review of procedures
[Diagram: cycle of Analysis, Solution Design, Implementation, Testing, Maintenance]
9. SQL Server High Availability Planning
• Analysis
– Application tiers serviced by the databases
– Causes of database downtime
– Protection levels: Local HA, Regional DR, Geographic DR
• Solution Design
– What solutions exist?
– What are the characteristics and cost of the solution?
• Implementation
– What are the deployment steps and best practices?
• Testing
– How do I test my implementation?
• Maintenance
– How do I monitor and maintain the solution?
[Diagram: cycle of Analysis, Solution Design, Implementation, Testing, Maintenance]
11. Solution Design
• Understand the solutions and choices before making a decision
– Solution Architecture
– HA Capabilities
– Limitations and Caveats
– Cost Vector
12. SQL Server Always On Technologies
13. Always On Technologies
• Provides a full range of options to minimize downtime and maintain appropriate levels of application availability
Increases Availability
• Backup and Restore
• Log Shipping
• Database Mirroring
• Failover Clustering
• Peer-Peer Replication
Decreased Downtime
• Online Index Operations
• Table Partitioning
• Enhanced Locking
• Resource Governor
• Database Snapshot
• Dedicated Admin Connection
• Dynamic Configuration
14. Always On Technology Overview
• Architecture Overview
– How does it work?
• Solution Characteristics
– Data Loss Guarantees
– Failover Characteristics
– Redundancy Levels and Utilization
– Cost
– Limitations and Caveats
Technologies covered (Increases Availability): Backup and Restore, Log Shipping, Database Mirroring, Failover Clustering, Peer-Peer Replication
15. What's New in SQL Server 2008
New Features
• Resource Governor
– Manage SQL Server workloads and resources by specifying limits on resource consumption
• Backup Compression
– Reduce backup and restore time
Feature Enhancements
• Database Mirroring
– Automatic recovery from page corruption
– Log stream compression
– Faster recovery on failover
• Log Shipping
– Sub-Minute Log Shipping
– Backup compression
• Failover Clustering
– 16 nodes
– Rolling upgrade
• Peer-Peer Replication
– Hot add new nodes
17. Backup and Restore
• Base availability technology for any solution
– Protects against failures and recovery from errors
– Provides Local HA and Site DR
• Need to ensure the backups are accessible if site goes down
– High RTO due to restore time
– RPO=0 can never be guaranteed
• Types: Full, Differential, and Transaction Log
– File-group backup/restore for large databases
• Backup Compression provides faster and
smaller backups in SQL Server 2008
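The three backup types listed above can be sketched as follows. This is a minimal example, not a complete backup strategy; the database name and the `X:\Backups` paths are placeholders, and the log backup assumes the database uses the FULL or BULK_LOGGED recovery model.

```sql
-- Full backup: the base of every restore sequence (compression is new in SQL Server 2008).
BACKUP DATABASE AdventureWorks
TO DISK = N'X:\Backups\AdventureWorks_full.bak'
WITH COMPRESSION, INIT;

-- Differential backup: only the extents changed since the last full backup.
BACKUP DATABASE AdventureWorks
TO DISK = N'X:\Backups\AdventureWorks_diff.bak'
WITH DIFFERENTIAL, COMPRESSION, INIT;

-- Transaction log backup: the log records written since the previous log backup.
BACKUP LOG AdventureWorks
TO DISK = N'X:\Backups\AdventureWorks_log.trn'
WITH COMPRESSION, INIT;
```

The interval between log backups bounds the data that can be lost, which is why RPO=0 cannot be guaranteed by backups alone.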
18. Enhanced Error Detection
• In SQL Server 2000, RESTORE VERIFYONLY does not guarantee that the backup is good
– Data may be corrupt
• In SQL Server 2005, RESTORE VERIFYONLY performs additional checks on the data
– Comes as close to an actual restore as practical, so corrupt data is far more likely to be detected
19. Database Checksums
• SQL Server 2000 had TornPageDetection to
detect incomplete I/O Operations by power
failures
• SQL Server 2005 adds checksums to data
pages
– Header of every page contains a checksum value
– When reading page, it re-computes checksum and
compares with checksum stored
– Returns error (824) if difference found
– Detects errors not reported by I/O Subsystem
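Page checksums are a per-database setting. The sketch below turns the option on, confirms it, and lists pages that SQL Server has already flagged; the database name is a placeholder.

```sql
-- Enable page checksums (databases upgraded from SQL Server 2000 may still use
-- TORN_PAGE_DETECTION). Existing pages get a checksum the next time they are written.
ALTER DATABASE AdventureWorks SET PAGE_VERIFY CHECKSUM;

-- Confirm the current page verification option.
SELECT name, page_verify_option_desc
FROM sys.databases
WHERE name = N'AdventureWorks';

-- Pages that raised errors such as 824 are recorded in msdb.
SELECT database_id, file_id, page_id, event_type, error_count, last_update_date
FROM msdb.dbo.suspect_pages;
```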
20. Backup Checksums
• Detect errors introduced by backup hardware but
not reported by hardware or operating system
– Backup media error detection
– Backup devices do not always detect errors
– Works with
• RESTORE
• RESTORE VERIFYONLY
• Restore also checks page checksums, if present
– Disk error detection on data pages prior to backup
• Can continue past errors if desired
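A short sketch of the backup checksum options described above; the database name and file paths are placeholders.

```sql
-- Verify page checksums while reading and write a checksum over the backup stream.
BACKUP DATABASE AdventureWorks
TO DISK = N'X:\Backups\AdventureWorks_full.bak'
WITH CHECKSUM, INIT;

-- Re-check the backup media without restoring it.
RESTORE VERIFYONLY
FROM DISK = N'X:\Backups\AdventureWorks_full.bak'
WITH CHECKSUM;

-- Continue past a damaged page instead of failing, to salvage what can still be read.
BACKUP DATABASE AdventureWorks
TO DISK = N'X:\Backups\AdventureWorks_salvage.bak'
WITH CHECKSUM, CONTINUE_AFTER_ERROR, INIT;
```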
21. Backup Compression
• Common questions:
– "How much compression will I see?"
– "Will it be comparable to, say, SQL Litespeed?"
• One simple answer: "It depends!"
• All data compresses differently; the compression ratio achieved depends on:
– The type of data in the database
– Whether the data in the database is already compressed
– Whether the data/database is encrypted
• "We saw an 85 percent reduction in file size using SQL Server 2008 Backup Compression," says Colin Neller, Senior Software Engineer at ServiceU and part of the company's SQL Server 2008 implementation team. "A backup file that was previously over 300 GB is now only 40 GB, and the job runs in about half the time."
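Since the answer is "it depends", the ratio actually achieved can be read from the backup history in msdb. A minimal sketch, assuming SQL Server 2008 or later (where `compressed_backup_size` exists):

```sql
-- Compression ratio of the most recent full backups recorded on this instance.
SELECT TOP (10)
       database_name,
       backup_finish_date,
       backup_size / 1048576.0            AS backup_mb,
       compressed_backup_size / 1048576.0 AS compressed_mb,
       backup_size / NULLIF(compressed_backup_size, 0) AS compression_ratio
FROM msdb.dbo.backupset
WHERE type = 'D'            -- 'D' = full database backup
ORDER BY backup_finish_date DESC;
```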
22. Backup Compression: Backup Performance
• Backup of a 322 MB Adventureworks database
– Uncompressed: Hardly any CPU used (avg 5%), runtime = 39.5s, compression ratio of 0.
– Compressed: A LOT more CPU used (avg 25%) BUT runtime = 21.6s (45% improvement) and backup stored in 76.7MB (4.2x compression ratio)
25. Database Snapshots
• Read-only, consistent view of a database
– Specified point-in-time
• Modifying data
– Copy-on-write of affected pages
• Reading data
– Accesses snapshot if data has changed
– Redirected to original database otherwise
[Diagram: pages copied from the source database into the 12:00 Snapshot]
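Creating a snapshot looks roughly like this. The logical file name `AdventureWorks_Data` and the snapshot path are assumptions; the `NAME` value must match the logical name of the source data file.

```sql
-- Create a read-only, point-in-time snapshot of AdventureWorks.
-- The sparse file grows only as pages in the source database are modified.
CREATE DATABASE AdventureWorks_dbsnapshot_1800
ON ( NAME = AdventureWorks_Data,   -- logical name of the source data file (assumed)
     FILENAME = N'X:\Snapshots\AdventureWorks_dbsnapshot_1800.ss' )
AS SNAPSHOT OF AdventureWorks;
```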
26. Using Database Snapshot to Recover Data
Scenario: Undeleting rows
    -- Copies rows back from the snapshot; add a WHERE clause if only some rows were deleted
    INSERT INTO Production.WorkOrderRouting
    SELECT * FROM AdventureWorks_dbsnapshot_1800.Production.WorkOrderRouting
Scenario: Undoing an update
    UPDATE HumanResources.Department
    SET Name = ( SELECT Name FROM AdventureWorks_dbsnapshot_1800.HumanResources.Department
                 WHERE DepartmentID = 1)
    WHERE DepartmentID = 1
Scenario: Recovering a dropped object
    1 Script the object in the database snapshot
    2 Execute the script in the source database
    3 Repopulate the object (if appropriate)
Caution: Not a substitute for a comprehensive backup and restore strategy
29. Log Shipping
• Automated transaction log backup and
restore provides redundancy at the
database level
• SQLLogship.exe provides the underlying
framework for doing automated backup,
copy and restore
– Backup on primary instance
– Restore on secondary instance(s)
• Scheduling is done through
SQL Server Agent jobs
– SQL Server 2008 provides sub-minute
scheduling interval providing the ability
to do quick backup and restores
• No automatic failover capabilities
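The backup, copy and restore cycle that the log shipping jobs automate can be sketched by hand. This is a simplified illustration, not what the jobs literally run; the share and file names are placeholders, and the secondary is assumed to have been initialized from a full backup restored WITH NORECOVERY or STANDBY.

```sql
-- On the primary instance: back up the transaction log to a share.
BACKUP LOG AdventureWorks
TO DISK = N'\\FileShare\LogShip\AdventureWorks_0001.trn';

-- On the secondary instance: restore the copied log backup.
-- STANDBY keeps the database readable between restores; NORECOVERY would not.
RESTORE LOG AdventureWorks
FROM DISK = N'\\FileShare\LogShip\AdventureWorks_0001.trn'
WITH STANDBY = N'X:\LogShip\AdventureWorks_undo.dat';

-- Manual failover: bring the secondary online once the last log backup is restored.
RESTORE DATABASE AdventureWorks WITH RECOVERY;
```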
30. Log Shipping (Key terms)
• Primary Server:
– Contains your primary database.
– SQL Server Agent makes periodic transaction log
backups to capture changes.
• Secondary Server
– Contains an unrecovered copy of the production database.
– One standby server can contain standby databases
from multiple primary servers.
31. Log Shipping (Key terms) cont…
• Monitor Server (Optional)
– Monitors the status of the log-shipping jobs on the
primary and each standby server.
– One monitoring server can monitor multiple primary-
standby server pairs.
– Should use a server other than the primary or the
standby to detect problems on either server.
33. Strength & weakness
• Strengths
– Can Ship Logs Across WAN (Wide-Area Network)
– Protects an Entire Database
• Weaknesses
– Configured Per Database
– NO AUTOMATIC FAILOVER
36. Database Mirroring
• A database level high availability solution
that provides complete protection
against data loss and fast recovery
through automatic failover
• Maintains a redundant database by
shipping log blocks when the
transactions are committed on the
principal
• Synchronous and Asynchronous
modes provide the spectrum of
options to choose between
availability and performance
• Automatic failover when using
witness server
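A mirroring session can be set up roughly as follows. This is a sketch under assumptions: the endpoint name, port and the `example.com` host names are placeholders, and the mirror database is assumed to have been restored WITH NORECOVERY beforehand.

```sql
-- On each partner instance: create a mirroring endpoint.
-- (The witness needs the same, with ROLE = WITNESS.)
CREATE ENDPOINT Mirroring
    STATE = STARTED
    AS TCP (LISTENER_PORT = 5022)
    FOR DATABASE_MIRRORING (ROLE = PARTNER);

-- On the mirror instance first: point at the principal.
ALTER DATABASE AdventureWorks SET PARTNER = N'TCP://principal.example.com:5022';

-- Then on the principal instance: point at the mirror and, for automatic failover, the witness.
ALTER DATABASE AdventureWorks SET PARTNER = N'TCP://mirror.example.com:5022';
ALTER DATABASE AdventureWorks SET WITNESS = N'TCP://witness.example.com:5022';
```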
37. Database Mirroring Modes
• High-Availability Mode
– Safety Full; Synchronous operation
– Database is available whenever a quorum exists
– Automatic failover
• High-Protection Mode
– Safety Full; Synchronous operation
– No witness – quorum provided by partners
– If Principal loses quorum, it stops servicing the database
• Ensures high protection; database is never in 'exposed' state
– Manual failover only; no automatic failover
– A transition mode; should not be in this mode for long
• High-Performance Mode
– Safety Off; Asynchronous operation
– Manual failover only
• Supports only one form of role switching: forced service (with
possible data loss)
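The modes map onto the SAFETY and WITNESS settings of the session. A brief sketch, run on the principal unless noted; the database name is a placeholder.

```sql
-- High-availability or high-protection mode: synchronous operation.
ALTER DATABASE AdventureWorks SET PARTNER SAFETY FULL;

-- Removing the witness turns high-availability into high-protection (manual failover only).
ALTER DATABASE AdventureWorks SET WITNESS OFF;

-- High-performance mode: asynchronous operation.
ALTER DATABASE AdventureWorks SET PARTNER SAFETY OFF;

-- Manual failover (synchronous modes, session synchronized).
ALTER DATABASE AdventureWorks SET PARTNER FAILOVER;

-- Forced service, run on the mirror when the principal is lost: possible data loss.
ALTER DATABASE AdventureWorks SET PARTNER FORCE_SERVICE_ALLOW_DATA_LOSS;
```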
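38. Database Mirroring: How it works
[Diagram: the Application commits on the Principal; steps 1 to 5 trace the commit through the Principal's log, across to the Mirror's log, and back as an acknowledgement. Each SQL Server has its own Log and Data files; a Witness observes the session.]
• Mirror is always redoing; it remains current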
39. DBM – Automatic Page Recovery
[Diagram: Client, Witness, Principal and Mirror, each partner with Data and Log]
1. Bad page detected on the Principal
2. Request page
3. Find page on the Mirror
4. Retrieve page
5. Transfer page
6. Write page
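Repair attempts made this way are exposed through a dynamic management view in SQL Server 2008. A minimal query:

```sql
-- Pages that database mirroring tried to repair automatically on this instance.
SELECT DB_NAME(database_id) AS database_name,
       file_id,
       page_id,
       error_type,
       page_status,
       modification_time
FROM sys.dm_db_mirroring_auto_page_repair
ORDER BY modification_time DESC;
```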
40. Database Mirroring Enhancements
• Enhancements in SQL 2008
– Compression of stream data for which at least a 12.5
percent compression ratio can be achieved.
– Automatic Recovery from Corrupted Pages.
– Page read-ahead during the undo phase.
– Improved use of log send buffers.
41. Strength & Weakness
• Strengths
– Can Mirror Across WAN
– Automatic Failover, and Nearly Instantaneous, Better
than Failover Clustering
– Protects an Entire Database
• Weaknesses
– High-performance (asynchronous) mode requires Enterprise Edition
– Must be Configured Per Database
44. Replication
• Primarily used where
availability is required in
conjunction with scale out of
read activity
• Failover possible; a custom
solution
• Not limited to entire
database; Can define subset
of source database or tables
• Copy of database is
continuously accessible for
read activity
• Latency between source
and copy can be as low as
seconds
45. Transactional Replication
• A high performance data replication solution that provides
granular table level replication
– Logical data movement provides flexibility and
better hardware utilization
• Key scenarios:
– Customized application-specific DR
– Real-time reporting on a secondary server that can be used for Site DR
– Scale out application queries with ability to use any one
database copy for Site DR
• Two types relevant for HA and DR
– Transactional and Peer-to-Peer
46. Peer-to-Peer Replication
• Provides high availability and read scalability
• Builds redundancy by eliminating single point of failure
• Enable online upgrades of servers
• Maximize Application Uptime
• Support for both Ring and Grid Topology
• Centralized Management using Management Studio
[Diagram: four Peer Nodes replicating to one another]
47. New Features
[Diagram: User Requests reach the Application Server, which load-balances Read and Write traffic across nodes holding Replicated Data]
48. Strength & Weakness
• Strengths
– Perpetual or on-demand replication of data, local or
remote
– Protects (duplicates or merges) the exact portion of the
database I want
• Weaknesses
– Configured per database, even per table
– Generally does not protect or duplicate an entire
Database
51. Failover Clustering
• Instance level protection built on Windows
Failover Clustering shared disk model
– Cluster nodes typically co-located within the
same site to provide local HA
– Regional DR possible using VLAN and stretch
storage level replication
• No built in data redundancy like database
mirroring and log shipping
– Data protection has to be provided at the
storage level or by combining with other solutions
53. SQL Server Cluster Topologies
• Supports many scenarios:
• Single Instance
• Multiple Instance
• Multiple Active Nodes
• N+1: N Active, 1 Inactive Nodes
• N+M: N Active, M Inactive Nodes
[Diagram: instances Inst1, Inst2 and Inst3 placed on cluster nodes for the Failover Cluster, Multiple Active Nodes, N+1 and N+M topologies]
54. Failover Clustering (Facts)
• Redundancy at database instance level
– All databases fail over together
– Shared copy of system databases
• Single data copy on shared storage device
– No I/O overhead reducing throughput
– Storage unit is single point of failure for cluster
• All database services are clustered
– SQL Agent; Analysis Services; Full-Text engine, MS DTC
• Automatic failover (up to minutes)
• DBMS accessed over virtual IP
• Storage is controlled by one cluster node at a time
• Requires hardware certified by Microsoft for Microsoft
Cluster Service
55. Strength & Weakness
• Strengths
– Provides Protection Against a Node Failure, Protects
the Entire SQL Instance
– Automatic Failover Supported
• Weaknesses
– Generally Expensive, Requires Specialty Hardware
– Not Trivial to Configure and Manage
– Doesn't Protect Against a Complete Site Failure
57. Best Practices
• Backup your system databases after
modifications.
• Test if backups are restorable.
• Practice / Test your disaster recovery plans.
• Documentation is not only for you.
• Keep dedicated DR Server ready.
• Use BACKUP CHECKSUM features.
• Run DBCC CHECKDB regularly.
• Don't ignore any runtime errors.
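Two of these practices translate directly into commands. A minimal sketch, with placeholder database name and paths, that would normally be scheduled as SQL Server Agent jobs:

```sql
-- Run an integrity check regularly; report errors only.
DBCC CHECKDB (AdventureWorks) WITH NO_INFOMSGS, ALL_ERRORMSGS;

-- Back up the system databases after modifications (new logins, jobs, configuration).
BACKUP DATABASE master TO DISK = N'X:\Backups\master.bak' WITH CHECKSUM, INIT;
BACKUP DATABASE msdb   TO DISK = N'X:\Backups\msdb.bak'   WITH CHECKSUM, INIT;
```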
59. Always On Solution Characteristics
Criteria: RPO (No Data Loss, RPO=0); Failover (Failover RTO; Unit: Inst, DB, Tab; Auto Failover); Redundancy and Utilization (Read, Multiple, Write); Cost (Hardware, App Perf Impact, Manageability)
Solution                    Hardware   App Perf Impact   Manageability
Log Shipping *              Low        Low               Low
DBM Sync * **               Low        High              Low
DBM Async *                 Low        Low               Low
Cluster                     High***    Low***            Low***
Transactional Replication   Low        Low               High
Peer-Peer Replication       Low        Low               High
* Log Shipping and Database Mirroring can provide point in time read capability using STANDBY or database snapshots respectively
** Database Mirroring provides fastest failover to hot secondary
*** Depends on SAN technology
60. Recap
• Application availability requirements or SLA drive primary solution choices
– RPO and RTO are the key metrics used to define the SLA
• Need mitigation against planned and unplanned downtimes
• Multiple solution choices that provide varying cost/benefits
• Other requirements apart from application SLA factor into the choice
• Understand constraints and tradeoffs you can make
[Diagram: Application Availability against Planned Downtime and Unplanned downtime, with Database Mirroring, Clustering, Log Shipping and Peer-Peer Replication as solution choices]
62. AdventureWorks Inc Scenario
• Adventureworks Inc is a manufacturing company that manufactures and sells bicycles across the world. There are a number of applications, some that are mission critical, that run on multiple SQL Server Instances
• The DBA team is run by Darren who is responsible for deploying and managing the application databases. One of his core responsibilities is to ensure availability of all application databases in order to meet the application SLA
• One datacenter located in Omaha
• Three applications
– Manufacturing – Tier 1
– Finance – Tier 2
– Scheduling – Tier 3
• Manufacturing application runs on a dedicated SQL Server 2008 Instance
– All other applications run on a second instance
• Availability of manufacturing application is critical
• Implement a solution at the lowest possible cost
63. Application Requirements
Applications: Manufacturing, Finance, Scheduling
Criteria: Data Loss RPO=0; RTO in secs; Failover Unit (Inst, DB, Tab); Auto Failover; Read; Multiple Sites; Read Write
• Manufacturing application has strict SLAs
• Finance application requires readability on the secondary
– The reports are run every 4 hours and need to be fresh as of the last one hour. To offload the reporting load from the main system they would like to utilize the mirror
64. Solution Choice for Manufacturing Application
Solutions compared: Cluster, SAN Replication, DBM - Sync, DBM - Async, Log Shipping, Transactional Replication, Peer-Peer Replication
Criteria: Data Loss RPO=0; Fast RTO; Failover Unit (Inst, DB, Tab); Auto Failover; Read Copy; >1 Sites; Read Write
• Clustering can provide a zero data loss solution that can also provide fast instance level failover
• Use RAID configuration to provide data redundancy on the SAN
• If a redundant copy is required that can provide instance failover with zero data loss use SAN replication
– High Cost
• Use synchronous database mirroring if instance failover is not needed
Solution: Clustering with RAID
65. Solution Choice for Finance Application
Solutions compared: Cluster, SAN Replication, DBM - Sync, DBM - Async, Log Shipping, Transactional Replication, Peer-Peer Replication
Criteria: Data Loss RPO=0; Fast RTO; Failover Unit (Inst, DB, Tab); Auto Failover; Read Copy; >1 Sites; Read Write
• For database level redundancy with acceptable data loss with minimal perf impact, asynchronous database mirroring is an optimal choice
• Use database snapshots at periodic intervals to provide a readable snapshot of the data for reporting
• Low cost solution
[Diagram: Omaha Datacenter. Finance and Scheduling use Async Database Mirroring; a Db Snapshot is taken every hour to serve Reports.]
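The hourly snapshot on the mirror could be created by a scheduled job along these lines. This is only a sketch: the database name `Finance`, the logical file name `Finance_Data`, the snapshot path and the naming scheme are all assumptions, and a real job would also drop snapshots that are no longer needed.

```sql
-- Run on the mirror server: create a snapshot named after the current time, e.g. Finance_snap_1400.
DECLARE @name sysname =
    N'Finance_snap_' + REPLACE(CONVERT(nchar(5), GETDATE(), 108), N':', N'');

DECLARE @sql nvarchar(max) =
    N'CREATE DATABASE ' + QUOTENAME(@name) +
    N' ON (NAME = Finance_Data, FILENAME = N''X:\Snapshots\' + @name + N'.ss'')' +
    N' AS SNAPSHOT OF Finance;';

EXEC sys.sp_executesql @sql;
```

Reports then connect to the most recent snapshot, which keeps the data no more than an hour old.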
66. Adding a Regional Datacenter Into the Mix
• Regulatory and compliance requirements drive the need for having an additional datacenter within a 10 mile radius to provide redundancy against site level failure.
– It is now required that all applications have the ability to failover to the regional datacenter across the river in Council Bluffs
• The SLA needs to be maintained for tier 1 applications even in the case of site failures
67. Regional Site Solution Choices
[Diagram: Omaha Datacenter and CB Datacenter. Manufacturing: Cluster with SAN, Sync Mirroring with no witness. Finance and Scheduling: Async Database Mirroring, Db Snapshot every hour for Reports, Log Shipping.]
68. A Complete Topology
• Considering the potential of floods and tornadoes destroying the regional data centers, Adventureworks Inc wants to maintain a disaster recovery site in San Antonio, TX
• The disaster recovery site has lower SLA
requirements for all applications
– The manufacturing application can have an RPO of 1
hour
– The RTO is set at 4 hours
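If log shipping feeds the disaster recovery site, the 1 hour RPO can be watched from the log shipping monitor tables. A minimal sketch, run on the secondary or monitor server; the 60 minute threshold simply restates the RPO above.

```sql
-- Secondary databases whose last restored log backup is more than 60 minutes old.
SELECT secondary_server,
       secondary_database,
       last_restored_date,
       DATEDIFF(MINUTE, last_restored_date, GETDATE()) AS minutes_since_last_restore
FROM msdb.dbo.log_shipping_monitor_secondary
WHERE DATEDIFF(MINUTE, last_restored_date, GETDATE()) > 60;
```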
69. Topology Diagram
[Diagram: Manufacturing runs on a Cluster with SAN, uses Sync Mirroring with no witness, and Log Shipping to the disaster recovery site]
70. Scale Out and Availability
Scenario
• Adventureworks is building a new web based order management system that allows customers from all over the world access the system and place orders
• The core group of customers are in Western Europe, South East Asia and North America
Requirements
– Geo Redundancy
– Data Locality
– High Availability
– Local Read-Scale
Workload Characteristics
– Mainly reads
– Few writes
Application Characteristics
– Each user logging in connects to a particular server
• Partitioned based on user-id and region
• Writes from a user always happen on one server regardless of the region the user log in from
– All reads redirected to the closest geo-location
• Reasonable tolerance for latency (5-10 minutes)
72. Licensing Facts
• Passive servers are mirror, log
shipped secondary and
clustering passive node
• No license required on passive
if it is truly passive
• A passive server does not need
a license if the number of
processors in the passive server
is equal to or less than the
number of processors in the
active server.
• The passive server can take the
duties of the active server for 30
days. Afterwards, it must be
licensed accordingly.
73. HA Features Edition Support
Edition columns: Express, Workgroup, Standard, Enterprise
Feature                  Comments
Database Mirroring (1)   Advanced high availability solution that includes fast failover and automatic client redirection
Failover Clustering (2)
Backup Log-shipping      Data backup and recovery solution
Online System Changes    Includes Hot Add Memory, dedicated administrative connection, and other online operations
Online Indexing
Online Restore
Fast Recovery            Database available when undo operations begin
(1) Single thread redo
(2) Limited to 2 node cluster
74. Summary
• There is no "one size fits all" solution
• Consider the costs, benefits and constraints and compare them to the availability requirements of the organization to determine the best solution
• Use the charts to understand cost, benefit and constraints of the various SQL Server High Availability solutions
• TEST the solution to ensure it can meet the availability requirements and meet SLAs
76. SQL Server AlwaysOn:
Mission Critical Capabilities in SQL
Server “Denali”
• Jon Jahren
• Exec VP, Prediktor
• jon.jahren@prediktor.no
77. High Availability and Disaster Recovery
SQL Server “Denali” AlwaysOn
• Faster failover, easier administration with Availability Groups
• Identify databases to failover as a unit to reduce unplanned downtime
• Faster application failover using virtual name
• Increase application uptime using flexible failover policy
• Enable better data redundancy and protection with up to four secondaries and up to two synchronous secondaries
• Limited downtime with enhanced online operations
• Run Microsoft SQL Server® on Windows Server® Core to reduce planned downtime (50-60% fewer OS patch reboots)
[Diagram: replicas on Shared Storage and Non-Shared Storage, plus a Disaster Recovery site]
78. Maximize Resources
Higher return on high availability investments
• Increase hardware utilization through active secondaries for
backups, reporting, and ad hoc queries
• Reuse existing infrastructure with support for both SAN and
direct attached storage
Simplify management and administration
• Integrated manageability for one-stop configuration
• Easy setup and monitoring integrated into Microsoft SQL
Server Management Studio
• Availability Groups that provide failover units with contained
dependencies (such as logons)
79. Breakthrough Performance and Scale
• Dramatically faster star-join query processing, much faster than current SQL Server (~10X)
• Query speed increase varies with query and data
• Reduced I/O
• Consistent query performance
• Reduced performance tuning effort
80. Mission Critical High Availability Solution
• Meets mission critical high availability SLA
• Integrated, Flexible, Efficient
• Microsoft recommended prescriptive HA solutions and customer references
81. Introducing SQL Server AlwaysOn
Integrated, Flexible, Efficient high Availability for mission critical business
A high availability platform built for the future
AlwaysOn provides database level and instance level protection
AlwaysOn Availability Groups for database protection
• Multi-Database Failover
• Multiple Secondaries
• Active Secondaries
• Integrated HA Management
AlwaysOn Failover Cluster Instances for instance level protection
• Multisite Clustering
• Flexible Failover Policy
• Improved Diagnostics
• Built for consolidation scenarios
82. AlwaysOn – A flexible solution
AlwaysOn provides the flexibility of different HA configurations
• Direct attached storage: local, regional and geo target
• Shared Storage: regional and geo secondaries
• Synchronous Data Movement
• Asynchronous Data Movement
84. AlwaysOn Availability Groups
AlwaysOn Availability Groups is a new feature that enhances and combines database mirroring and log shipping capabilities
Flexible
• Multi-database failover
• Multiple secondaries
– Total of 4 secondaries
– 2 synchronous secondaries
– 1 automatic failover pair
• Synchronous and asynchronous data movement
• Built in compression and encryption
• Automatic and manual failover
Integrated
• Application failover using virtual name
• Configuration Wizard
• Dashboard
• System Center Integration
• Rich diagnostic infrastructure
• File-stream replication
Efficient
• Active Secondary
– Readable Secondary
– Backup from Secondary
• Automation using power-shell
85. Availability Groups Virtual Name
Availability Groups Virtual Name allows applications to failover seamlessly to any secondary
– Application reconnects using a virtual name after a failover to a secondary
– Application retries during failover
– Connects to the new primary once failover is complete and the virtual name is online
[Diagram: availability group AG_HR with virtual name HR_VNN spans ServerA, ServerB and ServerC, each hosting a copy of the HR DB. The application connects with -server HR_VNN;-catalog HRDB; after failover a former Secondary becomes the Primary.]
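In the released SQL Server 2012 syntax the virtual name is an availability group listener; the Denali preview builds may differ. A sketch using the names from the slide, with an assumed static IP address and port:

```sql
-- Add the virtual network name HR_VNN to availability group AG_HR.
-- The IP address, subnet mask and port are placeholders.
ALTER AVAILABILITY GROUP AG_HR
ADD LISTENER N'HR_VNN' (
    WITH IP ((N'10.0.0.50', N'255.255.255.0')),
    PORT = 1433
);

-- Clients then connect to the listener rather than to a server name, for example:
--   Server=HR_VNN;Database=HRDB;Integrated Security=SSPI
```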
87. What about Server Objects?
• Introducing Contained Databases or CDBs
– Unit of application programmability in Denali
– A DB which establishes a boundary between application and server
• CDBs sever the user-login relationship
– Windows users no longer need matching logins
– Users with passwords replace SQL logins
• Authentication information moves with the CDB
88. Databases are not always easy to move
[Diagram: a User DB depends on objects outside itself. Master: Instance Collation, Logins, Credentials, Linked Server Defs., CLR, … MSDB: Agent, Replication, DB Mail, … TempDB: Collation, Temp objects. Also Other DBs and Other Apps.]
89. Introducing the Contained Database
• New database option – CONTAINMENT
• Only option supported in Denali is PARTIAL, meaning non-enforced containment
• Partially contained databases solve problems related to:
• Logins: Database Users with passwords or mapped directly to Windows principals
• System Collation: Temp tables use the database's collation
• sys.dm_db_uncontained_entities will display all potential containment breaches
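Turning on partial containment looks roughly like this in the released SQL Server 2012 syntax. The database name `HRDB` and the user `hr_app` are placeholders, and the password must be replaced with one that meets the instance's policy.

```sql
-- Allow contained database authentication on the instance.
EXEC sys.sp_configure N'contained database authentication', 1;
RECONFIGURE;

-- Mark the database as partially contained.
ALTER DATABASE HRDB SET CONTAINMENT = PARTIAL;

USE HRDB;

-- A database user with a password: no matching login in master is needed.
CREATE USER hr_app WITH PASSWORD = N'<replace with a strong password>';

-- List objects and statements that still reach outside the database boundary.
SELECT class_desc, major_id, statement_type, feature_name
FROM sys.dm_db_uncontained_entities;
```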
90. Availability Group Architecture
[Diagram: a Windows Server Failover Cluster hosting copies of a Database kept in step by Active Log Synchronization]
• Availability Group uses Windows Server Failover Cluster (WSFC) for
– Inter-node health detection
– Failover coordination
– Primary health detection
– Distributed data store for settings and state
– Distributed change notifications
• WSFC: Common Microsoft Availability Platform
– SQL Server AlwaysOn Failover cluster instances
– SQL Server AlwaysOn Availability Group
– Microsoft Hyper-V
– Microsoft Exchange
– Built-in WSFC workloads (e.g. file share, NLB, etc) and third party workloads
91. AlwaysOn Availability Group
Instance Preparation
1. Install WSFC on each machine and create a single WSFC cluster
2. Install SQL Server Instances on each machine
3. Enable AlwaysOn through SQL Configuration Manager
4. CREATE ENDPOINT on each instance
• Notes:
– Steps 1 and 2 can occur in any order (except for AlwaysOn Failover Cluster
Instance (FCI) installation which of course requires WSFC installed)
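Steps 3 and 4 can be checked from T-SQL on each instance. A small sketch, assuming the released SQL Server 2012 property and catalog view names:

```sql
-- Step 3: returns 1 when AlwaysOn is enabled on this instance.
SELECT SERVERPROPERTY('IsHadrEnabled') AS is_hadr_enabled;

-- Step 4: the endpoint used for availability group traffic should exist and be STARTED.
SELECT name, state_desc, role_desc
FROM sys.database_mirroring_endpoints;
```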
92. WSFC Cluster vs. SQL Server “Cluster”
Setup
• Install WSFC feature
• Setup WSFC cluster
• Configure SAN and Shared Disks
• Install SQL Server Failover Cluster Instance (FCI):
– Specify resource group
– Select shared disks
– Configure virtual IPs
– Configure virtual network names
– Specify domain accounts for services
– Configure domain groups*
94. Availability Group Concepts Recap
• Availability Group
– Defines the high availability requirements
• Databases, Replicas, Availability Mode,
Failover Mode etc
• Availability Replica
– SQL Server Instances that are part of the
availability group which hosts the physical
copy of the database
– Role: Primary, Secondary, Resolving
• Availability Database
– SQL Server database that is part of an
availability group
– This can be a regular database or contained
database
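These concepts come together in the statement that creates the group. The sketch below uses the released SQL Server 2012 syntax, which may differ from the Denali preview; server names, endpoint URLs and the database name are placeholders.

```sql
-- On the primary: define the group, its database, and each replica's modes.
CREATE AVAILABILITY GROUP AG_HR
FOR DATABASE HRDB
REPLICA ON
    N'ServerA' WITH (
        ENDPOINT_URL      = N'TCP://ServerA.example.com:5022',
        AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
        FAILOVER_MODE     = AUTOMATIC),
    N'ServerB' WITH (
        ENDPOINT_URL      = N'TCP://ServerB.example.com:5022',
        AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
        FAILOVER_MODE     = AUTOMATIC),
    N'ServerC' WITH (
        ENDPOINT_URL      = N'TCP://ServerC.example.com:5022',
        AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
        FAILOVER_MODE     = MANUAL);

-- On each secondary, after restoring HRDB WITH NORECOVERY:
ALTER AVAILABILITY GROUP AG_HR JOIN;
ALTER DATABASE HRDB SET HADR AVAILABILITY GROUP = AG_HR;
```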
95. Availability Group Architecture Drilldown
• Client connections transparently redirected to primary via IP and network name resources
[Diagram: three SQL Server Instances host Availability Group 1 and Availability Group 2; each instance has an AG Res DLL and a WSFC Service. Events shown during a failover:]
– User tells SQL to failover Availability Group 2 to Node1
– WSFC tells AG Res DLL to bring AG2 offline
– Clients disconnected from AG2
– WSFC tells AG Res DLL to bring AG2 online
– SQL confirms and tells WSFC
– Notification of new primary
– Secondaries request primary connection
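The "user tells SQL to failover" step is a single statement in the released SQL Server 2012 syntax. A sketch with `AG2` as a placeholder group name:

```sql
-- Run on the synchronized secondary replica that should become the new primary.
ALTER AVAILABILITY GROUP AG2 FAILOVER;

-- If the primary is lost and no synchronized secondary exists (possible data loss):
-- ALTER AVAILABILITY GROUP AG2 FORCE_FAILOVER_ALLOW_DATA_LOSS;

-- Check which replica now holds the primary role.
SELECT ag.name AS availability_group,
       ar.replica_server_name,
       ars.role_desc
FROM sys.dm_hadr_availability_replica_states AS ars
JOIN sys.availability_replicas AS ar ON ar.replica_id = ars.replica_id
JOIN sys.availability_groups   AS ag ON ag.group_id   = ars.group_id;
```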
96. Active Secondary – Making Secondary Readable
[Diagram: two SQLservr.exe instances. InstanceA is Primary for DB1 and Secondary for DB2; InstanceB is Secondary for DB1 and Primary for DB2. Reports run against the secondary copies.]
• Readable secondary allows offloading read queries to secondary
• Close to real-time data; latency of log synchronization impacts data freshness
• Read applications can reconnect to another secondary on failover
• Not a replacement for replication scenarios
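A secondary becomes readable through a replica option. The sketch uses the released SQL Server 2012 syntax with placeholder group and server names.

```sql
-- Allow read-only connections to ServerB whenever it is in the secondary role.
ALTER AVAILABILITY GROUP AG_HR
MODIFY REPLICA ON N'ServerB'
WITH (SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY));

-- Reporting clients declare their intent in the connection string, for example:
--   Server=ServerB;Database=HRDB;ApplicationIntent=ReadOnly
```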
97. Active Secondary: Enabling Backup On Secondary
• Backups can be done on any replica of a database
• Secondary replica may be synchronous or asynchronous
• Backups on primary replica still work
• Log backups done on all replicas form a single log chain
• Recovery Advisor makes restores simple
[Diagram: the Primary takes the R/W workload; Backups are taken on the Primary and on each Secondary]
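A backup job that runs on every replica but only acts on the preferred one might look like this. This is a sketch against the released SQL Server 2012 behaviour, where full backups on a secondary must be COPY_ONLY; the database name and paths are placeholders.

```sql
-- Back up only if this replica is the preferred backup replica for the database.
IF sys.fn_hadr_backup_is_preferred_replica(N'HRDB') = 1
BEGIN
    -- Full backups taken on a secondary are copy-only.
    BACKUP DATABASE HRDB
    TO DISK = N'X:\Backups\HRDB_full.bak'
    WITH COPY_ONLY, COMPRESSION, INIT;

    -- Log backups are regular and join the single log chain.
    BACKUP LOG HRDB
    TO DISK = N'X:\Backups\HRDB_log.trn'
    WITH COMPRESSION, INIT;
END;
```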
98. Readable Secondary Latency
[Diagram: on the Primary, a Commit goes to the Log Cache, Log Flush writes it to the DB1 log, and Log Capture sends it over the Network. On the Secondary, Log Apply receives it into the Log Cache, Log Harden writes it to the DB1 log and an Acknowledge returns to the Primary; the Redo Thread then redoes pages and the DB1 data page is updated.]
• Updated data is visible on the readable secondary as and when the page is redone
• Redo happens asynchronously after log hardening on the secondary
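How far a readable secondary lags can be estimated from the send and redo queues. A minimal query, assuming the released SQL Server 2012 DMV names:

```sql
-- Per-database synchronization state; queue sizes are reported in KB.
SELECT DB_NAME(drs.database_id) AS database_name,
       ar.replica_server_name,
       drs.synchronization_state_desc,
       drs.log_send_queue_size,   -- log not yet sent to the secondary
       drs.redo_queue_size,       -- log hardened but not yet redone
       drs.last_hardened_time,
       drs.last_redone_time
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar
    ON ar.replica_id = drs.replica_id;
```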
99. Readable Secondary Behavior
• Contention between redo thread and query thread avoided by
– Internally mapping read workload to non blocking isolation levels
• Read Uncommitted -> Snapshot Isolation
• Read Committed -> Snapshot Isolation
• Repeatable Read -> Snapshot Isolation
• Serializable -> Snapshot Isolation
– Ignoring all locking hints
• Maintains query performance on secondary compared to primary
– Auto-create statistics on the secondary replica but persist them in TempDB
101. Key Enhancements
• Fast instance failover through predictable database recovery time
• Flexible Failover Policy
– Eliminates false failover
– Configurable failure condition levels
– Better diagnostics
• Native support for multi-site clustering across subnets
– Enable DR using failover cluster instances
• SMB support enables consolidation of more than 26 instances
• Support TEMPDB on local drive
102. AlwaysOn Failover Cluster Instance
• AlwaysOn Failover Cluster Instance provides
instance level failover
• Key Enhancements
– Multi-site clustering across subnets
– Flexible Failover Policy
– Improved system diagnostics
– Support for network attached storage (NAS) using
SMB
– Support for tempdb on local drive
103. Multi-Site Clustering
• Multi-site clustering provides protection from site failures
• AlwaysOn Failover Cluster Instance natively supports multi-site clustering without requiring a VLAN
– Each site can have a separate IP subnet
– The DNS entry is updated to reflect the current IP address on failover
104. Flexible Failover Policy
User sets new Cluster properties HealthCheckTimeout and FailureConditionLevel
• FailureConditionLevel (0 to 5):
– 5 – Failover or restart on any qualified failure conditions
– 4 – Failover or restart on moderate SQL Server errors
– 3 – Failover or restart on critical SQL Server errors
– 2 – Failover or restart on SQL Server unresponsive
– 1 – Failover or restart on SQL Server down
– 0 – No automatic failover or restart
• Diagnostics returned regardless of FailureConditionLevel
• All levels optimized to minimize false failures
[Diagram: the SQL Server Failover Cluster Instance generates diagnostics for the health state components (System, Resource, Query Processing, IO Subsystem, Events). The FCI Res DLL runs exec sp_server_diagnostics, which returns the diagnostics periodically. The WSFC Service asks the Res DLL whether the SQL FCI is alive, and the IsAlive/LooksAlive result is based on the diagnostics and FailureConditionLevel.]
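The FailureConditionLevel values are cumulative: each level reacts to everything the lower levels react to. A minimal Python sketch of that decision follows, assuming each detected condition is tagged with the lowest level that acts on it. The condition names are illustrative labels derived from the level descriptions, not SQL Server identifiers, and the function is a toy model rather than the resource DLL's actual logic.

```python
# Lowest FailureConditionLevel at which each condition triggers
# failover or restart. Labels are illustrative, not SQL Server names.
TRIGGER_LEVEL = {
    "server_down": 1,
    "server_unresponsive": 2,
    "critical_error": 3,
    "moderate_error": 4,
    "other_qualified_failure": 5,
}


def should_failover(condition, failure_condition_level):
    """Return True if the policy level reacts to the detected condition."""
    if not 0 <= failure_condition_level <= 5:
        raise ValueError("FailureConditionLevel must be between 0 and 5")
    # Every trigger level is at least 1, so level 0 never matches:
    # no automatic failover or restart.
    return TRIGGER_LEVEL[condition] <= failure_condition_level


at_level_3 = [c for c in TRIGGER_LEVEL if should_failover(c, 3)]
# -> ['server_down', 'server_unresponsive', 'critical_error']
```

In this model the diagnostics are always collected; the level only decides whether a given condition leads to action.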
105. Reducing Planned Downtime
• Support for Windows Server Core
– Reduces OS patching by as much as 50-60%
• Support for rolling upgrade and patching of SQL Server for both Availability Groups and Failover Cluster Instances
• Fast failover time for both Availability Groups and Failover Cluster Instances
• New online operations supported
– LOB index
– Adding a column with a default
107. Flexible Solution Choices
• AlwaysOn Availability Groups
• AlwaysOn Failover Cluster Instances
• AlwaysOn Multi-site Failover Cluster Instances
Optionally combine with Availability Groups for DR
108. Virtualization with AlwaysOn Guidance
Virtualization provides best consolidation isolation
Virtualization without AlwaysOn:
Simplest management story for limited HA/DR:
        Planned                  Unplanned
Host    Live Migration           VM failover (OS restart)
Guest   Downtime during patch    No protection from virtualization
When to use AlwaysOn for the guest:
Need better HA/DR protection than standalone VM
109. Available Now – CTP1
• SQL Server Code Name Denali CTP1 is now public
• CTP1 includes the following features, which you can test and provide feedback on:
– AlwaysOn Failover Cluster Instance Features are RTM Quality:
• Multi-Subnet Failover
• Flexible Failover Policy
– AlwaysOn Availability Groups Preview
• Ability to configure availability groups through T-SQL, SSMS, and PowerShell
• Multiple databases support in availability groups
• Read-only access to the secondary
• Support for Filestream data type
• Manually failing over and resynchronizing without reseeding
• Failing over client connections using the new connectivity story based on virtual
network names and virtual IP addresses
• Including logins in user databases through a Contained Database
• SSMS, Catalog Views, and DMVs to view and monitor state
• Support for multiple availability groups on the same instance
• Support for availability groups on standalone instances and/or failover cluster
instances
110. Conclusion
• SQL Server AlwaysOn is a comprehensive high availability solution
– Better application availability,
– Higher return on investment and
– Simplified deployment and management
• AlwaysOn Availability Group and AlwaysOn Failover Cluster Instance provide flexibility in HA configuration
• Windows Server Core support significantly reduces downtime due to patching
• SQL Server AlwaysOn Availability Group
– Multi-database failover
– Multiple secondaries
– Synchronous and asynchronous data movement
– Built-in compression and encryption
– Automatic and manual failover
– Flexible failover policy
– Automatic Page Repair
– Readable secondary
– Secondary backup
– Automatic application redirection using virtual name
– Configuration Wizard
– AlwaysOn Dashboard
– System Center Integration
– Automation using PowerShell
– Rich diagnostic infrastructure
• SQL Server AlwaysOn Failover Cluster Instance
– Multi-site clustering across subnets
– Flexible Failover Policy
– Improved system diagnostics
– Support for network attached storage (NAS) using SMB
– Support for tempdb on local drive
111. AlwaysOn Resources
"Denali" AlwaysOn Resource Center: http://msdn.microsoft.com/en-us/sqlserver/gg490638(en-us,MSDN.10)
CTP download
Documentation
MSDN forums
Microsoft Connect
AlwaysOn Blog
Credits :
Vinod Kumar
Balmukund Lakhani
Matt Hollingsworth
Jon Jahren