High Availability Microsoft SQL
     Server Database Architecture
VM HA and Symantec Application Availability vs. Microsoft Clustering


                                                   February 2012
The Problem
Difference of opinion on how to build a “High Availability” database environment.



    Infrastructure Team prefers: VMware High Availability
                                 & Symantec ApplicationHA



    Architecture Team prefers:   Microsoft Failover Clustering
Factors driving the difference of opinion
Infrastructure Team
•   Prefers VM HA / ApplicationHA because, out of the box, it
    provides high availability without the cost or complexity
    of traditional clustering solutions
•   Unfamiliar with MS Clustering Services
•   MS Clustering restricts use of VMotion dynamic scaling; moving
    clustered applications between blades will require guest
    OS downtime
•   Clustering adds complexity to backup procedures

Architecture Team
•   Prefers MS Clustering because it is well integrated at the
    application level and is industry best practice
•   Unfamiliar with VMware HA and Symantec ApplicationHA
•   Concerned that ApplicationHA will not recognize all
    circumstances that cause application unavailability
•   Undefined scripting effort required for application
    monitoring with VM HA, and continuing M&O will be
    required to support the scripts
•   Concerned that VM HA and present M&O support will
    not deliver the required solution availability during hours of
    operation
HA Drivers (subset)
Clinical Application                     Business Days   Hours of Use   Availability Requirement (%)
Appeals Tracking                            7 Days       0700 - 1900       99.999
Document Management System                  7 Days       0600 - 1800       99.999
SharePoint                                  7 Days       0700 - 1900       99.99
Clinical Operations Review System           5 Days       0800 - 1700       99.999
Dental Imaging                                  Clustering Mandated by Vendor
Dictation and Transcription                 7 Days        24 Hours         99.99
Digital Signature                           7 Days        24 Hours         99.99
Information Portal                          7 Days        24 Hours         99.99
Radiology Information System                    Clustering Mandated by Vendor


  Business days, hours of use, and availability requirements were
  obtained from available business requirements documents and
  verbally from user leadership.
Microsoft Clustering
 Pros
 •   Supports application level awareness
 •   Will survive a single node OS system crash
 •   Provides a redundant Node in the event of a SQL Node failure
 •   Minimizes downtime
 •   Permits an automatic response to a failed server or software (no human intervention)
 •   Supports upgrades without forcing users off the system for extended periods of time
 •   Applications connected to SQL remain available while maintenance/patching is
     performed on the redundant Node
 •   Doesn’t require any servers to be renamed - when failover occurs, it is transparent to
     end-users
 •   Faster recovery during HA events, e.g. Node BSOD, SQL connection or authentication
     failures
 •   Failing back is quick, and can be done once the primary server is fixed and put back
     online
 •   Is a Microsoft supported solution
 •   Works without snapshots
Microsoft Clustering
 Cons
 •   Additional Cost to deploy and maintain the redundant Nodes
 •   Potential added environment cost for active/passive implementations
 •   Decreased use of VM functionality (no VMotion…)
 •   Added implementation and management complexity
 •   Requires more experienced DBAs and network administrators
 •   Complexity added to SQL and VMware environment
 •   Any HA event requires server admin and/or DBA interaction, anywhere from Node reboot to
     rebuild [not self healing]
 •   In a situation where both Nodes have failed, recovery time may be greatly increased due to
     the added complexity
 •   No snapshot or full virtual machine backup option available – a Node or Cluster loss could
     require a rebuild (RTO = days, not hours)
        – (This is a wash: backup / recovery options are exactly the same for HA vs.
           Clustering, because SQL does not support snapshots)
 •   VMware host patching/maintenance would have to be done after hours and would require
     DBA participation
        – (Would potentially require a DBA, but would NOT require after hours, because
           failover can be forced)
 •   VMware functionality is reduced for all clustered SQL Nodes, e.g. Snapshot, vMotion, DRS,
     Storage DRS, Storage vMotion
        – (Snapshotting is not supported)
VM HA + ApplicationHA
     Pros
     •   Eliminates the need for dedicated standby hardware and the installation of additional
         software
     •   Less infrastructure implementation effort
     •   Supports full range of VM functionality (leads to maximized resource utilization)
     •   Reduced implementation and management complexity
     •   Application agnostic
     •   Reduced Cost due to the fact that no redundant Node is necessary for HA
     •   Reduced complexity for SQL and VMware environments
            – (This is not accurate: once you add in Symantec ApplicationHA, at best this is
               a wash, and at worst you’ve created a new development M&O project which is
               far more complex than additional hardware)
     •   In a situation where the SQL Server has failed entirely, recovery time is much shorter
         since we will leverage a complete virtual machine recovery option through Symantec
         NetBackup (RTO = minutes or hours)
            – (See the earlier note: this is a wash, since bare metal recovery will be required
               in either situation because snapshots aren’t supported)
     •   VMware host patching/maintenance could be accomplished without after-hours
         maintenance windows or DBA participation
     •   In many HA events, e.g. SQL connection or authentication failures, ApplicationHA can
         take action against individual Windows and SQL components, eliminating reboot as the
         only option for resolution [self healing]
            – (This concept of self healing vs. not self healing is a red herring: if the
               server dies and anything except a reboot is required, neither setup is
               “self healing”)
     •   Full VMware functionality can be realized for the SQL Servers, e.g. Snapshot, vMotion,
         DRS, Storage DRS, Storage vMotion
            – (Again, snapshots are not supported)
VM HA + ApplicationHA
     Cons
     •   Added application dev implementation effort to support application awareness,
         and continuing M&O (additional coverage required)
     •   Added complexity – multiple components of HA solution
     •   An OS crash will result in downtime and requires human intervention
     •   If VM HA fails to recognize a system crash, human intervention is required
     •   Requires snapshotting
            – (Snapshotting of SQL and SharePoint is not supported by Microsoft due to
                data corruption issues)
     •   Some HA events (e.g. BSOD, or SQL connection or authentication failures that
         ApplicationHA was not able to resolve) may require the server to be restarted,
         which could take approximately 30-60 seconds
     •   Applications connected to SQL are not available while maintenance/patching is
         performed on the SQL Server during scheduled maintenance windows. If
         something happens to the server during patching, full recovery must be
         executed before service availability returns.
     •   To adhere to VMware recommended best practices for true HA, a hot standby
         database server must be running, with SQL Server started and replication
         established between the two databases. In the event of a failure, an
         application developer must manually redirect the application. This is added
         DBA complexity and added reliance on AppDev
Business Sets Availability Requirements!
      Availability            Downtime per Year         Downtime per Week
     90% (1-nine)            36.5 days/year
     99% (2-nines)           3.65 days/year
     99.9% (3-nines)         8.76 hours/year           10 minutes/week
     99.99% (4-nines)        52 minutes/year           1 minute/week
     99.999% (5-nines)       5 minutes/year            6 seconds/week
     99.9999% (6-nines)      31 seconds/year
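
The figures in this table follow from simple arithmetic: the downtime budget is (1 - availability) multiplied by the length of the period. A minimal Python sketch, assuming a 365-day year and a 7-day week measured around the clock (the function and constant names are illustrative and do not come from the deck):

```python
SECONDS_PER_YEAR = 365 * 24 * 3600
SECONDS_PER_WEEK = 7 * 24 * 3600


def allowed_downtime_seconds(availability_pct, period_seconds):
    """Return the downtime budget, in seconds, for an availability percentage."""
    return (1 - availability_pct / 100) * period_seconds


# Recompute the rows of the table (values are rounded in the slide).
for pct in (90, 99, 99.9, 99.99, 99.999, 99.9999):
    per_year = allowed_downtime_seconds(pct, SECONDS_PER_YEAR)
    per_week = allowed_downtime_seconds(pct, SECONDS_PER_WEEK)
    print(f"{pct}%: {per_year / 60:.2f} min/year, {per_week:.1f} s/week")
```

The slide rounds the results, so small differences from the printed values are expected.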
Need to determine if availability is measured:

1)   During operational time (i.e. expected use), which does not include scheduled
     maintenance windows

2)   On a 24 hr basis which includes non-operational time
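
The choice between these two bases changes the downtime budget noticeably. The sketch below compares them for the Appeals Tracking row of the HA Drivers slide (7 days, 0700 - 1900, 99.999). It assumes a 365-day year with no holidays, and the function name is hypothetical:

```python
def yearly_downtime_budget_minutes(availability_pct, hours_per_day, days_per_week):
    """Downtime budget in minutes per year, counted only over the measured hours."""
    measured_hours = hours_per_day * days_per_week * (365 / 7)
    return (1 - availability_pct / 100) * measured_hours * 60


# Basis 1: only the 12 operational hours per day are measured.
operational = yearly_downtime_budget_minutes(99.999, hours_per_day=12, days_per_week=7)
# Basis 2: all 24 hours are measured, including non-operational time.
around_the_clock = yearly_downtime_budget_minutes(99.999, hours_per_day=24, days_per_week=7)

print(f"operational-hours basis: {operational:.2f} min/year")
print(f"24-hour basis:           {around_the_clock:.2f} min/year")
```

Under basis 1 the budget is smaller, but outages outside the measured hours (for example, scheduled maintenance windows) do not count against it. Under these assumptions, one 60-second restart during operational hours would use a large share of a five-nines yearly budget.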
