SlideShare una empresa de Scribd logo
1 de 39
Fail-Safe Cluster for FirebirdSQL
and something more
Alex Koviazin, IBSurgeon
www.ib-aid.com
• Replication, Recovery and
Optimization Firebird since 2002
• In Brazil since 2006 (Firebase)
• Platinum Sponsor of Firebird
Foundation
• Based in Moscow, Russia
www.ib-aid.com
Agenda
• In general - what is fail-safe cluster?
• What we need to build fail-safe cluster?
• Overview of Firebird versions with high-availability
• Trigger-based and native replication as basis for fail-safe
cluster
• How HQbird Enterprise implements fail-safe cluster
• Alternatives of fail-safe cluster: Mirror/Warm Standby
• Special offer for FDD visitors
What is a fail-safe cluster?
User
1
User
2
User
3
User
N
User
N
What is a fail-safe cluster internally?
User
1
User
2
User
3
User
N
User
N
Firebird master Firebird replica
What is a fail-safe cluster?
User
1
User
2
User
3
User
N
User
N
Firebird master Firebird replica
Synchronization
When master fails, users reconnect to
new master
User
1
User
2
User
3
User
N
User
N
Firebird master Firebird replica
Synchronization
For Fail-Safe Cluster we need
1. At Server Side
• Synchronization of master and replica
• Monitoring and switching mechanism
2. At Application Side
• Reconnection cycle in case of
disconnects
• web is good example
Database fail-safe cluster is NOT:
1) Automatically deployed solution
• Complex setup (2-3 hours of work)
2) True scalable solution
3) Multi-master replication
• Replicas are read only
4) Geographically distributed solution
• Fast network connection required
How Fail-Safe Cluster Work: Applications
From the application point of view
1. End-user applications connect to the
Firebird database, work as usual
2. If Firebird database becomes unavailable,
end-user applications try to reconnect to
the same host (correctly, no error
screens)
3. End-user applications should better use
caching (CachedUpdates) to avoid loss of
uncommitted data
How Fail-Safe Cluster Work: Server
Firebird master
Monitoring
Firebird replica
Monitoring
1.All nodes monitor each other: replica watches
for master
2.If master fails, replica
a) make “confirmation shoot” to master
b) Changes itself as master – script + DNS
After replica fails, there is no
cluster anymore!
Just a single database!
Minimal fail-safe cluster
Firebird master
Monitoring
Firebird replica
1
Monitoring
Firebird replica
2
Monitoring
Minimal fail-safe cluster
Firebird master
Monitoring
Firebird master
2
Monitoring
Firebird replica
2
Monitoring
What is fail-safe cluster good for?
• For web-applications
• Web caches data intensively
• Tolerant to reconnects
• Many small queries
• For complex database systems with
administrator support
HOW TO CREATE FAIL-SAFE
CLUSTER IN FIREBIRD
2 main things to implement fail-safe
cluster in Firebird
1. Synchronization between master and
replica
2. Monitoring and switching to the new
server
Synchronization
Native
• Suitable for high-load
• Easy setup
• Strict master-slave, no
conflicts
• DDL replication
• Very low delay in
synchronization
(seconds)
Trigger-based
• <200 users
• Difficult setup
• Mixed mode, conflicts
are possible
• No DDL replication
• Delay is 1-2 minutes
Comparison of Firebird versions with
replication
HQbird RedDatabase Avalerion Firebird
Status (July 16) Production Production Beta Alpha
Supported Firebird version 2.5, 30 2.5 3.0 4.0
ODS 11.2, 12.0 11.3 ? 13.0
Compatibility with Firebird 100% Backup -
Restore
? 100%
Synchronous Yes Yes No Yes
Asynchronous Yes Yes Yes Yes
Price per master+replica R$1990
R$1100
(FDD only!)
~US$3500 ? Free
(in 2018)
http://ib-aid.com/fdd2016
Native replication in HQbird
Firebird master Firebird replica
1. No Backup/restore needed
2. Sync and Async options
3. Easy setup (configuration only)
Sync: ~1-3 seconds
Async: 1 minute
Steps to setup replication
SYNCHRONOUS
1. Stop Firebird
2. Copy database to
replica(s)
3. Configure replication
at master
4. Configure replication
at replica
5. Start Firebird at master
and replica
ASYNC
1. Stop Firebird
2. Copy database to the
same computer
3. Configure replication
at master
4. Start Firebird
5. Copy and configure at
replica(s)
Configuration for replication
Replication.conf
Sync
<database c:datatest2.fdb>
replica_database sysdba:masterkey@replicaserver:/data/testrepl2.fdb
</database>
Async
<database C:dbwmaster.fdb>
log_directory d:dblogs
log_archive_directory D:dbarchlogs
log_archive_command "copy $(logpathname) $(archpathname)"
</database>
Downtime to setup replication
• Synchronous
• Time to copy database between servers + Time to setup
• Asynchronous
• Time to copy database at the same server
• Suitable for geographically distributed systems
ASYNC is much faster to setup!
Monitoring and switching from master to
replica
• HQbird (Windows, Linux)
• Pacemaker (Linux only)
• Other tools (self-developed)
Elections and Impeachment
• Not about presidents elections 
• Replicas must select new master after old master
failure
Priority list
MASTER1
REPLICA1
REPLICA2
REPLICA3
REPLICA4
Algorithm
1. We have priority list: master, replica1, replica 2
2. Each node watches for all other nodes
3. If master node fails, the first available replica
takes over the
1. Disables original master – stop Firebird server
2. Changes DNS of replica to master
3. Changes other servers configurations
4. Restart Firebird server at new master to changes take
effect
5. Send notifications
After failure – re-initialization
• Downtime needed!
• Stop all nodes
• Copy database to the new/reinitialized node
• Reconfigure replication configuration
• Start Firebird at all nodes
Too complex?
Fail-Safe Cluster is a custom solution
• HQbird gives options, you decide how to
implement them
• Requires careful planning and implementation
• Requires high-quality infrastructure
• Requires 2+ replicas
• Network outage in 10 seconds will cause all replicas to
degrade!
• High-speed network and high speed drives
Do we really need cluster and
high availability?!
Why we REALLY need high-availability
• If database size is big (20-100Gb)
• Automatic recovery with FirstAID is very slow
(downtime > 8 hours) or impossible (>100Gb)
• Manual recovery is expensive (US$1000+) and
slow (1 day+)
• Big database = Big company
• Losses are not tolerated
• Downtime is very expensive ($$$)
Mirror (warm-standby) – easy to setup,
reliable mirror (one or more)
Firebird master
Monitoring
Firebird replica
Monitoring
Replica
segment 1
Replica
segment 2
Replica
segment N
Based on async replication
Can be geographically distributed (FTP,
Amazon, cloud backup)
Firebird master
Monitoring
Firebird replica
Monitoring
Replica
segment 1
Replica
segment 2
Replica
segment N
Licenses
HQbird Enterprise licenses needed for
FAIL-SAFE CLUSTER
Master+replica + N*replicas*0.5
For minimal cluster = 1 master and 2 replicas:
1 HQbird Enterprise + 0.5*HQbird Enterprise
MIRROR (Warm-Standby) = Master+replica
1 HQbird Enterprise
Thank you!
Solution
Changes are buffered per transaction, transferred in
batches, synchronized at commit
Follows the master priority of locking
Replication errors can either interrupt operations or just
detach replica
Replica is available for read-only queries (with caveats)
Instant takeover (Heartbeat, Pacemaker)
• Issues
Additional CPU and I/O load on the slave side
Replica cannot be recreated online
Synchronous replication
Txn 1 Txn 2 Txn N
Buffer 1
Txn 1 Txn 2 Txn N
Master DB
Replica DB
Buffer 2 Buffer N
Synchronous replication
Solution
Changes are synchronously journalled on the master side
Journal better be placed on a separate disk
Journal consists of multiple segments (files) that are rotated
Filled segments are transfered to the slave and applied to
the replica in background
Replica can be recreated online
Issues
Replica always lags behind under load
Takeover is not immediate
Read-only queries work with historical data
Asynchronous replication
Txn 1 Txn 2 Txn N
Buffer 1
Master DB
Buffer 2 Buffer N
Journal
Archive
Asynchronous replication
Txn 1 Txn 2 Txn N
Replica DB
Archive
Asynchronous replication

Más contenido relacionado

La actualidad más candente

Linux field-update-2015
Linux field-update-2015Linux field-update-2015
Linux field-update-2015
Chris Simmonds
 

La actualidad más candente (20)

iostat await svctm の 見かた、考え方
iostat await svctm の 見かた、考え方iostat await svctm の 見かた、考え方
iostat await svctm の 見かた、考え方
 
Performance Wins with BPF: Getting Started
Performance Wins with BPF: Getting StartedPerformance Wins with BPF: Getting Started
Performance Wins with BPF: Getting Started
 
OSS-DB Silver ポイント解説セミナー ~SQL編~ (PostgreSQL9.0)
OSS-DB Silver ポイント解説セミナー ~SQL編~ (PostgreSQL9.0)OSS-DB Silver ポイント解説セミナー ~SQL編~ (PostgreSQL9.0)
OSS-DB Silver ポイント解説セミナー ~SQL編~ (PostgreSQL9.0)
 
HA環境構築のベスト・プラクティス
HA環境構築のベスト・プラクティスHA環境構築のベスト・プラクティス
HA環境構築のベスト・プラクティス
 
TripleO Deep Dive 1.1
TripleO Deep Dive 1.1TripleO Deep Dive 1.1
TripleO Deep Dive 1.1
 
Introduction to RCU
Introduction to RCUIntroduction to RCU
Introduction to RCU
 
2022 COSCUP - Let's speed up your PostgreSQL services!.pptx
2022 COSCUP - Let's speed up your PostgreSQL services!.pptx2022 COSCUP - Let's speed up your PostgreSQL services!.pptx
2022 COSCUP - Let's speed up your PostgreSQL services!.pptx
 
Linux BPF Superpowers
Linux BPF SuperpowersLinux BPF Superpowers
Linux BPF Superpowers
 
Apache Pulsarの概要と近況
Apache Pulsarの概要と近況Apache Pulsarの概要と近況
Apache Pulsarの概要と近況
 
[db tech showcase Tokyo 2014] B26: PostgreSQLを拡張してみよう by SRA OSS, Inc. 日本支社 高塚遥
[db tech showcase Tokyo 2014] B26: PostgreSQLを拡張してみよう  by SRA OSS, Inc. 日本支社 高塚遥[db tech showcase Tokyo 2014] B26: PostgreSQLを拡張してみよう  by SRA OSS, Inc. 日本支社 高塚遥
[db tech showcase Tokyo 2014] B26: PostgreSQLを拡張してみよう by SRA OSS, Inc. 日本支社 高塚遥
 
Raft
RaftRaft
Raft
 
libSQL
libSQLlibSQL
libSQL
 
Linux field-update-2015
Linux field-update-2015Linux field-update-2015
Linux field-update-2015
 
Linux kernel memory allocators
Linux kernel memory allocatorsLinux kernel memory allocators
Linux kernel memory allocators
 
Ceph
CephCeph
Ceph
 
MySQL InnoDB Cluster and Group Replication in a nutshell hands-on tutorial
MySQL InnoDB Cluster and Group Replication in a nutshell  hands-on tutorialMySQL InnoDB Cluster and Group Replication in a nutshell  hands-on tutorial
MySQL InnoDB Cluster and Group Replication in a nutshell hands-on tutorial
 
PostgreSQL Unconference #5 ICU Collation
PostgreSQL Unconference #5 ICU CollationPostgreSQL Unconference #5 ICU Collation
PostgreSQL Unconference #5 ICU Collation
 
The Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageThe Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast Storage
 
EmbulkとDigdagとデータ分析基盤と
EmbulkとDigdagとデータ分析基盤とEmbulkとDigdagとデータ分析基盤と
EmbulkとDigdagとデータ分析基盤と
 
大規模DCのネットワークデザイン
大規模DCのネットワークデザイン大規模DCのネットワークデザイン
大規模DCのネットワークデザイン
 

Similar a Fail-Safe Cluster for FirebirdSQL and something more

Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)
mundlapudi
 
Dueling duplications RMAN vs Delphix
Dueling duplications RMAN vs DelphixDueling duplications RMAN vs Delphix
Dueling duplications RMAN vs Delphix
Kyle Hailey
 
Oracle Performance On Linux X86 systems
Oracle  Performance On Linux  X86 systems Oracle  Performance On Linux  X86 systems
Oracle Performance On Linux X86 systems
Baruch Osoveskiy
 
Collaborate instant cloning_kyle
Collaborate instant cloning_kyleCollaborate instant cloning_kyle
Collaborate instant cloning_kyle
Kyle Hailey
 
The Secret Guide to Cloud Performance - Cloudlook
The Secret Guide to Cloud Performance - CloudlookThe Secret Guide to Cloud Performance - Cloudlook
The Secret Guide to Cloud Performance - Cloudlook
gidgreen
 

Similar a Fail-Safe Cluster for FirebirdSQL and something more (20)

Best practices for MySQL High Availability
Best practices for MySQL High AvailabilityBest practices for MySQL High Availability
Best practices for MySQL High Availability
 
Firebird database recovery and protection for enterprises and ISV
Firebird database recovery and protection for enterprises and ISVFirebird database recovery and protection for enterprises and ISV
Firebird database recovery and protection for enterprises and ISV
 
Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)
 
Dueling duplications RMAN vs Delphix
Dueling duplications RMAN vs DelphixDueling duplications RMAN vs Delphix
Dueling duplications RMAN vs Delphix
 
Oracle Performance On Linux X86 systems
Oracle  Performance On Linux  X86 systems Oracle  Performance On Linux  X86 systems
Oracle Performance On Linux X86 systems
 
Proact ExaGrid Seminar Presentation KK 20220419.pdf
Proact ExaGrid Seminar Presentation KK 20220419.pdfProact ExaGrid Seminar Presentation KK 20220419.pdf
Proact ExaGrid Seminar Presentation KK 20220419.pdf
 
Apache Performance Tuning: Scaling Out
Apache Performance Tuning: Scaling OutApache Performance Tuning: Scaling Out
Apache Performance Tuning: Scaling Out
 
Preventing and Resolving MySQL Downtime
Preventing and Resolving MySQL DowntimePreventing and Resolving MySQL Downtime
Preventing and Resolving MySQL Downtime
 
Tuning Linux Windows and Firebird for Heavy Workload
Tuning Linux Windows and Firebird for Heavy WorkloadTuning Linux Windows and Firebird for Heavy Workload
Tuning Linux Windows and Firebird for Heavy Workload
 
Replication, Durability, and Disaster Recovery
Replication, Durability, and Disaster RecoveryReplication, Durability, and Disaster Recovery
Replication, Durability, and Disaster Recovery
 
Best practices for MySQL/MariaDB Server/Percona Server High Availability
Best practices for MySQL/MariaDB Server/Percona Server High AvailabilityBest practices for MySQL/MariaDB Server/Percona Server High Availability
Best practices for MySQL/MariaDB Server/Percona Server High Availability
 
Backing up Wikipedia Databases
Backing up Wikipedia DatabasesBacking up Wikipedia Databases
Backing up Wikipedia Databases
 
les12.pdf
les12.pdfles12.pdf
les12.pdf
 
Collaborate instant cloning_kyle
Collaborate instant cloning_kyleCollaborate instant cloning_kyle
Collaborate instant cloning_kyle
 
A Backup Today Saves Tomorrow
A Backup Today Saves TomorrowA Backup Today Saves Tomorrow
A Backup Today Saves Tomorrow
 
Lock, Stock and Backup: Data Guaranteed
Lock, Stock and Backup: Data GuaranteedLock, Stock and Backup: Data Guaranteed
Lock, Stock and Backup: Data Guaranteed
 
Loadays MySQL
Loadays MySQLLoadays MySQL
Loadays MySQL
 
The Secret Guide to Cloud Performance - Cloudlook
The Secret Guide to Cloud Performance - CloudlookThe Secret Guide to Cloud Performance - Cloudlook
The Secret Guide to Cloud Performance - Cloudlook
 
Resolving Firebird performance problems
Resolving Firebird performance problemsResolving Firebird performance problems
Resolving Firebird performance problems
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout Session
 

Más de Alexey Kovyazin

Firebird Anti-Corruption Approach
Firebird Anti-Corruption ApproachFirebird Anti-Corruption Approach
Firebird Anti-Corruption Approach
Alexey Kovyazin
 

Más de Alexey Kovyazin (20)

High-load performance testing: Firebird 2.5, 3.0, 4.0
High-load performance testing:  Firebird 2.5, 3.0, 4.0High-load performance testing:  Firebird 2.5, 3.0, 4.0
High-load performance testing: Firebird 2.5, 3.0, 4.0
 
Новые возможности языка SQL в Firebird 3.0
Новые возможности языка SQL в Firebird 3.0Новые возможности языка SQL в Firebird 3.0
Новые возможности языка SQL в Firebird 3.0
 
Firebird recovery tools and techniques by IBSurgeon
Firebird recovery tools and techniques by IBSurgeonFirebird recovery tools and techniques by IBSurgeon
Firebird recovery tools and techniques by IBSurgeon
 
How Firebird transactions work
How Firebird transactions workHow Firebird transactions work
How Firebird transactions work
 
Life with big Firebird databases
Life with big Firebird databasesLife with big Firebird databases
Life with big Firebird databases
 
Professional tools for Firebird optimization and maintenance from IBSurgeon
Professional tools for Firebird optimization and maintenance from IBSurgeonProfessional tools for Firebird optimization and maintenance from IBSurgeon
Professional tools for Firebird optimization and maintenance from IBSurgeon
 
Firebird migration: from Firebird 1.5 to Firebird 2.5
Firebird migration: from Firebird 1.5 to Firebird 2.5Firebird migration: from Firebird 1.5 to Firebird 2.5
Firebird migration: from Firebird 1.5 to Firebird 2.5
 
Firebird migration: from Firebird 1.5 to Firebird 2.5
Firebird migration: from Firebird 1.5 to Firebird 2.5Firebird migration: from Firebird 1.5 to Firebird 2.5
Firebird migration: from Firebird 1.5 to Firebird 2.5
 
Firebird Anti-Corruption Approach
Firebird Anti-Corruption ApproachFirebird Anti-Corruption Approach
Firebird Anti-Corruption Approach
 
Firebird's Big Databases (in English)
Firebird's Big Databases (in English)Firebird's Big Databases (in English)
Firebird's Big Databases (in English)
 
Firebird Dataguard (Russian)
Firebird Dataguard (Russian)Firebird Dataguard (Russian)
Firebird Dataguard (Russian)
 
Решения на базе СУБД Firebird в крупных компаниях и государственных учреждени...
Решения на базе СУБД Firebird в крупных компаниях и государственных учреждени...Решения на базе СУБД Firebird в крупных компаниях и государственных учреждени...
Решения на базе СУБД Firebird в крупных компаниях и государственных учреждени...
 
Firebird DataGuard - Еще раз об уверенности в завтрашнем дне
Firebird DataGuard -  Еще раз об уверенности в завтрашнем днеFirebird DataGuard -  Еще раз об уверенности в завтрашнем дне
Firebird DataGuard - Еще раз об уверенности в завтрашнем дне
 
Firebird usage promo draft
Firebird usage promo draftFirebird usage promo draft
Firebird usage promo draft
 
FBScanner: IBSurgeon's tool to solve all types of performance problems with F...
FBScanner: IBSurgeon's tool to solve all types of performance problems with F...FBScanner: IBSurgeon's tool to solve all types of performance problems with F...
FBScanner: IBSurgeon's tool to solve all types of performance problems with F...
 
Firebird 2.5 - вектор дальнейшего развития, Dmitry Yemanov, (in Russian)
Firebird 2.5 - вектор дальнейшего развития, Dmitry Yemanov, (in Russian)Firebird 2.5 - вектор дальнейшего развития, Dmitry Yemanov, (in Russian)
Firebird 2.5 - вектор дальнейшего развития, Dmitry Yemanov, (in Russian)
 
Understandung Firebird optimizer, by Dmitry Yemanov (in English)
Understandung Firebird optimizer, by Dmitry Yemanov (in English)Understandung Firebird optimizer, by Dmitry Yemanov (in English)
Understandung Firebird optimizer, by Dmitry Yemanov (in English)
 
Firebird: cost-based optimization and statistics, by Dmitry Yemanov (in English)
Firebird: cost-based optimization and statistics, by Dmitry Yemanov (in English)Firebird: cost-based optimization and statistics, by Dmitry Yemanov (in English)
Firebird: cost-based optimization and statistics, by Dmitry Yemanov (in English)
 
СУБД Firebird: Краткий обзор, Дмитрий Еманов (in Russian)
СУБД Firebird: Краткий обзор, Дмитрий Еманов (in Russian)СУБД Firebird: Краткий обзор, Дмитрий Еманов (in Russian)
СУБД Firebird: Краткий обзор, Дмитрий Еманов (in Russian)
 
Open Source: взгляд изнутри, Дмитрий Еманов (The Firebird Project) (in Russian)
Open Source: взгляд изнутри, Дмитрий Еманов (The Firebird Project) (in Russian)Open Source: взгляд изнутри, Дмитрий Еманов (The Firebird Project) (in Russian)
Open Source: взгляд изнутри, Дмитрий Еманов (The Firebird Project) (in Russian)
 

Último

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Último (20)

Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 

Fail-Safe Cluster for FirebirdSQL and something more

  • 1. Fail-Safe Cluster for FirebirdSQL and something more Alex Koviazin, IBSurgeon www.ib-aid.com
  • 2. • Replication, Recovery and Optimization Firebird since 2002 • In Brazil since 2006 (Firebase) • Platinum Sponsor of Firebird Foundation • Based in Moscow, Russia www.ib-aid.com
  • 3. Agenda • In general - what is fail-safe cluster? • What we need to build fail-safe cluster? • Overview of Firebird versions with high-availability • Trigger-based and native replication as basis for fail-safe cluster • How HQbird Enterprise implements fail-safe cluster • Alternatives of fail-safe cluster: Mirror/Warm Standby • Special offer for FDD visitors
  • 4. What is a fail-safe cluster? User 1 User 2 User 3 User N User N
  • 5. What is a fail-safe cluster internally? User 1 User 2 User 3 User N User N Firebird master Firebird replica
  • 6. What is a fail-safe cluster? User 1 User 2 User 3 User N User N Firebird master Firebird replica Synchronization
  • 7. When master fails, users reconnect to new master User 1 User 2 User 3 User N User N Firebird master Firebird replica Synchronization
  • 8. For Fail-Safe Cluster we need 1. At Server Side • Synchronization of master and replica • Monitoring and switching mechanism 2. At Application Side • Reconnection cycle in case of disconnects • web is good example
  • 9. Database fail-safe cluster is NOT: 1) Automatically deployed solution • Complex setup (2-3 hours of work) 2) True scalable solution 3) Multi-master replication • Replicas are read only 4) Geographically distributed solution • Fast network connection required
  • 10. How Fail-Safe Cluster Work: Applications From the application point of view 1. End-user applications connect to the Firebird database, work as usual 2. If Firebird database becomes unavailable, end-user applications try to reconnect to the same host (correctly, no error screens) 3. End-user applications should better use caching (CachedUpdates) to avoid loss of uncommitted data
  • 11. How Fail-Safe Cluster Work: Server Firebird master Monitoring Firebird replica Monitoring 1.All nodes monitor each other: replica watches for master 2.If master fails, replica a) make “confirmation shoot” to master b) Changes itself as master – script + DNS
  • 12. After replica fails, there is no cluster anymore! Just a single database!
  • 13. Minimal fail-safe cluster Firebird master Monitoring Firebird replica 1 Monitoring Firebird replica 2 Monitoring
  • 14. Minimal fail-safe cluster Firebird master Monitoring Firebird master 2 Monitoring Firebird replica 2 Monitoring
  • 15. What is fail-safe cluster good for? • For web-applications • Web caches data intensively • Tolerant to reconnects • Many small queries • For complex database systems with administrator support
  • 16. HOW TO CREATE FAIL-SAFE CLUSTER IN FIREBIRD
  • 17. 2 main things to implement fail-safe cluster in Firebird 1. Synchronization between master and replica 2. Monitoring and switching to the new server
  • 18. Synchronization Native • Suitable for high-load • Easy setup • Strict master-slave, no conflicts • DDL replication • Very low delay in synchronization (seconds) Trigger-based • <200 users • Difficult setup • Mixed mode, conflicts are possible • No DDL replication • Delay is 1-2 minutes
  • 19. Comparison of Firebird versions with replication HQbird RedDatabase Avalerion Firebird Status (July 16) Production Production Beta Alpha Supported Firebird version 2.5, 30 2.5 3.0 4.0 ODS 11.2, 12.0 11.3 ? 13.0 Compatibility with Firebird 100% Backup - Restore ? 100% Synchronous Yes Yes No Yes Asynchronous Yes Yes Yes Yes Price per master+replica R$1990 R$1100 (FDD only!) ~US$3500 ? Free (in 2018) http://ib-aid.com/fdd2016
  • 20. Native replication in HQbird Firebird master Firebird replica 1. No Backup/restore needed 2. Sync and Async options 3. Easy setup (configuration only) Sync: ~1-3 seconds Async: 1 minute
  • 21. Steps to setup replication SYNCHRONOUS 1. Stop Firebird 2. Copy database to replica(s) 3. Configure replication at master 4. Configure replication at replica 5. Start Firebird at master and replica ASYNC 1. Stop Firebird 2. Copy database to the same computer 3. Configure replication at master 4. Start Firebird 5. Copy and configure at replica(s)
  • 22. Configuration for replication Replication.conf Sync <database c:datatest2.fdb> replica_database sysdba:masterkey@replicaserver:/data/testrepl2.fdb </database> Async <database C:dbwmaster.fdb> log_directory d:dblogs log_archive_directory D:dbarchlogs log_archive_command "copy $(logpathname) $(archpathname)" </database>
  • 23. Downtime to setup replication • Synchronous • Time to copy database between servers + Time to setup • Asynchronous • Time to copy database at the same server • Suitable for geographically distributed systems ASYNC is much faster to setup!
  • 24. Monitoring and switching from master to replica • HQbird (Windows, Linux) • Pacemaker (Linux only) • Other tools (self-developed)
  • 25. Elections and Impeachment • Not about presidents elections  • Replicas must select new master after old master failure Priority list MASTER1 REPLICA1 REPLICA2 REPLICA3 REPLICA4
  • 26. Algorithm 1. We have priority list: master, replica1, replica 2 2. Each node watches for all other nodes 3. If master node fails, the first available replica takes over the 1. Disables original master – stop Firebird server 2. Changes DNS of replica to master 3. Changes other servers configurations 4. Restart Firebird server at new master to changes take effect 5. Send notifications
  • 27. After failure – re-initialization • Downtime needed! • Stop all nodes • Copy database to the new/reinitialized node • Reconfigure replication configuration • Start Firebird at all nodes Too complex?
  • 28. Fail-Safe Cluster is a custom solution • HQbird gives options, you decide how to implement them • Requires careful planning and implementation • Requires high-quality infrastructure • Requires 2+ replicas • Network outage in 10 seconds will cause all replicas to degrade! • High-speed network and high speed drives
  • 29. Do we really need cluster and high availability?!
  • 30. Why we REALLY need high-availability • If database size is big (20-100Gb) • Automatic recovery with FirstAID is very slow (downtime > 8 hours) or impossible (>100Gb) • Manual recovery is expensive (US$1000+) and slow (1 day+) • Big database = Big company • Losses are not tolerated • Downtime is very expensive ($$$)
  • 31. Mirror (warm-standby) – easy to setup, reliable mirror (one or more) Firebird master Monitoring Firebird replica Monitoring Replica segment 1 Replica segment 2 Replica segment N Based on async replication
  • 32. Can be geographically distributed (FTP, Amazon, cloud backup) Firebird master Monitoring Firebird replica Monitoring Replica segment 1 Replica segment 2 Replica segment N
  • 33. Licenses HQbird Enterprise licenses needed for FAIL-SAFE CLUSTER Master+replica + N*replicas*0.5 For minimal cluster = 1 master and 2 replicas: 1 HQbird Enterprise + 0.5*HQbird Enterprise MIRROR (Warm-Standby) = Master+replica 1 HQbird Enterprise
  • 35. Solution Changes are buffered per transaction, transferred in batches, synchronized at commit Follows the master priority of locking Replication errors can either interrupt operations or just detach replica Replica is available for read-only queries (with caveats) Instant takeover (Heartbeat, Pacemaker) • Issues Additional CPU and I/O load on the slave side Replica cannot be recreated online Synchronous replication
  • 36. Txn 1 Txn 2 Txn N Buffer 1 Txn 1 Txn 2 Txn N Master DB Replica DB Buffer 2 Buffer N Synchronous replication
  • 37. Solution Changes are synchronously journalled on the master side Journal better be placed on a separate disk Journal consists of multiple segments (files) that are rotated Filled segments are transfered to the slave and applied to the replica in background Replica can be recreated online Issues Replica always lags behind under load Takeover is not immediate Read-only queries work with historical data Asynchronous replication
  • 38. Txn 1 Txn 2 Txn N Buffer 1 Master DB Buffer 2 Buffer N Journal Archive Asynchronous replication
  • 39. Txn 1 Txn 2 Txn N Replica DB Archive Asynchronous replication