MariaDB Maxscale
Switchover, Failover and Rejoin
Wagner Bianchi
Remote DBA Team Lead @ MariaDB RDBA Team
Esa Korhonen
Software Engineer @ MariaDB Maxscale Engineering Team
Introduction to MariaDB MaxScale
● Intelligent database proxy:
○ Separates client applications from the backend servers
○ Understands authentication, queries and backend roles
○ Typical use cases: read-write splitting, load balancing
○ Many plugins: query filtering, logging, caching
● Latest GA version: 2.2
[Diagram: client → MaxScale → database servers]
Query processing stages
[Diagram: client protocol → filter chain → router → backend protocol; the parser is used during query processing, and a monitor updates the server state the router relies on]
What is new in MariaDB-Monitor for MaxScale 2.2*
● Support for replication cluster manipulation: failover, switchover, rejoin
○ failover: replace a failed master with a slave
○ switchover: swap a slave with a live master
○ rejoin: bring a standalone server back to the cluster, or redirect slaves replicating from the wrong master
● Failover & rejoin can be set to activate automatically
● Reduces need for custom scripts or replication management tools
● Supported topologies: 1 Master, N slaves, 1-level depth
● Limited support for external masters
* Note: Renamed from previous mysqlmon
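As a sketch of where these features live in the configuration, a mariadbmon monitor section might look like the following. This is illustrative only: the server names, user and passwords are placeholders, and the exact option set should be checked against the MaxScale 2.2 documentation.

```ini
# maxscale.cnf — monitor section (values are placeholders)
[MariaDB-Monitor]
type=monitor
module=mariadbmon
servers=LocalMaster1,LocalSlave1,LocalSlave2
user=maxscale_monitor
password=monitor_pw
# credentials used when the monitor issues CHANGE MASTER TO
replication_user=repl
replication_password=repl_pw
# cluster manipulation features covered in these slides
auto_failover=true
auto_rejoin=true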
Switchover
● Controlled swap of the master with a designated slave
● The monitor user must have the SUPER privilege
● Depends on read_only to freeze the cluster
○ SUPER users bypass read_only
● Waits for all slaves to catch up with the master
○ no data should be lost, but this can be slow
● Configuration settings:
○ replication_user & replication_password
○ switchover_timeout
$./maxctrl list servers
┌──────────────┬───────────┬──────┬─────────────┬─────────────────┐
│ Server │ Address │ Port │ Connections │ State │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalMaster1 │ 127.0.0.1 │ 3001 │ 0 │ Master, Running │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave1 │ 127.0.0.1 │ 3002 │ 0 │ Slave, Running │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave2 │ 127.0.0.1 │ 3003 │ 0 │ Slave, Running │
└──────────────┴───────────┴──────┴─────────────┴─────────────────┘
$./maxctrl call command mariadbmon switchover MariaDB-Monitor LocalSlave1
OK
$./maxctrl list servers
┌──────────────┬───────────┬──────┬─────────────┬─────────────────┐
│ Server │ Address │ Port │ Connections │ State │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalMaster1 │ 127.0.0.1 │ 3001 │ 0 │ Slave, Running │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave1 │ 127.0.0.1 │ 3002 │ 0 │ Master, Running │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave2 │ 127.0.0.1 │ 3003 │ 0 │ Slave, Running │
└──────────────┴───────────┴──────┴─────────────┴─────────────────┘
Failover
● Promotes a slave to take the place of the failed master
● The damage has already been done, so there is no need to worry about the old master
● Chooses the new master based on the following criteria (in order of importance):
○ not in the exclusion list
○ has the latest event in its relay log
○ has processed the latest event
○ has log_slave_updates on
● Data committed on the failed master may be lost
○ mitigate with (semi)sync replication
● Configuration:
○ failover_timeout
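To reduce the window for losing committed transactions with the failed master, the slide points at (semi)synchronous replication. A minimal my.cnf sketch, assuming a MariaDB version where the semisync plugins are available and loaded; the timeout value is a placeholder:

```ini
# my.cnf on the master (assumes the semisync master plugin is installed)
rpl_semi_sync_master_enabled = ON
rpl_semi_sync_master_timeout = 10000   # ms to wait for a slave ACK before falling back to async

# my.cnf on each slave (assumes the semisync slave plugin is installed)
rpl_semi_sync_slave_enabled = ON
```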
$./maxctrl list servers
┌──────────────┬───────────┬──────┬─────────────┬────────────────┐
│ Server │ Address │ Port │ Connections │ State │
├──────────────┼───────────┼──────┼─────────────┼────────────────┤
│ LocalMaster1 │ 127.0.0.1 │ 3001 │ 0 │ Down │
├──────────────┼───────────┼──────┼─────────────┼────────────────┤
│ LocalSlave1 │ 127.0.0.1 │ 3002 │ 0 │ Slave, Running │
├──────────────┼───────────┼──────┼─────────────┼────────────────┤
│ LocalSlave2 │ 127.0.0.1 │ 3003 │ 0 │ Slave, Running │
└──────────────┴───────────┴──────┴─────────────┴────────────────┘
$./maxctrl call command mariadbmon failover MariaDB-Monitor
OK
$./maxctrl list servers
┌──────────────┬───────────┬──────┬─────────────┬─────────────────┐
│ Server │ Address │ Port │ Connections │ State │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalMaster1 │ 127.0.0.1 │ 3001 │ 0 │ Down │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave1 │ 127.0.0.1 │ 3002 │ 0 │ Master, Running │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave2 │ 127.0.0.1 │ 3003 │ 0 │ Slave, Running │
└──────────────┴───────────┴──────┴─────────────┴─────────────────┘
Automatic failover
● Trigger: the master must be down for a set amount of time
● An additional check is made by looking at the slaves' replication connections
● Configuration settings:
○ auto_failover
○ failcount & monitor_interval
○ verify_master_failure & master_failure_timeout
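These detection settings interact: the master must stay down for failcount consecutive monitor passes, each monitor_interval milliseconds apart, before failover starts. A sketch with placeholder values matching the "4 monitor passes" behaviour shown in the log excerpts later in this deck:

```ini
# Inside the [MariaDB-Monitor] section (values are illustrative)
auto_failover=true
monitor_interval=1000        # ms between monitor passes
failcount=4                  # passes the master must be down: 4 x 1000 ms = 4 s
verify_master_failure=true   # also inspect slave connections before acting
master_failure_timeout=10    # consult the 2.2 docs for the exact unit and default
```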
$./maxctrl list servers
┌──────────────┬───────────┬──────┬─────────────┬─────────────────┐
│ Server │ Address │ Port │ Connections │ State │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalMaster1 │ 127.0.0.1 │ 3001 │ 0 │ Master, Running │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave1 │ 127.0.0.1 │ 3002 │ 0 │ Slave, Running │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave2 │ 127.0.0.1 │ 3003 │ 0 │ Slave, Running │
└──────────────┴───────────┴──────┴─────────────┴─────────────────┘
$docker stop maxscalebackends_testing1_master1_1
$./maxctrl list servers
┌──────────────┬───────────┬──────┬─────────────┬────────────────┐
│ Server │ Address │ Port │ Connections │ State │
├──────────────┼───────────┼──────┼─────────────┼────────────────┤
│ LocalMaster1 │ 127.0.0.1 │ 3001 │ 0 │ Down │
├──────────────┼───────────┼──────┼─────────────┼────────────────┤
│ LocalSlave1 │ 127.0.0.1 │ 3002 │ 0 │ Slave, Running │
├──────────────┼───────────┼──────┼─────────────┼────────────────┤
│ LocalSlave2 │ 127.0.0.1 │ 3003 │ 0 │ Slave, Running │
└──────────────┴───────────┴──────┴─────────────┴────────────────┘
$./maxctrl list servers
┌──────────────┬───────────┬──────┬─────────────┬─────────────────┐
│ Server │ Address │ Port │ Connections │ State │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalMaster1 │ 127.0.0.1 │ 3001 │ 0 │ Down │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave1 │ 127.0.0.1 │ 3002 │ 0 │ Master, Running │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave2 │ 127.0.0.1 │ 3003 │ 0 │ Slave, Running │
└──────────────┴───────────┴──────┴─────────────┴─────────────────┘
Rejoin
● Directs the joining server to replicate from the cluster master
○ redirects a slave replicating from the wrong master
○ starts replication on a standalone server
● Looks at GTIDs to decide whether the joining server can replicate
● Manual/automatic mode (auto_rejoin=1)
● Typical use case: master goes down -> failover -> old master comes back -> rejoined to the cluster
$./maxctrl list servers
┌──────────────┬───────────┬──────┬─────────────┬─────────────────┐
│ Server │ Address │ Port │ Connections │ State │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalMaster1 │ 127.0.0.1 │ 3001 │ 0 │ Down │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave1 │ 127.0.0.1 │ 3002 │ 0 │ Master, Running │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave2 │ 127.0.0.1 │ 3003 │ 0 │ Slave, Running │
└──────────────┴───────────┴──────┴─────────────┴─────────────────┘
$docker start maxscalebackends_testing1_master1_1
$./maxctrl list servers
┌──────────────┬───────────┬──────┬─────────────┬─────────────────┐
│ Server │ Address │ Port │ Connections │ State │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalMaster1 │ 127.0.0.1 │ 3001 │ 0 │ Running │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave1 │ 127.0.0.1 │ 3002 │ 0 │ Master, Running │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave2 │ 127.0.0.1 │ 3003 │ 0 │ Slave, Running │
└──────────────┴───────────┴──────┴─────────────┴─────────────────┘
$./maxctrl call command mariadbmon rejoin MariaDB-Monitor LocalMaster1
$./maxctrl list servers
┌──────────────┬───────────┬──────┬─────────────┬─────────────────┐
│ Server │ Address │ Port │ Connections │ State │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalMaster1 │ 127.0.0.1 │ 3001 │ 0 │ Slave, Running │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave1 │ 127.0.0.1 │ 3002 │ 0 │ Master, Running │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave2 │ 127.0.0.1 │ 3003 │ 0 │ Slave, Running │
└──────────────┴───────────┴──────┴─────────────┴─────────────────┘
External master handling
[Diagram: two data centers, DC A and DC B, showing which cluster is replicating from which before and after a master change]
Switchover details
Starting checks:
1. Cluster has 1 master and >1 slaves
2. All servers use GTID replication and the cluster GTID domain is known
3. The requested new master has binary logging enabled
Prepare the current master:
1. SET GLOBAL read_only=1;
2. FLUSH TABLES;
3. FLUSH LOGS;
4. Update GTID info
Wait until all slaves catch up to the master:
1. MASTER_GTID_WAIT()
Stop slave replication on the new master:
1. STOP SLAVE;
2. RESET SLAVE ALL;
3. SET GLOBAL read_only=0;
Redirect slaves & the old master to the new master:
1. STOP SLAVE;
2. RESET SLAVE;
3. CHANGE MASTER TO …
4. START SLAVE;
Check that replication is working:
1. FLUSH TABLES;
2. Check that all slaves receive the new GTID
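The steps above can be sketched as the SQL the monitor issues on the various servers. The GTID, host and credentials below are placeholders; in practice mariadbmon runs all of this for you:

```sql
-- On the current master: freeze writes and flush.
SET GLOBAL read_only=1;
FLUSH TABLES;
FLUSH LOGS;

-- On each slave: wait until it has caught up (placeholder GTID, 90 s timeout).
SELECT MASTER_GTID_WAIT('0-1-100', 90);

-- On the new master: stop replicating and allow writes.
STOP SLAVE;
RESET SLAVE ALL;
SET GLOBAL read_only=0;

-- On every remaining slave and on the old master: point at the new master.
STOP SLAVE;
RESET SLAVE;
CHANGE MASTER TO MASTER_HOST='new-master.example.com', MASTER_PORT=3306,
    MASTER_USER='repl', MASTER_PASSWORD='repl_pw',
    MASTER_USE_GTID=slave_pos;
START SLAVE;
```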
DEMO TIME!!
Maxscale 2.2 New Features
● At this point you know that MariaDB MaxScale is able to:
○ perform automatic/manual failover;
○ perform manual switchover;
○ rejoin a crashed node as a slave of an existing cluster;
● These processes rely on the new MariaDB-Monitor (mariadbmon);
● Hidden details when implementing and/or troubleshooting:
○ For switchover/failover/rejoin to work, the monitor user needs access on all the servers, or a separate user must be configured via replication_user and replication_password with access on all the servers;
○ If the monitor user has an encrypted password, replication_password should be encrypted as well; otherwise the CHANGE MASTER TO executed by these processes won't be able to configure replication on the new server;
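As a sketch of the user setup this implies, something like the following would be created on every server in the cluster. The user name, password and privilege list are hypothetical; the exact privileges required are documented in the MaxScale manual:

```sql
-- Hypothetical monitor/replication user, created on every server.
-- SUPER is needed because switchover toggles read_only (see the Switchover slide).
CREATE USER 'maxscale_monitor'@'%' IDENTIFIED BY 'monitor_pw';
GRANT SUPER, RELOAD, PROCESS, REPLICATION CLIENT, REPLICATION SLAVE
    ON *.* TO 'maxscale_monitor'@'%';
```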
Maxscale 2.2 New Features
● Failover: replacing a failed master.
● For automatic failover, the auto_failover variable should be set to true in the monitor configuration;
○ auto_failover=true activates automatic failover;
● For manual failover, auto_failover should be set to false in the monitor configuration;
● The master must be down for manual failover to work;
○ with auto_failover=false, failover can be triggered manually:
● Enable and disable auto_failover with the alter monitor command.
[root@box01 ~]# maxadmin call command mariadbmon failover replication-cluster-monitor
Maxscale 2.2 New Features
● Failover: replacing a failed master (automatic, auto_failover=true)
#: checking current configurations
[root@box01 ~]# grep auto_failover /var/lib/maxscale/maxscale.cnf.d/replication-cluster-monitor.cnf
auto_failover=true
#: shut down the current master - check the current topology with `maxadmin list servers` first to confirm it
[root@box02 ~]# systemctl stop mariadb.service
#: watching the actions on the log file
2018-02-10 13:51:02 error : Monitor was unable to connect to server [192.168.50.13]:3306 : "Can't connect to MySQL server on '192.168.50.13'"
2018-02-10 13:51:02 notice : [mariadbmon] Server [192.168.50.13]:3306 lost the master status.
2018-02-10 13:51:02 notice : Server changed state: box03[192.168.50.13:3306]: master_down. [Master, Running] -> [Down]
2018-02-10 13:51:02 warning: [mariadbmon] Master has failed. If master status does not change in 4 monitor passes, failover begins.
2018-02-10 13:51:06 notice : [mariadbmon] Performing automatic failover to replace failed master 'box03'.
2018-02-10 13:51:06 notice : [mariadbmon] Promoting server 'box02' to master.
2018-02-10 13:51:06 notice : [mariadbmon] Redirecting slaves to new master.
2018-02-10 13:51:07 warning: [mariadbmon] Setting standalone master, server 'box02' is now the master.
2018-02-10 13:51:07 notice : Server changed state: box02[192.168.50.12:3306]: new_master. [Slave, Running] -> [Master, Running]
Maxscale 2.2 New Features
● Failover: replacing a failed master (manual, auto_failover=false)
#: setting auto_failover=false
[root@box01 ~]# maxadmin alter monitor replication-cluster-monitor auto_failover=false
#: current master is down, automatic failover deactivated
2018-02-09 23:31:01 error : Monitor was unable to connect to server [192.168.50.12]:3306:"Can't connect to MySQL server on '192.168.50.12'"
2018-02-09 23:31:01 notice : [mariadbmon] Server [192.168.50.12]:3306 lost the master status.
2018-02-09 23:31:01 notice : Server changed state: box02[192.168.50.12:3306]: master_down. [Master, Running] -> [Down]
#: manual failover executed
[root@box01 ~]# maxadmin call command mariadbmon failover replication-cluster-monitor
#: let's check the logs
2018-02-09 23:32:30 info : (17) [cli] MaxAdmin: call command "mariadbmon" "failover" "replication-cluster-monitor"
2018-02-09 23:32:30 notice : (17) [mariadbmon] Stopped monitor replication-cluster-monitor for the duration of failover.
2018-02-09 23:32:30 notice : (17) [mariadbmon] Promoting server 'box03' to master.
2018-02-09 23:32:30 notice : (17) [mariadbmon] Redirecting slaves to new master.
2018-02-09 23:32:30 notice : (17) [mariadbmon] Failover performed.
2018-02-09 23:32:30 warning: [mariadbmon] Setting standalone master, server 'box03' is now the master.
2018-02-09 23:32:30 notice : Server changed state: box03[192.168.50.13:3306]: new_master. [Slave, Running] -> [Master, Running]
Maxscale 2.2 New Features
● Failover: replacing a failed master, additional details
● The pass interval is based on the monitor's monitor_interval value;
○ As it is set to 1000 ms (1 second) here, failover is triggered after 4 seconds, counting the first pass from the moment the monitor logged the first warning;
○ If the failover process does not complete within failover_timeout (90 seconds by default), the failover is canceled and the feature is disabled;
○ To enable failover again (after checking for possible problems), use the alter monitor command:
2018-02-10 13:51:02 warning: [mariadbmon] Master has failed.If master status does not change in 4 monitor passes, failover begins.
[root@box01 ~]# maxadmin alter monitor replication-cluster-monitor auto_failover=true
Maxscale 2.2 New Features
● Switchover: swapping a slave with a running master.
● The switchover process relies on the replication_user and replication_password settings added to the monitor configuration;
● The process is triggered manually and should take at most switchover_timeout seconds to complete (90 seconds by default);
● If the process fails, an error is logged and auto_failover is disabled if it was enabled;
[root@team01-box01 ~]# maxadmin call command mariadbmon switchover replication-cluster-monitor new_master master
Maxscale 2.2 New Features
#: checking the current server's list
[root@team01-box01 ~]# maxadmin list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server | Address | Port | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
box02 | 10.132.116.147 | 3306 | 0 | Slave, Running
box03 | 10.132.116.161 | 3306 | 0 | Master, Running
-------------------+-----------------+-------+-------------+--------------------
#: new_master=box02, current_master=box03
[root@team01-box01 ~]# maxadmin call command mariadbmon switchover replication-cluster-monitor box02 box03
#: checking logs
2018-02-14 16:44:46 info : (712) [cli] MaxAdmin: call command "mariadbmon" "switchover" "replication-cluster-monitor" "box02" "box03"
2018-02-14 16:44:46 notice : (712) [mariadbmon] Stopped the monitor replication-cluster-monitor for the duration of switchover.
2018-02-14 16:44:46 notice : (712) [mariadbmon] Demoting server 'box03'.
2018-02-14 16:44:46 notice : (712) [mariadbmon] Promoting server 'box02' to master.
2018-02-14 16:44:46 notice : (712) [mariadbmon] Old master 'box03' starting replication from 'box02'.
2018-02-14 16:44:46 notice : (712) [mariadbmon] Redirecting slaves to new master.
2018-02-14 16:44:47 notice : (712) [mariadbmon] Switchover box03 -> box02 performed.
2018-02-14 16:44:47 notice : Server changed state: box02[10.132.116.147:3306]: new_master. [Slave, Running] -> [Master, Slave, Running]
2018-02-14 16:44:47 notice : Server changed state: box03[10.132.116.161:3306]: new_slave. [Master, Running] -> [Slave, Running]
2018-02-14 16:44:48 notice : Server changed state: box02[10.132.116.147:3306]: new_master. [Master, Slave, Running] -> [Master, Running]
Maxscale 2.2 New Features
● Rejoin: joining a standalone server to the cluster.
● Enables automatically joining a server back to the cluster when a crashed backend server comes back online;
● When auto_rejoin is enabled, the monitor attempts to direct standalone servers, and servers replicating from a relay master, to the main cluster master;
● Test it as we did:
○ Check which server is the current master, then shut down its MariaDB Server;
○ Failover will happen if auto_failover is enabled;
○ Start the MariaDB Server that was shut down;
○ List the servers again with MaxAdmin and watch the logs.
Maxscale 2.2 New Features
● Rejoin: joining a standalone server to the cluster.
#: current_master=box02
[root@team01-box02 ~]# mysqladmin shutdown
#: watching logs, the failover will happen as the master "crashed"
2018-02-14 18:44:36 error : Monitor was unable to connect to server [10.132.116.147]:3306 : "Can't connect to MySQL server on '10.132.116.147' (115)"
2018-02-14 18:44:36 notice : [mariadbmon] Server [10.132.116.147]:3306 lost the master status.
2018-02-14 18:44:36 notice : Server changed state: box02[10.132.116.147:3306]: master_down. [Master, Running] -> [Down]
2018-02-14 18:44:36 warning: [mariadbmon] Master has failed. If master status does not change in 4 monitor passes, failover begins.
2018-02-14 18:44:40 notice : [mariadbmon] Performing automatic failover to replace failed master 'box02'.
2018-02-14 18:44:40 notice : [mariadbmon] Promoting server 'box03' to master.
2018-02-14 18:44:40 notice : [mariadbmon] Redirecting slaves to new master.
2018-02-14 18:44:41 warning: [mariadbmon] Setting standalone master, server 'box03' is now the master.
2018-02-14 18:44:41 notice : Server changed state: box03[10.132.116.161:3306]: new_master. [Slave, Running] -> [Master, Running]
#: starting old master back
[root@team01-box02 ~]# systemctl start mariadb.service
#: watching logs
2018-02-14 18:47:27 notice : Server changed state: box02[10.132.116.147:3306]: server_up. [Down] -> [Running]
2018-02-14 18:47:27 notice : [mariadbmon] Directing standalone server 'box02' to replicate from 'box03'.
2018-02-14 18:47:27 notice : [mariadbmon] 1 server(s) redirected or rejoined the cluster.
2018-02-14 18:47:28 notice : Server changed state: box02[10.132.116.147:3306]: new_slave. [Running] -> [Slave, Running]
Thank you!
Time for questions and answers

 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded data
 
Rithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdfRithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdf
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis Project
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptx
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 

M|18 Technical Introduction: Automatic Failover

  • 6. Failover
● Promote a slave to take the place of a failed master
● Damage has already been done, so no need to worry about the old master
● Chooses a new master based on the following criteria (in order of importance):
○ not in exclusion-list
○ has latest event in relay log
○ has processed latest event
○ has log_slave_updates on
● Configuration:
○ failover_timeout
● May lose data with failed master
○ (semi)sync replication

$./maxctrl list servers
┌──────────────┬───────────┬──────┬─────────────┬────────────────┐
│ Server       │ Address   │ Port │ Connections │ State          │
├──────────────┼───────────┼──────┼─────────────┼────────────────┤
│ LocalMaster1 │ 127.0.0.1 │ 3001 │ 0           │ Down           │
├──────────────┼───────────┼──────┼─────────────┼────────────────┤
│ LocalSlave1  │ 127.0.0.1 │ 3002 │ 0           │ Slave, Running │
├──────────────┼───────────┼──────┼─────────────┼────────────────┤
│ LocalSlave2  │ 127.0.0.1 │ 3003 │ 0           │ Slave, Running │
└──────────────┴───────────┴──────┴─────────────┴────────────────┘
$./maxctrl call command mariadbmon failover MariaDB-Monitor
OK
$./maxctrl list servers
┌──────────────┬───────────┬──────┬─────────────┬─────────────────┐
│ Server       │ Address   │ Port │ Connections │ State           │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalMaster1 │ 127.0.0.1 │ 3001 │ 0           │ Down            │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave1  │ 127.0.0.1 │ 3002 │ 0           │ Master, Running │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave2  │ 127.0.0.1 │ 3003 │ 0           │ Slave, Running  │
└──────────────┴───────────┴──────┴─────────────┴─────────────────┘
  • 7. Automatic failover
● Trigger: master must be down for a set amount of time
● Additional check by looking at slave connections
● Configuration settings:
○ auto_failover
○ failcount & monitor_interval
○ verify_master_failure & master_failure_timeout

$./maxctrl list servers
┌──────────────┬───────────┬──────┬─────────────┬─────────────────┐
│ Server       │ Address   │ Port │ Connections │ State           │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalMaster1 │ 127.0.0.1 │ 3001 │ 0           │ Master, Running │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave1  │ 127.0.0.1 │ 3002 │ 0           │ Slave, Running  │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave2  │ 127.0.0.1 │ 3003 │ 0           │ Slave, Running  │
└──────────────┴───────────┴──────┴─────────────┴─────────────────┘
$docker stop maxscalebackends_testing1_master1_1
$./maxctrl list servers
┌──────────────┬───────────┬──────┬─────────────┬────────────────┐
│ Server       │ Address   │ Port │ Connections │ State          │
├──────────────┼───────────┼──────┼─────────────┼────────────────┤
│ LocalMaster1 │ 127.0.0.1 │ 3001 │ 0           │ Down           │
├──────────────┼───────────┼──────┼─────────────┼────────────────┤
│ LocalSlave1  │ 127.0.0.1 │ 3002 │ 0           │ Slave, Running │
├──────────────┼───────────┼──────┼─────────────┼────────────────┤
│ LocalSlave2  │ 127.0.0.1 │ 3003 │ 0           │ Slave, Running │
└──────────────┴───────────┴──────┴─────────────┴────────────────┘
$./maxctrl list servers
┌──────────────┬───────────┬──────┬─────────────┬─────────────────┐
│ Server       │ Address   │ Port │ Connections │ State           │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalMaster1 │ 127.0.0.1 │ 3001 │ 0           │ Down            │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave1  │ 127.0.0.1 │ 3002 │ 0           │ Master, Running │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave2  │ 127.0.0.1 │ 3003 │ 0           │ Slave, Running  │
└──────────────┴───────────┴──────┴─────────────┴─────────────────┘
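The settings listed above live in the monitor section of maxscale.cnf. A minimal sketch, assuming a monitor named MariaDB-Monitor and illustrative values (the credentials and timeouts here are placeholders, not recommendations from the deck):

```ini
# maxscale.cnf -- automatic failover settings (illustrative values)
[MariaDB-Monitor]
type=monitor
module=mariadbmon
servers=LocalMaster1,LocalSlave1,LocalSlave2
user=maxmon                  # monitor user (placeholder name)
password=maxpwd              # placeholder
auto_failover=true           # enable automatic failover
monitor_interval=1000        # milliseconds between monitor passes
failcount=4                  # passes the master must be down before failover
verify_master_failure=true   # cross-check master failure via slave connections
master_failure_timeout=10    # seconds for the slave-connection check
failover_timeout=90          # cancel failover if it takes longer (seconds)
```

With monitor_interval=1000 and failcount=4, failover triggers roughly 4 seconds after the master is first seen down, matching the timeline shown in the log excerpts later in the deck.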
  • 8. Rejoin
● Directs the joining server to replicate from the cluster master
○ redirect a slave replicating from the wrong master
○ start replication on a standalone server
● Looks at GTIDs to decide if the joining server can replicate
● Manual/automatic mode (auto_rejoin=1)
● Typical use case: master goes down -> failover -> old master comes back -> rejoined to cluster

$./maxctrl list servers
┌──────────────┬───────────┬──────┬─────────────┬─────────────────┐
│ Server       │ Address   │ Port │ Connections │ State           │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalMaster1 │ 127.0.0.1 │ 3001 │ 0           │ Down            │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave1  │ 127.0.0.1 │ 3002 │ 0           │ Master, Running │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave2  │ 127.0.0.1 │ 3003 │ 0           │ Slave, Running  │
└──────────────┴───────────┴──────┴─────────────┴─────────────────┘
$docker start maxscalebackends_testing1_master1_1
$./maxctrl list servers
┌──────────────┬───────────┬──────┬─────────────┬─────────────────┐
│ Server       │ Address   │ Port │ Connections │ State           │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalMaster1 │ 127.0.0.1 │ 3001 │ 0           │ Running         │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave1  │ 127.0.0.1 │ 3002 │ 0           │ Master, Running │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave2  │ 127.0.0.1 │ 3003 │ 0           │ Slave, Running  │
└──────────────┴───────────┴──────┴─────────────┴─────────────────┘
$./maxctrl call command mariadbmon rejoin MariaDB-Monitor LocalMaster1
$./maxctrl list servers
┌──────────────┬───────────┬──────┬─────────────┬─────────────────┐
│ Server       │ Address   │ Port │ Connections │ State           │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalMaster1 │ 127.0.0.1 │ 3001 │ 0           │ Slave, Running  │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave1  │ 127.0.0.1 │ 3002 │ 0           │ Master, Running │
├──────────────┼───────────┼──────┼─────────────┼─────────────────┤
│ LocalSlave2  │ 127.0.0.1 │ 3003 │ 0           │ Slave, Running  │
└──────────────┴───────────┴──────┴─────────────┴─────────────────┘
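To make rejoin automatic instead of calling it manually as above, auto_rejoin can be set in the monitor section. A minimal sketch, assuming the same monitor and placeholder credentials:

```ini
# maxscale.cnf -- automatic rejoin (illustrative values)
[MariaDB-Monitor]
type=monitor
module=mariadbmon
servers=LocalMaster1,LocalSlave1,LocalSlave2
user=maxmon        # placeholder
password=maxpwd    # placeholder
auto_failover=true
auto_rejoin=true   # rejoin returning or standalone servers automatically
```

With this in place, the rejoin step in the typical use case above (old master comes back after failover) happens without operator intervention, provided the returning server's GTID position allows it to replicate from the new master.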
  • 9. External master handling
(diagram: two data centers, DC A and DC B, with DC B replicating from DC A)
  • 10. Switchover details
Starting checks:
1. Cluster has 1 master and >1 slaves
2. All servers use GTID replication and cluster GTID-domain is known
3. Requested new master has binary log on
Prepare current master:
1. SET GLOBAL read_only=1;
2. FLUSH TABLES;
3. FLUSH LOGS;
4. update GTID-info
Wait until all slaves catch up to master:
1. MASTER_GTID_WAIT()
Stop slave replication on new master:
1. STOP SLAVE;
2. RESET SLAVE ALL;
3. SET GLOBAL read_only=0;
Redirect slaves & old master to new master:
1. STOP SLAVE;
2. RESET SLAVE;
3. CHANGE MASTER TO …
4. START SLAVE;
Check that replication is working:
1. FLUSH TABLES;
2. Check that all slaves receive the new GTID
  • 12. Maxscale 2.2 New Features
● At this point you know that MariaDB Maxscale is able to:
○ Automatic/Manual Failover;
○ Manual Switchover;
○ Rejoin a crashed node as a slave of an existing cluster;
● The previous processes rely on the new MariaDBMon monitor;
● Hidden details when implementing and/or doing break/fix:
○ For switchover/failover/rejoin to work, the monitor user (MariaDBMon) needs access on all the servers, or a separate user defined via replication_user and replication_password with access on all the servers;
○ If the monitor user (MariaDBMon) has an encrypted password, the replication_password should be encrypted as well; otherwise, the CHANGE MASTER TO issued by these processes won't be able to configure replication for the new server;
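As a sketch of the point above, a monitor section combining the monitor user and a separate replication user might look like the following. Server names, credentials and the encrypted values are placeholders (real encrypted values come from MaxScale's maxpasswd utility), not configuration from the original deck:

```ini
# maxscale.cnf -- hypothetical monitor section with a separate replication user
[replication-cluster-monitor]
type=monitor
module=mariadbmon
servers=box02,box03
user=maxmon                        # monitor user; needs access on all servers
password=AF76BE841B5A4692          # placeholder, encrypted with maxpasswd
replication_user=repl              # used in CHANGE MASTER TO during failover/switchover
replication_password=9C03DD175B2A  # must also be encrypted if password is encrypted
monitor_interval=1000
```

The key constraint restated in config terms: password and replication_password must either both be plaintext or both be encrypted, since MaxScale decrypts them the same way before use.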
  • 13. Maxscale 2.2 New Features
● Failover: replacing a failed master.
● For automatic failover, the auto_failover variable should be true in the monitor configuration definition;
○ auto_failover=true activates automatic failover;
● For manual failover, auto_failover should be set to false in the monitor configuration definition;
● The master must be dead for the manual failover to work;
○ With auto_failover=false, the failover can be activated manually:
[root@box01 ~]# maxadmin call command mariadbmon failover replication-cluster-monitor
● Enable and disable auto_failover with the alter monitor command.
  • 14. Maxscale 2.2 New Features
● Failover: replacing a failed master (automatic, auto_failover=true)

#: checking current configurations
[root@box01 ~]# grep auto_failover /var/lib/maxscale/maxscale.cnf.d/replication-cluster-monitor.cnf
auto_failover=true

#: shut down the current master - check the current topology with `maxadmin list servers` first to confirm it
[root@box02 ~]# systemctl stop mariadb.service

#: watching the actions in the log file
2018-02-10 13:51:02 error : Monitor was unable to connect to server [192.168.50.13]:3306 : "Can't connect to MySQL server on '192.168.50.13'"
2018-02-10 13:51:02 notice : [mariadbmon] Server [192.168.50.13]:3306 lost the master status.
2018-02-10 13:51:02 notice : Server changed state: box03[192.168.50.13:3306]: master_down. [Master, Running] -> [Down]
2018-02-10 13:51:02 warning: [mariadbmon] Master has failed. If master status does not change in 4 monitor passes, failover begins.
2018-02-10 13:51:06 notice : [mariadbmon] Performing automatic failover to replace failed master 'box03'.
2018-02-10 13:51:06 notice : [mariadbmon] Promoting server 'box02' to master.
2018-02-10 13:51:06 notice : [mariadbmon] Redirecting slaves to new master.
2018-02-10 13:51:07 warning: [mariadbmon] Setting standalone master, server 'box02' is now the master.
2018-02-10 13:51:07 notice : Server changed state: box02[192.168.50.12:3306]: new_master. [Slave, Running] -> [Master, Running]
  • 15. Maxscale 2.2 New Features
● Failover: replacing a failed master (manual, auto_failover=false)

#: setting auto_failover=false
[root@box01 ~]# maxadmin alter monitor replication-cluster-monitor auto_failover=false

#: current master is down, automatic failover deactivated
2018-02-09 23:31:01 error : Monitor was unable to connect to server [192.168.50.12]:3306 : "Can't connect to MySQL server on '192.168.50.12'"
2018-02-09 23:31:01 notice : [mariadbmon] Server [192.168.50.12]:3306 lost the master status.
2018-02-09 23:31:01 notice : Server changed state: box02[192.168.50.12:3306]: master_down. [Master, Running] -> [Down]

#: manual failover executed
[root@box01 ~]# maxadmin call command mariadbmon failover replication-cluster-monitor

#: let's check the logs
2018-02-09 23:32:30 info : (17) [cli] MaxAdmin: call command "mariadbmon" "failover" "replication-cluster-monitor"
2018-02-09 23:32:30 notice : (17) [mariadbmon] Stopped monitor replication-cluster-monitor for the duration of failover.
2018-02-09 23:32:30 notice : (17) [mariadbmon] Promoting server 'box03' to master.
2018-02-09 23:32:30 notice : (17) [mariadbmon] Redirecting slaves to new master.
2018-02-09 23:32:30 notice : (17) [mariadbmon] Failover performed.
2018-02-09 23:32:30 warning: [mariadbmon] Setting standalone master, server 'box03' is now the master.
2018-02-09 23:32:30 notice : Server changed state: box03[192.168.50.13:3306]: new_master. [Slave, Running] -> [Master, Running]
  • 16. Maxscale 2.2 New Features
● Failover: replacing a failed master, additional details
● The monitor-pass time is based on the monitor's monitor_interval value;
○ As it is now set to 1000 ms (1 second), the failover will be triggered after 4 seconds, counting the first pass from when the monitor reported the first message;
○ If the failover process does not complete within the time configured in failover_timeout (90 seconds by default), the failover is canceled and the feature is disabled;
○ To enable failover again (after checking the possible problems), use the alter monitor command:

2018-02-10 13:51:02 warning: [mariadbmon] Master has failed. If master status does not change in 4 monitor passes, failover begins.

[root@box01 ~]# maxadmin alter monitor replication-cluster-monitor auto_failover=true
  • 17. Maxscale 2.2 New Features
● Switchover: swapping a slave with a running master.
● The switchover process relies on the replication_user and replication_password settings added to the monitor configuration;
● The process is triggered manually and should take up to switchover_timeout seconds to complete - 90 seconds by default;
● If the process fails, an error is logged and auto_failover will be disabled if it was enabled;

[root@team01-box01 ~]# maxadmin call command mariadbmon switchover replication-cluster-monitor new_master master
  • 18. Maxscale 2.2 New Features
● Switchover: swapping a slave with a running master.

#: checking the current server's list
[root@team01-box01 ~]# maxadmin list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server             | Address         | Port  | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
box02              | 10.132.116.147  |  3306 |           0 | Slave, Running
box03              | 10.132.116.161  |  3306 |           0 | Master, Running
-------------------+-----------------+-------+-------------+--------------------

#: new_master=box02, current_master=box03
[root@team01-box01 ~]# maxadmin call command mariadbmon switchover replication-cluster-monitor box02 box03

#: checking logs
2018-02-14 16:44:46 info : (712) [cli] MaxAdmin: call command "mariadbmon" "switchover" "replication-cluster-monitor" "box02" "box03"
2018-02-14 16:44:46 notice : (712) [mariadbmon] Stopped the monitor replication-cluster-monitor for the duration of switchover.
2018-02-14 16:44:46 notice : (712) [mariadbmon] Demoting server 'box03'.
2018-02-14 16:44:46 notice : (712) [mariadbmon] Promoting server 'box02' to master.
2018-02-14 16:44:46 notice : (712) [mariadbmon] Old master 'box03' starting replication from 'box02'.
2018-02-14 16:44:46 notice : (712) [mariadbmon] Redirecting slaves to new master.
2018-02-14 16:44:47 notice : (712) [mariadbmon] Switchover box03 -> box02 performed.
2018-02-14 16:44:47 notice : Server changed state: box02[10.132.116.147:3306]: new_master. [Slave, Running] -> [Master, Slave, Running]
2018-02-14 16:44:47 notice : Server changed state: box03[10.132.116.161:3306]: new_slave. [Master, Running] -> [Slave, Running]
2018-02-14 16:44:48 notice : Server changed state: box02[10.132.116.147:3306]: new_master. [Master, Slave, Running] -> [Master, Running]
  • 19. Maxscale 2.2 New Features
● Rejoin: joining a standalone server to the cluster.
● Enables automatically joining a server back to the cluster when a crashed backend server comes back online;
● When auto_rejoin is enabled, the monitor will attempt to direct standalone servers, and servers replicating from a relay master, to the main cluster master server;
● Test it as we did:
○ Check which server is the current master, then shut down its MariaDB Server;
○ The failover will happen in case auto_failover is enabled;
○ Start the shut-down MariaDB Server again;
○ List servers again via maxadmin and watch the logs.
  • 20. Maxscale 2.2 New Features
● Rejoin: joining a standalone server to the cluster.

#: current_master=box02
[root@team01-box02 ~]# mysqladmin shutdown

#: watching logs, the failover will happen as the master "crashed"
2018-02-14 18:44:36 error : Monitor was unable to connect to server [10.132.116.147]:3306 : "Can't connect to MySQL server on '10.132.116.147' (115)"
2018-02-14 18:44:36 notice : [mariadbmon] Server [10.132.116.147]:3306 lost the master status.
2018-02-14 18:44:36 notice : Server changed state: box02[10.132.116.147:3306]: master_down. [Master, Running] -> [Down]
2018-02-14 18:44:36 warning: [mariadbmon] Master has failed. If master status does not change in 4 monitor passes, failover begins.
2018-02-14 18:44:40 notice : [mariadbmon] Performing automatic failover to replace failed master 'box02'.
2018-02-14 18:44:40 notice : [mariadbmon] Promoting server 'box03' to master.
2018-02-14 18:44:40 notice : [mariadbmon] Redirecting slaves to new master.
2018-02-14 18:44:41 warning: [mariadbmon] Setting standalone master, server 'box03' is now the master.
2018-02-14 18:44:41 notice : Server changed state: box03[10.132.116.161:3306]: new_master. [Slave, Running] -> [Master, Running]

#: starting old master back
[root@team01-box02 ~]# systemctl start mariadb.service

#: watching logs
2018-02-14 18:47:27 notice : Server changed state: box02[10.132.116.147:3306]: server_up. [Down] -> [Running]
2018-02-14 18:47:27 notice : [mariadbmon] Directing standalone server 'box02' to replicate from 'box03'.
2018-02-14 18:47:27 notice : [mariadbmon] 1 server(s) redirected or rejoined the cluster.
2018-02-14 18:47:28 notice : Server changed state: box02[10.132.116.147:3306]: new_slave. [Running] -> [Slave, Running]
  • 21. Thank you! Time for questions and answers