M|18 Why Abstract Away the Underlying Database Infrastructure
1. Why Abstract Away the Underlying Database Infrastructure
MariaDB MaxScale: Database Proxy
Markus Mäkelä
2. Overview
• What is database cluster abstraction?
• Why is it important?
• How does MariaDB MaxScale do it?
3. The Idea of a Perfect Database
● Behaves like a single database
○ Simple to use
○ Easy to manage
● Performs like a cluster
○ Robust and failure tolerant
○ Near-linear scalability
4. What is Abstraction for Database Clusters?
[Diagram: The Database]
5. Why is it Important?
Complexity isolation
● Simpler application development/configuration
○ No need to know where to send queries
● No user-visible infrastructure
○ Don’t need to detect servers that are in maintenance
○ No need to know the cluster topology
[Diagram: The Database]
6. Why is it Important?
Highly Available Database
● Prevents downtime
○ Node failure is not cluster failure
● Easier Maintenance
○ Functionality not tied to physical nodes
○ Reduced capacity, not functionality
○ Easy node replacement
[Diagram: Database Abstraction Layer]
9. MaxScale Overview
● Modular Database Proxy
○ Only use what is needed
○ Extendable
● Content-aware
○ Understands routed traffic
● Cluster-aware
○ Active cluster monitoring
○ Understands different cluster types
10. Configuration: Defining Services instead of Servers
● “Database as a Service”
● Decouple clients from databases
● Describe what you want instead of what you have
○ This is a service that provides automated, highly available read-write splitting
[Diagram: Database Abstraction Layer]
12. Overview: Monitors
● Classify servers
○ Up or Down?
○ Master or Slave?
○ In sync or not?
● Information used by routers
○ Masters used for writes
○ Slaves used for reads
● Detects events
○ Server went down
○ Slave is disconnected
13. MariaDB Monitor: Master-Slave Monitor
● Detects topology
○ Builds the replication tree
● Assigns correct labels
○ Root node for writes
○ Other nodes for reads
● Detects replication lag
○ Write timestamp on master
○ Read from slave
[Diagram: one master with two slaves, annotated "This is a master" and "This is a slave"]
14. MariaDB Monitor: Monitored Variables
● Output of SHOW ALL SLAVES STATUS
○ Slave_IO_Running: Yes
○ Slave_SQL_Running: Yes
○ Master_Server_Id: 1234
● Number of configured slaves
● @@read_only
[Diagram: one master with two slaves, annotated "This is used to build the replication tree"]
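The labeling rules from the monitor slides can be sketched in a few lines of Python. This is a simplified illustration, not MaxScale's implementation: `ServerStatus` and `label_servers` are hypothetical names, and the fields mirror the monitored variables listed above (the two slave thread states, `Master_Server_Id`, and `@@read_only`).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServerStatus:
    server_id: int
    master_server_id: Optional[int]  # None when the server replicates from nobody
    io_running: bool = False         # Slave_IO_Running
    sql_running: bool = False        # Slave_SQL_Running
    read_only: bool = False          # @@read_only


def label_servers(statuses):
    """Assign a label to each server (hypothetical helper, not a MaxScale API)."""
    by_id = {s.server_id: s for s in statuses}
    labels = {}
    for s in statuses:
        # A slave replicates from a known server and has both threads running.
        replicating = s.master_server_id in by_id and s.io_running and s.sql_running
        if replicating:
            labels[s.server_id] = "slave"
        elif s.master_server_id is None and not s.read_only:
            # Root of the replication tree: target for writes.
            labels[s.server_id] = "master"
        else:
            # Broken replication or an unexpected state: not used for routing.
            labels[s.server_id] = "unlabeled"
    return labels
```

A server whose SQL thread has stopped ends up "unlabeled" here, which corresponds to the "Slave is disconnected" event from the monitor overview.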
15. Galera Cluster Monitor
● Galera Clusters
○ Synchronous Cluster
○ Globally conflict free
○ Conflicting transaction → Error on commit
● Abstracted in MaxScale
○ One “master” node
■ Prevents conflicts
○ Rest labeled as “slaves”
■ Good for scaleout
[Diagram: three Galera nodes, each a master; one is annotated "Use this for all writes..." and the other two "…and these two for reads"]
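The Galera abstraction (one node labeled "master", the rest "slaves") could look roughly like the sketch below. The function name and input shape are hypothetical, and choosing the synced node with the lowest index is an assumption made for illustration; any deterministic rule that selects exactly one node would serve the same purpose.

```python
def label_galera_nodes(nodes):
    """nodes maps a node name to (is_synced, index).

    Returns labels for synced nodes only; out-of-sync nodes get no label.
    """
    synced = {name: index for name, (is_synced, index) in nodes.items() if is_synced}
    if not synced:
        return {}
    # Exactly one node receives all writes, which avoids commit conflicts.
    master = min(synced, key=synced.get)
    return {name: ("master" if name == master else "slave") for name in synced}
```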
17. Routing & Query Classification
How the Load Balancing is Done
18. Query Classifier: The Brains of MaxScale
[Diagram: a SELECT statement with a WHERE id = 1 clause, labeled "Read-only query"]
● Provides both abstract and detailed information
○ Read or write
■ Does the query modify the database?
○ Query components
■ Is the table `t1` used in this query?
■ What are the values for the functions in the query?
○ Query characteristics
■ Does the query have a WHERE clause?
○ State changes
■ Was the default character set changed?
■ Is there an open transaction?
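To make the kinds of questions above concrete, here is a deliberately naive, keyword-based sketch. It is not how MaxScale classifies queries (the slides describe a real SQL parser); `classify` and its return keys are hypothetical, and string literals or comments would confuse this toy tokenizer.

```python
import re

# Statements that modify data or schema (an incomplete, illustrative list).
WRITE_KEYWORDS = {"INSERT", "UPDATE", "DELETE", "REPLACE",
                  "CREATE", "ALTER", "DROP", "TRUNCATE"}


def classify(sql):
    """Answer a few classifier-style questions about one statement."""
    words = re.findall(r"[A-Za-z_][A-Za-z_0-9]*", sql)
    upper = [w.upper() for w in words]
    first = upper[0] if upper else ""
    return {
        "is_write": first in WRITE_KEYWORDS,      # Does it modify the database?
        "has_where": "WHERE" in upper,            # Does it have a WHERE clause?
        "identifiers": {w.lower() for w in words},  # Is table `t1` mentioned?
    }
```

For example, `classify("SELECT name FROM t1 WHERE id = 1;")` reports a read with a WHERE clause, and `"t1"` appears in its `identifiers` set.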
19. Query Classifier: Details
[Diagram: a SELECT statement with a WHERE id = 1 clause, labeled "Read-only query"]
● Based on a modified lightweight version of SQLite
○ Extended for MariaDB 10.3 syntax
○ Removed data storage and memory allocation
● Smart classification
○ First pass
■ Lightweight parsing
■ Resolves operation and query type
○ Second pass
■ Only for full syntactic classification
■ Column ↔ Function relationships
20. ReadWriteSplit: The Routing Muscle
● Read/write splitting
○ Write to master, read from slaves
○ Performance improvement for read-heavy loads
○ Prevents conflicts (Galera)
● Session state tracking & propagation
○ Consistent session state
● Failure tolerant
○ Hides slave failures
● Multiple backend connections
○ Must-have for read/write splitting
○ Speeds up node failover
21. ReadWriteSplit: Load Balancing
Based on server score
● Multiple algorithms
○ Active operation count → Default
■ MIN(operations)
○ Connection count
■ MIN(connections)
○ Replication delay
■ MIN(delay)
● Manually adjustable
○ Weight each server differently
■ MIN(score * weight)
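The MIN(score * weight) selection can be sketched as follows. The dictionary keys and the `pick_server` name are hypothetical; the metric names simply echo the three algorithms on the slide.

```python
def pick_server(servers, metric="operations"):
    """Return the name of the server with the lowest weighted score.

    Each server is a dict with a "name", the chosen metric, and an
    optional "weight" (defaults to 1.0, i.e. no adjustment).
    """
    def weighted_score(server):
        return server[metric] * server.get("weight", 1.0)

    return min(servers, key=weighted_score)["name"]
```

Under this formula a smaller weight makes a server look less loaded, so it is picked more often.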
22. ReadWriteSplit: Session State
SET SQL_MODE='ANSI';
● Consistent state for all connections
○ State modifications propagated
○ Truly abstracted reads
● State modification history
○ Node replacement
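A minimal sketch of the two ideas on this slide, propagation and history, is shown below. The `Session` class is hypothetical and models each backend as a plain list of the statements it has executed.

```python
class Session:
    """Tracks session state changes so every backend sees the same state."""

    def __init__(self, backends):
        self.history = []                                 # all state-changing statements
        self.backends = {name: [] for name in backends}   # name -> statements executed

    def set_state(self, statement):
        # Propagate the modification to every open backend connection.
        self.history.append(statement)
        for executed in self.backends.values():
            executed.append(statement)

    def replace_backend(self, old, new):
        # A replacement node replays the history to reach the same state.
        del self.backends[old]
        self.backends[new] = list(self.history)
```

After `set_state("SET SQL_MODE='ANSI'")`, a node added through `replace_backend` starts with the same statement already applied.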
23. ReadWriteSplit: Transactions
Read-write transaction:
START TRANSACTION;
SELECT name FROM accounts WHERE id = 1;
INSERT INTO logins VALUES ('john doe');
COMMIT;
Transactional behavior must be kept intact
● Executed only on one node
● Statements cannot be retried on other servers
● Cannot be load balanced
24. ReadWriteSplit: Transactions
Read-only transaction:
START TRANSACTION READ ONLY;
SELECT name FROM accounts WHERE id = 1;
COMMIT;
Same as read-write except:
● Can be load balanced
● Safe even with writes
○ Server returns an error
25. ReadWriteSplit: Query classification
Read: SELECT name FROM accounts WHERE id = 1;
Write: INSERT INTO logins VALUES ('john doe');
Dependent Query: SELECT LAST_INSERT_ID();
Session State: SET @@character_set_client=cp850;
Different queries require different behavior
● Writes to master
● Reads to slaves
● Dependent queries to previous server
● Session state modifications to all
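The four routing rules translate almost directly into code. The sketch below assumes the query has already been classified; `route` and the type names are hypothetical, and picking the first slave stands in for whatever load balancing is configured.

```python
def route(query_type, master, slaves, previous):
    """Return the list of servers that should receive the query."""
    if query_type == "write":
        return [master]
    if query_type == "read":
        # Fall back to the master when no slave is available.
        return [slaves[0]] if slaves else [master]
    if query_type == "dependent":
        # e.g. SELECT LAST_INSERT_ID() must follow the statement it depends on.
        return [previous]
    if query_type == "session_state":
        return [master, *slaves]
    raise ValueError(f"unknown query type: {query_type}")
```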
26. ReadWriteSplit: Query classification
Prepared statements
SELECT name FROM accounts WHERE id = ?;
INSERT INTO logins VALUES (?);
Observable behavior:
● None
Behind the scenes:
● Text protocol
○ Resolve query type
○ Map text identifier to query type
● Binary protocol
○ Resolve query type
○ Route preparation
○ Map returned identifier to query type
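The "map identifier to query type" step can be illustrated with a small lookup table. The class below is a hypothetical sketch: the same mapping works whether the key is a text-protocol statement name or a numeric identifier returned by the server, and the first-keyword check stands in for real classification.

```python
class PreparedStatements:
    """Remembers the query type of each prepared statement."""

    def __init__(self):
        self._types = {}

    def prepare(self, identifier, sql):
        # Resolve the type once, at preparation time.
        words = sql.split()
        first = words[0].upper() if words else ""
        self._types[identifier] = "read" if first == "SELECT" else "write"

    def target(self, identifier):
        # Executions carry only the identifier, so look the type up.
        return "slave" if self._types[identifier] == "read" else "master"
```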
28. Monitors: Node Failure
Monitors detect failures:
● Node no longer responsive
○ Response takes too long
○ Connection broken → Cannot reestablish
● Invalid state
○ Broken replication
○ Replication is lagging
○ Out-of-sync Galera node
29. ReadWriteSplit: Hiding Node Failures
Read retry
● Hides “trivial” failures
○ SELECT statement
○ autocommit=1
○ No open transaction
● Guaranteed reply
○ Try slaves first
○ Use master as last resort
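Read retry amounts to trying each candidate in order until one answers. In this sketch `execute_read` is a hypothetical name and `run` is a caller-supplied function that raises `ConnectionError` when a server fails; it is only appropriate for the "trivial" case described above (a plain SELECT with autocommit on and no open transaction).

```python
def execute_read(query, slaves, master, run):
    """Try the slaves first and use the master as the last resort."""
    for server in [*slaves, master]:
        try:
            return run(server, query)
        except ConnectionError:
            # The failure is hidden from the client; move on to the next server.
            continue
    raise ConnectionError("no server could answer the read")
```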
30. ReadWriteSplit: Read-only Mode
● Triggered on master failure
○ Master server down
○ Lost connection to master
● Read-only queries and transactions allowed
○ For read-heavy traffic
● Configurable behavior
○ Close connection on master failure
○ Close connection on first write
○ Send error on all writes
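The three configurable behaviors can be read as a small decision function. The sketch below is illustrative only: the mode strings are chosen to describe the three bullets above, and `on_statement` with its return values is hypothetical.

```python
def on_statement(mode, master_up, is_write):
    """Decide what happens to a statement: "route", "close", or "error"."""
    if mode not in ("fail_instantly", "fail_on_write", "error_on_write"):
        raise ValueError(f"unknown mode: {mode}")
    if master_up:
        return "route"
    if mode == "fail_instantly":
        return "close"      # close connection on master failure
    if not is_write:
        return "route"      # reads continue in read-only mode
    if mode == "fail_on_write":
        return "close"      # close connection on first write
    return "error"          # send an error on every write
```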
31. ReadWriteSplit: Slave Replacement
● Triggered on slave failure
○ Discard current slave
○ Pick a replacement
● Supplements read retry
○ Lower total connection count
● Configurable behavior
○ Close connection on master failure
○ Close connection on first write
○ Send error on all writes
33. Filter Overview
● Between client and router module
○ Pre-processing
○ Analytics
○ Target hinting
● Chainable
○ Output pipes to input
● Easy to write
○ First community contribution
■ tpmfilter
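"Output pipes to input" means a chain of filters behaves like function composition. The sketch below models each filter as a plain Python function from query text to query text; this is an illustration of the chaining idea, not MaxScale's filter API.

```python
def run_chain(filters, query):
    """Pass the query through each filter in order before it reaches the router."""
    for apply_filter in filters:
        query = apply_filter(query)
    return query
```

For example, `run_chain([str.strip, lambda q: q + " /* tagged */"], "  SELECT 1 ")` first trims the statement and then appends a comment.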
34. Cache:
TTL-based resultset caching
● Up to 3x read performance
● Configurable caching and storage
○ Specific users or applications
○ Matching SQL statements
○ Specific tables or databases
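TTL-based resultset caching can be sketched as a dictionary keyed by the SQL text, where each entry expires after a fixed time. `TTLCache` is a hypothetical class; the injectable `clock` parameter exists only to make the behavior easy to test.

```python
import time


class TTLCache:
    """Caches resultsets by statement text for a limited time."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # sql -> (stored_at, result)

    def get(self, sql):
        entry = self.entries.get(sql)
        if entry is None:
            return None
        stored_at, result = entry
        if self.clock() - stored_at >= self.ttl:
            # Entry is stale: drop it so the next read goes to the database.
            del self.entries[sql]
            return None
        return result

    def put(self, sql, result):
        self.entries[sql] = (self.clock(), result)
```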
35. Critical Reads
INSERT INTO accounts VALUES ('john doe');
SELECT name FROM accounts WHERE name = 'john doe';
● Non-transactional
○ Work on a single node
○ Fail when load balanced
● Depend on previous queries
○ Read inserted value
● Not compatible with load balancing
○ Can return a result without the inserted value
● Not the “correct way” to do it
○ Legacy application → hard to modify
○ Framework → impossible to modify
36. CCRFilter: Consistent Critical Reads
INSERT INTO accounts VALUES ('john doe');
SELECT name FROM accounts WHERE name = 'john doe';  -- Route this to the master!
● Detects data modification
○ Writes “pin” the session to master
● Tags the query with a hint
○ Route to master
● Configurable
○ Number of queries
○ Time interval
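The pinning logic, with both configurable limits (number of queries and time interval), might be sketched like this. `CriticalReadTracker` and its parameters are hypothetical names; a read is sent to the master while either limit is still in effect after the most recent write.

```python
import time


class CriticalReadTracker:
    """Decides whether a statement should be hinted to the master."""

    def __init__(self, count=0, seconds=0.0, clock=time.monotonic):
        self.count = count        # reads to pin after each write
        self.seconds = seconds    # time window to pin after each write
        self.clock = clock
        self.remaining = 0
        self.deadline = None

    def route_to_master(self, is_write):
        if is_write:
            # A write (re)starts both limits.
            self.remaining = self.count
            self.deadline = self.clock() + self.seconds
            return True
        if self.remaining > 0:
            self.remaining -= 1
            return True
        return self.deadline is not None and self.clock() < self.deadline
```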
37. Regexfilter: sed for SQL
● Match-replace functionality
○ PCRE2 regular expressions
● Fix broken SQL
○ “Patching” after release
● Allows neat tricks
○ Append a LIMIT clause
○ Add optimizer hints
○ Change storage engine
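The "append a LIMIT clause" trick is a match-and-replace on the statement text. The sketch below uses Python's `re` module rather than PCRE2, so the pattern syntax is only approximately the same; `append_limit` and the default of 1000 rows are hypothetical.

```python
import re


def append_limit(sql, limit=1000):
    """Add a LIMIT clause to SELECT statements that do not have one."""
    is_select = re.match(r"\s*SELECT\b", sql, re.IGNORECASE)
    has_limit = re.search(r"\bLIMIT\b", sql, re.IGNORECASE)
    if is_select and not has_limit:
        # Replace the optional trailing semicolon and whitespace.
        return re.sub(r"\s*;?\s*$", f" LIMIT {limit};", sql, count=1)
    return sql
```

Statements that already contain LIMIT, and anything that is not a SELECT, pass through unchanged.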
38. Wrapping Up
Problem:
Database clusters are essential for performance and HA but are also hard to use properly.
Solution:
Use the right tool. Work smart, not hard.
39. Wrapping Up
MaxScale:
A Toolbox for the Database.
● Abstracts database clusters into services
● Truly understands traffic and environment
● Makes database clusters easy to use efficiently