3. MariaDB AX
MariaDB Server
MariaDB MaxScale
MariaDB ColumnStore
Parallel queries
Distributed storage
No indexes
Automatic partitioning
Read optimized
High compression
Low disk IO
[Diagram: MariaDB MaxScale in front of MariaDB Server / ColumnStore UM nodes, which query ColumnStore PM nodes on distributed shared-nothing storage]
4. MariaDB AX
What was there
MariaDB ColumnStore 1.0
Manual import
Manual backup/restore
Window functions
Aggregate functions
User-defined functions
Cross-engine joins
[Diagram: Applications / Spark and MariaDB MaxScale connect to MariaDB Server (ColumnStore UM, InnoDB), backed by ColumnStore PM]
5. Goals for next MariaDB AX
1. Expand high availability/disaster recovery options
2. Make it easier to perform custom, complex analytics
3. Streamline and simplify the process of ingesting data
6. MariaDB AX
What’s new
MariaDB ColumnStore 1.1
Streaming data adapters
Bulk data adapters
User-defined window functions
User-defined distributed aggregates
Spark support
Read: JDBC
Publish: data adapters
High availability
Local storage (GlusterFS)
Parallel backup/restore
[Diagram: Applications / Spark and MariaDB MaxScale connect to MariaDB Server (ColumnStore UM, InnoDB), backed by ColumnStore PM]
7. What’s new in MariaDB AX
INGESTION: Applications, Apache Kafka, MariaDB MaxScale
ANALYTICS: User-defined aggregate and window functions
HA / DR: GlusterFS support, Parallel backup/restore
DATA TYPES: Text, BLOB columns
SECURITY: Auditing
BI CERTIFICATION: Tableau
9. GlusterFS Volume Replication
High availability for local storage
GlusterFS can replicate files within a volume - HA without the need for an SAN
ColumnStore storage nodes can read other files within a volume - simple, automatic failover
[Diagram: GlusterFS volume replication. Two MariaDB Server / ColumnStore UM nodes sit above ColumnStore PM 1 (dbroot1), PM 2 (dbroot2) and PM 3 (dbroot3); each of /dbroot1, /dbroot2 and /dbroot3 appears twice in the volume, replicated across PMs]
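The failover idea on this slide can be sketched in a few lines of Python. This is only an illustration, not ColumnStore's actual failover logic: the placement table (each PM owning one dbroot and holding a replica of one other) is an assumption read off the diagram, and the names `OWNER`, `REPLICAS` and `fail_over` are hypothetical.

```python
# Assumed layout: each PM owns one dbroot and also holds a GlusterFS
# replica of one other dbroot. Names and placement are illustrative only.
OWNER = {"dbroot1": "pm1", "dbroot2": "pm2", "dbroot3": "pm3"}
REPLICAS = {
    "pm1": {"dbroot1", "dbroot2"},
    "pm2": {"dbroot2", "dbroot3"},
    "pm3": {"dbroot3", "dbroot1"},
}


def fail_over(owner, replicas, failed_pm):
    """Return a new dbroot -> PM assignment after `failed_pm` goes down.

    A dbroot owned by the failed PM is handed to a surviving PM that
    already holds a replica of it, so no data has to be copied.
    """
    new_owner = {}
    for dbroot, pm in owner.items():
        if pm != failed_pm:
            new_owner[dbroot] = pm
            continue
        candidates = sorted(
            p for p, held in replicas.items()
            if p != failed_pm and dbroot in held
        )
        if not candidates:
            raise RuntimeError(f"no surviving replica of {dbroot}")
        new_owner[dbroot] = candidates[0]
    return new_owner


if __name__ == "__main__":
    # PM 2 fails: dbroot2 moves to PM 1, which holds its replica.
    print(fail_over(OWNER, REPLICAS, "pm2"))
```

Because the replica is already on a surviving node, reassignment is a metadata change rather than a data copy, which is what makes failover without a SAN practical.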
10. Parallel Backup/Restore
Parallel backup/restore using rsync - faster backup and restore
Support incremental backup and restore - faster backup and restore
Consolidate data from multiple storage nodes in a single backup location - simplified, automatic backups and restores
/home/user/columnstoreBackupData/pm1dbroot1
/home/user/columnstoreBackupData/pm2dbroot2
/home/user/columnstoreBackupData/pm3dbroot3
[Diagram: the backup and restore tool runs rsync against ColumnStore PM 1 (/data1/*), PM 2 (/data2/*) and PM 3 (/data3/*); two MariaDB Server / ColumnStore UM nodes sit above the PMs]
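The parallel rsync approach can be approximated as follows. This is a minimal sketch, not the MariaDB ColumnStore Backup/Restore Tool itself: the host names `pm1` to `pm3`, the `-a` flag choice and the helper names are assumptions; the source and destination paths are the ones shown on the slide.

```python
import shlex
import subprocess
from concurrent.futures import ThreadPoolExecutor

BACKUP_ROOT = "/home/user/columnstoreBackupData"

# (host, dbroot number, source directory); host names are hypothetical.
NODES = [("pm1", 1, "/data1/"), ("pm2", 2, "/data2/"), ("pm3", 3, "/data3/")]


def rsync_command(host, dbroot, source, backup_root=BACKUP_ROOT):
    """Build the rsync command that copies one PM's data directory.

    rsync only transfers files that changed since the last run, which is
    what makes repeated backups incremental.
    """
    dest = f"{backup_root}/{host}dbroot{dbroot}/"
    return ["rsync", "-a", f"{host}:{source}", dest]


def backup_all(nodes, run=subprocess.run, max_workers=3):
    """Run one rsync per PM concurrently and return the commands used.

    `run` is injectable so the sketch can be dry-run without rsync.
    """
    commands = [rsync_command(*node) for node in nodes]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(lambda cmd: run(cmd, check=True), commands))
    return commands


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    backup_all(NODES, run=lambda cmd, check: print(shlex.join(cmd)))
```

Running one transfer per storage node in parallel means total backup time is bounded by the slowest node rather than the sum of all nodes.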
14. Motivation
Organizations need to make data available for analysis as soon as it arrives
Machine learning results need to be stored where other business/data analysts work with them
Time to insight and time to action are now competitive differentiators for businesses
15. Bulk data adapters
Applications can use the bulk data adapters SDK to collect and write data - on-demand data loading
No need to copy CSV to UM or PM - simpler
Bypass SQL interface, parser and optimizer - faster writes
SDK languages: C++, Python, Java
[Diagram: the application uses the Bulk Data Adapter to write through the Write API on each ColumnStore PM; the MariaDB Server / ColumnStore UM nodes sit alongside]
1. For each row
a. For each column: bulkInsert->setColumn
b. bulkInsert->writeRow
2. bulkInsert->commit
* Buffer 100,000 rows by default
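The loop above can be illustrated with a small, self-contained Python sketch. `BulkInsertSketch` is a stand-in written for this example, not the real bulk data adapter SDK: it only mirrors the `setColumn` / `writeRow` / `commit` call pattern from the slide, and it assumes the buffer is flushed whenever it fills up and once more on commit.

```python
class BulkInsertSketch:
    """Stand-in for the SDK's bulk insert object (hypothetical, not the real API)."""

    def __init__(self, column_count, flush=None, buffer_rows=100_000):
        self._column_count = column_count
        self._flush = flush or (lambda rows: None)  # where a real adapter would send rows
        self._buffer_rows = buffer_rows
        self._current = [None] * column_count
        self._buffer = []
        self.rows_flushed = 0

    def setColumn(self, index, value):
        """Set one column of the row currently being built."""
        self._current[index] = value
        return self

    def writeRow(self):
        """Finish the current row; flush the buffer if it is full."""
        self._buffer.append(tuple(self._current))
        self._current = [None] * self._column_count
        if len(self._buffer) >= self._buffer_rows:
            self._send()
        return self

    def commit(self):
        """Flush any remaining rows and return the total number written."""
        if self._buffer:
            self._send()
        return self.rows_flushed

    def _send(self):
        self._flush(self._buffer)
        self.rows_flushed += len(self._buffer)
        self._buffer = []


def load(rows, bulk):
    """Steps 1 and 2 from the slide: set each column, write each row, commit."""
    for row in rows:
        for index, value in enumerate(row):
            bulk.setColumn(index, value)
        bulk.writeRow()
    return bulk.commit()


if __name__ == "__main__":
    batches = []
    bulk = BulkInsertSketch(2, flush=lambda rows: batches.append(len(rows)), buffer_rows=3)
    total = load([(i, f"name-{i}") for i in range(7)], bulk)
    print(total, batches)
```

Buffering rows client-side keeps the number of round trips to the storage nodes low, which is one reason a bulk path can be faster than row-by-row SQL inserts.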
Deep dive session: Ingesting Data with the New Bulk Data Adapters Today at 5 pm
16. Streaming data adapters – MaxScale CDC
Stream writes from MariaDB TX to MariaDB AX automatically and continuously - ensure analytical data is up to date and not stale, no need for batch jobs, manual processes or human intervention
[Diagram: MariaDB Server (InnoDB) feeds MariaDB MaxScale (Binlog-Avro CDC Router); the Streaming Data Adapter (CDC Client) reads from MaxScale and writes through the Write API on each ColumnStore PM; the MariaDB Server / ColumnStore UM nodes sit alongside]
Deep dive session: Real-time Analytics With The New Streaming Data Adapters, tomorrow at 8:40 am
17. Streaming data adapters – Apache Kafka
Stream all messages published to Apache Kafka topics to MariaDB AX automatically and continuously - enable data from many sources to be streamed and collected for analysis without complex code
[Diagram: Apache Kafka topics feed the Streaming Data Adapter (Kafka Client), which writes through the Write API on each ColumnStore PM; the MariaDB Server / ColumnStore UM nodes sit alongside]
19. Analytics, Operations, Ingestion
[Diagram: Web/Mobile Services reach the transactional (OLTP) tier (MariaDB Server, InnoDB) through MariaDB MaxScale. Data reaches the analytics (OLAP) tier (MariaDB ColumnStore) from Apache Kafka and MariaDB MaxScale via Streaming Data Adapters, and from Data Services and Spark / Python / ML via Bulk Data Adapters]
20. Resources
Reach me: dipti.joshi@mariadb.com
Documentation: https://mariadb.com/kb/en/library/mariadb-columnstore/
Blogs: https://mariadb.com/blog-tags/columnstore
https://mariadb.com/blog-tags/big-data
Download
MariaDB ColumnStore 1.1: https://mariadb.com/downloads/mariadb-ax
MariaDB MaxScale: https://mariadb.com/downloads/mariadb-ax/maxscale
Bulk Data Adapters and Streaming Data Adapters: https://mariadb.com/downloads/mariadb-ax/data-adapters
MariaDB ColumnStore Backup/Restore Tool: https://mariadb.com/downloads/mariadb-ax/tools-ax
21. What’s new in MariaDB AX: Summary
Complex, custom analytics: user-defined aggregate functions, user-defined window functions, text and binary columns, Spark integration via JDBC (SQL) or direct (data adapter)
Improved HA/DR: GlusterFS support, parallel backup/restore
Streamlined data ingestion: streaming data adapters, bulk data adapters