M|18 Analyzing Data with the MariaDB AX Platform

What’s New in the MariaDB AX Platform
Dipti Joshi
Director, Product Management

MariaDB AX
Analytics made easy –
simple, fast, scalable…
and open source

MariaDB AX
MariaDB Server
MariaDB MaxScale
MariaDB ColumnStore
Parallel queries
Distributed storage
No indexes
Automatic partitioning
Read optimized
High compression
Low disk IO
[Diagram: MariaDB MaxScale in front of several MariaDB Server / ColumnStore UM nodes, which sit above several ColumnStore PM nodes on distributed shared-nothing storage]

MariaDB AX
What was there
MariaDB ColumnStore 1.0
Manual import
Manual backup/restore
Window functions
Aggregate functions
User-defined functions
Cross-engine joins
[Diagram: Applications / Spark, MariaDB MaxScale, MariaDB Server (ColumnStore UM, InnoDB), ColumnStore PM]

Goals for the next MariaDB AX release
1. Expand high availability/disaster recovery options
2. Make it easier to perform custom, complex analytics
3. Streamline and simplify the process of ingesting data

MariaDB AX
What’s new
MariaDB ColumnStore 1.1
Streaming data adapters
Bulk data adapters
User-defined functions: window functions, distributed aggregates
Spark support: read via JDBC, publish via data adapters
High availability: local storage (GlusterFS), parallel backup/restore
[Diagram: Applications / Spark, MariaDB MaxScale, MariaDB Server (ColumnStore UM, InnoDB), ColumnStore PM]

What’s new in MariaDB AX
INGESTION: Applications, Apache Kafka, MariaDB MaxScale
ANALYTICS: User-defined aggregate and window functions
HA / DR: GlusterFS support, parallel backup/restore
DATA TYPES: Text, BLOB columns
SECURITY: Auditing
BI CERTIFICATION: Tableau

Extend high availability
and disaster recovery options

High availability for local storage
GlusterFS can replicate files within a volume - HA without the need for a SAN
ColumnStore storage nodes can read other files within a volume - simple, automatic failover
[Diagram: GlusterFS volume replication. MariaDB Server / ColumnStore UM nodes above ColumnStore PM 1 (dbroot1), PM 2 (dbroot2), and PM 3 (dbroot3); the replicated storage is labeled /dbroot1, /dbroot2, /dbroot2, /dbroot3, /dbroot3, /dbroot1]

Parallel Backup/Restore
Parallel backup/restore using rsync - faster backup and restore
Support for incremental backup and restore - faster backup and restore
Consolidate data from multiple storage nodes in a single backup location - simplified, automatic backups and restores
[Diagram: the backup and restore tool runs rsync against /data1/*, /data2/*, and /data3/* on ColumnStore PM 1, PM 2, and PM 3, below the MariaDB Server / ColumnStore UM nodes. Backup location shown: /home/user/columnstoreBackupData/pm1dbroot1]
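The parallel rsync idea on this slide can be sketched roughly as follows. This is not the MariaDB ColumnStore Backup/Restore Tool itself, only a minimal illustration of the pattern. The host names (`pm1`, `pm2`, `pm3`) and the helper names are assumptions. The `/dataN/` source paths and the `pm1dbroot1` folder naming follow the slide.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Backup location taken from the slide; adjust for a real environment.
BACKUP_ROOT = "/home/user/columnstoreBackupData"


def rsync_command(host, dbroot_id):
    """Build an rsync command that pulls /data<N>/ from one PM (hypothetical host name)."""
    source = f"{host}:/data{dbroot_id}/"
    target = f"{BACKUP_ROOT}/{host}dbroot{dbroot_id}/"
    # -a preserves permissions and timestamps; rsync only transfers files that
    # changed since the last run, which is what makes repeated runs incremental.
    return ["rsync", "-a", source, target]


def backup_all(nodes, run=subprocess.run):
    """Run one rsync per storage node concurrently and return the results in order."""
    commands = [rsync_command(host, dbroot_id) for host, dbroot_id in nodes]
    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        return list(pool.map(run, commands))


# Example call (requires rsync and SSH access to the assumed hosts):
# backup_all([("pm1", 1), ("pm2", 2), ("pm3", 3)])
```

Passing a different `run` callable makes it possible to inspect the generated commands without touching any host.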

Make it easier to perform
custom, complex analytics

User-defined distributed aggregate and window functions
User-defined distributed aggregate functions - custom analytical functions and better performance
User-defined window functions
Example: calculate a weighted sum (revenue)
$1-10 (0.5)
$11-100 (1.0)
$100+ (1.5)
[Diagram: MariaDB Server / ColumnStore UM nodes above three ColumnStore PM nodes; each PM computes a partial WSUM and the partial results are combined]
ColumnStore PM
Column  WSUM
$10     $5
$100    $100
$200    $300
WSUM = $405
ColumnStore PM
Column  WSUM
$4      $2
$8      $4
$20     $20
WSUM = $26
ColumnStore PM
Column  WSUM
$12     $6
$60     $60
$300    $450
WSUM = $516
Combined: WSUM = $947
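The two-phase evaluation in this example (partial weighted sums on each PM, combined on the UM) can be sketched in plain Python. This is only an illustration of the arithmetic, not the ColumnStore user-defined function API, and the function and variable names are assumptions. Note that under the tiers as stated, $12 falls in the 1.0 band, so this sketch yields $522 for the third partition and $953 overall, rather than the $516 and $947 shown on the slide.

```python
def weight(amount):
    """Weight tiers from the slide: $1-10 -> 0.5, $11-100 -> 1.0, above $100 -> 1.5."""
    if amount <= 10:
        return 0.5
    if amount <= 100:
        return 1.0
    return 1.5


def partial_wsum(values):
    """Phase 1: each storage node computes a weighted sum over its own rows."""
    return sum(value * weight(value) for value in values)


def combine(partials):
    """Phase 2: the partial results are added together to get the final aggregate."""
    return sum(partials)


# Column values per storage node, as shown in the diagram (node labels are assumed).
partitions = {
    "PM 1": [10, 100, 200],
    "PM 2": [4, 8, 20],
    "PM 3": [12, 60, 300],
}

partials = {pm: partial_wsum(values) for pm, values in partitions.items()}
total = combine(partials.values())
print(partials, total)
```

Because a weighted sum is additive, each node only has to ship one number, not its rows, which is where the performance benefit of a distributed aggregate comes from.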

Streamline and simplify
the process of data ingestion

Motivation
Organizations need to make data available for analysis as soon as it arrives
Machine learning results need to be stored where other business/data analysts work with them
Time to insight and time to action are now competitive differentiators for businesses

Bulk data adapters
Applications can use the bulk data adapter SDK to collect and write data - on-demand data loading
No need to copy CSV files to a UM or PM - simpler
Bypass the SQL interface, parser and optimizer - faster writes
SDK languages: C++, Python, Java
[Diagram: an application with a bulk data adapter, alongside the MariaDB Server / ColumnStore UM nodes, writes through the Write API directly to each ColumnStore PM]
1. For each row
   a. For each column: bulkInsert->setColumn
   b. bulkInsert->writeRow
2. bulkInsert->commit
* Buffers 100,000 rows by default
Deep dive session: Ingesting Data with the New Bulk Data Adapters, today at 5 pm
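The write loop on this slide can be sketched as below. `BufferedBulkInsert` is a hypothetical in-memory stand-in written for this illustration, not the SDK class. Only the method names `setColumn`, `writeRow`, and `commit` and the 100,000-row default come from the slide. The `sink` callable and the `load_rows` helper are assumptions.

```python
class BufferedBulkInsert:
    """Hypothetical stand-in for a bulk-insert handle; it only mimics the call pattern."""

    def __init__(self, column_count, sink, buffer_rows=100_000):
        self._column_count = column_count
        self._sink = sink                # callable that receives each flushed batch
        self._buffer_rows = buffer_rows  # the slide cites 100,000 rows by default
        self._row = [None] * column_count
        self._buffer = []

    def setColumn(self, index, value):
        """Stage one column value for the row currently being built."""
        self._row[index] = value
        return self

    def writeRow(self):
        """Finish the current row; flush automatically once the buffer is full."""
        self._buffer.append(tuple(self._row))
        self._row = [None] * self._column_count
        if len(self._buffer) >= self._buffer_rows:
            self._flush()
        return self

    def commit(self):
        """Send any rows still waiting in the buffer."""
        self._flush()

    def _flush(self):
        if self._buffer:
            self._sink(list(self._buffer))
            self._buffer.clear()


def load_rows(bulk_insert, rows):
    """The loop from the slide: set each column, write each row, commit once at the end."""
    for row in rows:
        for index, value in enumerate(row):
            bulk_insert.setColumn(index, value)
        bulk_insert.writeRow()
    bulk_insert.commit()
```

With a small `buffer_rows` value and a list's `append` method as the sink, the batching becomes visible: three rows with a buffer of two produce one full batch during the loop and one remaining row at commit.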

Streaming data adapters: MaxScale CDC
Stream writes from MariaDB TX to MariaDB AX automatically and continuously - ensure analytical data is up to date and not stale, with no need for batch jobs, manual processes or human intervention
[Diagram: MariaDB Server (InnoDB), MariaDB MaxScale with the Binlog-Avro CDC Router, Streaming Data Adapter (CDC Client), MariaDB Server / ColumnStore UM nodes]
Deep dive session: Real-time Analytics With The New Streaming Data Adapters, tomorrow at 8:40 am

Streaming data adapters: Apache Kafka
Stream all messages published to Apache Kafka topics to MariaDB AX automatically and continuously - enable data from many sources to be streamed and collected for analysis without complex code
[Diagram: Apache Kafka topics, Streaming Data Adapter (Kafka Client), MariaDB Server / ColumnStore UM nodes]
Deep dive session: Real-time Analytics With The New Streaming Data Adapters, tomorrow at 8:40 am

The big picture – putting it all together

[Diagram with three columns: Operations, Ingestion, Analytics]
Operations: Web/Mobile Services, MariaDB MaxScale, Transaction (OLTP) on MariaDB Server with InnoDB
Ingestion: Apache Kafka with Streaming Data Adapters; Data Services with Bulk Data Adapters; Spark / Python / ML with Bulk Data Adapters
Analytics: MariaDB MaxScale, Analytics (OLAP) on MariaDB ColumnStore

Resources
Reach me: dipti.joshi@mariadb.com
Download
MariaDB ColumnStore 1.1 https://mariadb.com/downloads/mariadb-ax
MariaDB MaxScale https://mariadb.com/downloads/mariadb-ax/maxscale
Bulk Data Adapters and Streaming Data Adapters https://mariadb.com/downloads/mariadb-ax/data-adapters
MariaDB ColumnStore Backup/Restore Tool https://mariadb.com/downloads/mariadb-ax/tools-ax
Documentation https://mariadb.com/kb/en/library/mariadb-columnstore/
Blogs https://mariadb.com/blog-tags/columnstore
https://mariadb.com/blog-tags/big-data

What’s new in MariaDB AX: Summary
Complex, custom analytics: user-defined aggregate functions, user-defined window functions
Text and binary columns
Spark integration: JDBC (SQL), direct (data adapter)
Improved HA/DR: GlusterFS support, parallel backup/restore
Streamlined data ingestion: bulk data adapters
