MariaDB AX on Containers
Thomas Boyd
Solutions Architect
Getting Started with Analytics the Easy Way
Agenda
• MariaDB AX
• Kubernetes
• MariaDB AX on Kubernetes
• Q & A
MariaDB AX
Analytics for the Agile Business
MariaDB AX
• GPLv2 Open Source
• Columnar, Massively Parallel MariaDB Storage Engine
• Scalable, high-performance analytics platform
• Built-in redundancy and high availability
• Runs on premises or on the AWS cloud
• Full SQL syntax and capabilities regardless of platform
Big Data Sources Analytics Insight
ELT Tools
MariaDB ColumnStore: Node 1, Node 2, Node 3 ... Node N
Local / AWS® / GlusterFS®
BI Tools
MariaDB AX Architecture
User Connections
User Module 1 ... User Module n (MariaDB Front End, Query Engine)
Performance Module 1, Performance Module 2 ... Performance Module n
Columnar Distributed Data Storage
User Module: Processes SQL Requests
Performance Module: Distributed Processing Engine
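As a rough illustration of how the two module types might divide the work, the sketch below models one average query in plain Python. Everything in it is hypothetical (the `PerformanceModule` class, the `user_module_avg` function, the sample data); it mirrors the split described on this slide and is not a description of ColumnStore's actual internals.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PerformanceModule:
    """Toy stand-in for a PM: holds one slice of a single column."""
    column_slice: List[int]

    def partial_aggregate(self) -> Tuple[int, int]:
        # Each PM scans only its own slice and returns (sum, row count).
        return sum(self.column_slice), len(self.column_slice)


def user_module_avg(pms: List[PerformanceModule]) -> float:
    """Toy stand-in for a UM: fan the request out, then merge the partials."""
    partials = [pm.partial_aggregate() for pm in pms]
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


# Hypothetical column of order amounts, spread over three PMs.
pms = [
    PerformanceModule([10, 20, 30]),
    PerformanceModule([40, 50]),
    PerformanceModule([60]),
]
average = user_module_avg(pms)
print(average)  # 35.0
```

Adding a PM in this toy model only changes how the column is sliced; the merge step in the UM stays the same.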
MAX RANK
MIN DENSE_RANK
COUNT PERCENT_RANK
SUM NTH_VALUE
AVG FIRST_VALUE
VARIANCE LAST_VALUE
VAR_POP CUME_DIST
VAR_SAMP LAG
STD LEAD
STDDEV NTILE
STDDEV_POP PERCENTILE_CONT
STDDEV_SAMP PERCENTILE_DISC
ROW_NUMBER MEDIAN
• Aggregate over a series of related rows
• Simplified function for complex statistical analytics over sliding window per row
- Cumulative, moving or centered aggregates
- Simple Statistical functions like rank, max, min, average, median
- More complex functions such as distribution, percentile, lag, lead
- Without running complex sub-queries
Windowing Functions
Source : InfiniDB SQL Syntax Guide
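A small sketch of what "without running complex sub-queries" can look like in practice. It uses three of the functions listed on this slide (`SUM`, `RANK`, `LAG`) with an `OVER` clause. As an assumption made for runnability, it executes against Python's bundled SQLite (window functions need SQLite 3.25 or newer) rather than MariaDB ColumnStore, and the `sales` table and its columns are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, day INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("east", 1, 10), ("east", 2, 30), ("east", 3, 20),
        ("west", 1, 5), ("west", 2, 15),
    ],
)

# One pass over the table: a cumulative sum, a rank, and the previous
# row's value, each computed per region without any sub-query.
query = """
SELECT region, day, amount,
       SUM(amount) OVER (PARTITION BY region ORDER BY day)     AS running_total,
       RANK()      OVER (PARTITION BY region ORDER BY amount DESC) AS amount_rank,
       LAG(amount) OVER (PARTITION BY region ORDER BY day)     AS previous_amount
FROM sales
ORDER BY region, day
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The same result written with plain aggregates would need a correlated sub-query or self-join per derived column.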
Data Export and Data Import
Bulk Data Load: cpimport, LOAD DATA INFILE
Bulk Data Export: mysql client, ODBC, JDBC
Integration with MariaDB ColumnStore cpimport and SQL interface
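A minimal sketch of preparing a bulk load file. cpimport reads delimited text; the pipe delimiter used here is, as far as the ColumnStore documentation describes, its default, but treat that as an assumption and confirm it for your version. The `orders` rows and the `to_pipe_delimited` helper are hypothetical.

```python
import csv
import io

# Hypothetical rows for a hypothetical table orders(id, region, amount_cents).
rows = [
    (1, "east", 1050),
    (2, "west", 725),
    (3, "east", 9900),
]


def to_pipe_delimited(records) -> str:
    """Render records as pipe-delimited text, one row per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="|", lineterminator="\n")
    writer.writerows(records)
    return buffer.getvalue()


load_file_text = to_pipe_delimited(rows)
print(load_file_text, end="")
```

Written to disk, a file like this would then be handed to cpimport together with the target database and table names.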
MariaDB AX
High-performance columnar storage engine that supports a wide variety of
analytical use cases with SQL in highly scalable distributed environments
Faster, More Efficient Queries: Parallel query processing for distributed environments
Easier Enterprise Analytics: Single SQL Interface for OLTP and analytics
Better Price Performance: Power of SQL and Freedom of Open Source to Big Data Analytics
Industry | Category | Use Case
Gaming | Behavior Analytics | Projecting and predicting user behavior based on past and current data.
Advertising | Customer Analytics | Customer behavior data for market segmentation and predictive analytics.
Advertising | Loyalty Analytics | Customer analytics focusing on a person’s commitment to a product, company, or brand.
Web, E-commerce | Click Stream Analytics | Web activity analysis, software testing, and market research using analytics on which areas of web pages users click while browsing [Deal News].
Marketing | Promotional Testing | Using marketing and campaign management data to identify the best criteria to be used for a particular marketing offer.
Social Network | Network Analytics | Relationship analytics among network nodes.
Financial | Fraud Analytics | Monitoring user financial transactions and identifying patterns of behaviour to predict and detect abnormal or fraudulent activity to prevent damage to user and institution.
Healthcare | Patient Analytics | Analyzing patient medical records to identify patterns to be used for improved medical treatment.
Healthcare | Clinical Analytics | Analyzing clinical data and its impact on patients to identify patterns to be used for improved medical treatment.
Telco | Network and Application Performance Analytics | Streaming data from network devices and applications enriched with business operations data to uncover actionable insights for network planning, operations and marketing analytics.
Aviation | Flight Analytics | Proactively project parts replacement, maintenance and airplane retirement based on real-time and historically collected flight parameter data [Boeing].
Customer Use Cases
Kubernetes
Container orchestration moving mainstream
But First: What do Containers give me?
Encapsulation of Dependencies
• O/S packages & Patches
• Execution environment (e.g. Python 2.7)
• Application Code & Dependencies
Process Isolation
• Isolate the process from anything else running
Faster, Lightweight virtualization
Virtual Machines vs. Containers
Virtual machines (top to bottom): App 1, App 2, App 3, each with its own Bins/Libs and Guest OS; Hypervisor; Host Operating System; Infrastructure
Containers (top to bottom): App 1, App 2, App 3, each with its own Bins/Libs; Docker Engine; Operating System; Infrastructure
What about orchestration and Management?
Orchestration and Management of Containers and higher-level
constructs (services, deployments, etc.) is evolving
Amazon ECS
Google Container Engine
Azure Container Service
Brilliant for Stateless Components
Source: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/
Containers + Distributed Database = Challenges
• Data Durability
– Ephemeral container storage
• Cluster Formation
– Configuration and Coordination of User Modules and Performance Modules
• Cluster Maintenance & Changes
– Planned and unplanned node failures
– Scale-up and scale-down
• Application Connections
– Application tier (analytics tools) should not need to change connection information when DB topology changes
MariaDB AX + Kubernetes
Getting started in Dev and Test Environments
Monthly AWS Bill Under-utilized Laptop
Motivation
● How to Test Complex, Scale-out Deployments
Aspirational
Practical
100% “Cloud Native”
“Desire for Kubernetes to enable low-friction porting of apps from VMs to containers”
Source: https://kubernetes.io/docs/concepts/cluster-administration/networking/
Minikube for Single Node Kubernetes
https://kubernetes.io/docs/getting-started-guides/minikube/
What kind of Experience is this going to be?
Minikube & Kubernetes Tips
• Do the tutorials (https://kubernetes.io/docs/tutorials/)
• Read (and re-read) the Documentation
• Visual cues (terminal colors) for layers
• Hypervisor selection
• VM location and size
• Recent version of Kubernetes (--kubernetes-version=v1.8.5 --bootstrapper kubeadm)
• Setting DOCKER env (eval $(minikube docker-env))
What about orchestration and Management?
Orchestration and Management of Containers and higher-level
constructs (services, deployments, etc.) is evolving
Amazon ECS
Google Container Engine
Azure Container Service
Kubernetes Components
[Diagram labels: kubectl; REST; kubernetes master(s); three nodes, each running kubelet and docker]
Kubernetes Objects
[Diagram labels: pods (containers, volumes, interfaces; mysql: 3306); services (port: 3306); controllers (spec); deployments (spec); relationships: "provide access to", "manage", "manage"]
Mimicking Virtual Machines
“If there exists a headless service in the same namespace as the pod and with the same name as the subdomain, the cluster’s KubeDNS Server also returns an A record for the Pod’s fully qualified hostname.”
Source: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/
Also: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
• Leverage KubeDNS Server for easy naming
• Host sshd daemon on each container
• Shared key for ssh as kubernetes secret
• Utilize StatefulSets
https://github.com/WonkyWumpus/easy-sshd-ubuntu-1604
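To make the naming rule above concrete, the sketch below builds the fully qualified hostnames that a StatefulSet behind a headless service ends up with. The pattern `<statefulset>-<ordinal>.<service>.<namespace>.svc.<cluster domain>` follows the Kubernetes documentation linked on this slide; the names `pm` and `pm-svc`, the `default` namespace, and the `cluster.local` domain are assumptions chosen for illustration.

```python
from typing import List


def statefulset_pod_fqdns(
    statefulset: str,
    headless_service: str,
    replicas: int,
    namespace: str = "default",
    cluster_domain: str = "cluster.local",
) -> List[str]:
    """Return the DNS names Kubernetes assigns to StatefulSet pods.

    Pods are named <statefulset>-<ordinal>, and the headless service
    gives each one an A record under <service>.<namespace>.svc.<domain>.
    """
    return [
        f"{statefulset}-{ordinal}.{headless_service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


# Hypothetical names: a "pm" StatefulSet behind a headless service "pm-svc".
for name in statefulset_pod_fqdns("pm", "pm-svc", replicas=3):
    print(name)
```

Because the ordinals are stable across pod restarts, these names can be used wherever a VM-based install would expect fixed hostnames.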
SSH with Ease
tboyd$ kubectl describe pod m01|grep IP
IP: 172.17.0.7
tboyd$ minikube ssh
$ ssh -i .ssh/easy-key root@172.17.0.7
Welcome to Ubuntu 16.04.3 LTS (GNU/Linux 4.9.13 x86_64)
root@m01:~# ssh root@m02
root@m02:~#
MariaDB AX with StatefulSets
• Builds on the easy-sshd github
• MariaDB AX software staged to images
• Manually run standard install process to create MariaDB AX Cluster
https://github.com/WonkyWumpus/mdb-cs-easy-sshd-ubuntu-1604
MariaDB AX Prereqs and Staging Software
MariaDB AX: UM & PM StatefulSets, UM Service
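The manifests on this slide did not survive as text, so the following is only a sketch of the general shape such objects could take, not the contents of the repository above. It builds a headless Service, a client-facing UM Service on port 3306, and UM and PM StatefulSets as Python dictionaries and prints them as JSON. The image name, labels, object names, and replica counts are all hypothetical. It also assumes `apps/v1` for StatefulSets, which is available from Kubernetes 1.9 onward; a v1.8.5 cluster would have needed a beta API group instead.

```python
import json


def service(name: str, selector: dict, port: int, headless: bool = False) -> dict:
    """Service manifest; headless=True sets clusterIP to "None" for per-pod DNS."""
    spec = {"selector": selector, "ports": [{"port": port}]}
    if headless:
        spec["clusterIP"] = "None"
    return {"apiVersion": "v1", "kind": "Service",
            "metadata": {"name": name}, "spec": spec}


def stateful_set(name: str, governing_service: str, image: str, replicas: int) -> dict:
    """StatefulSet whose pods are named <name>-0, <name>-1, ..."""
    labels = {"app": name, "cluster": "ax"}  # hypothetical labels
    return {
        "apiVersion": "apps/v1",  # assumption: Kubernetes 1.9 or newer
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": governing_service,
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


IMAGE = "example/mariadb-ax:dev"  # hypothetical image name
manifests = [
    # Headless service shared by both tiers so every pod gets a DNS name.
    service("ax", {"cluster": "ax"}, port=3306, headless=True),
    # Client-facing service: applications connect here, not to a pod.
    service("um-client", {"app": "um"}, port=3306),
    stateful_set("um", "ax", IMAGE, replicas=1),
    stateful_set("pm", "ax", IMAGE, replicas=2),
]
print(json.dumps(manifests[0], indent=2))
```

kubectl accepts JSON as well as YAML, so output like this could be saved to a file and applied as-is once the placeholder values are replaced.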
MariaDB AX: Standard Install & Config
• Beware: execute install from pm-0 container!
• Install .debs
• Run /usr/local/mariadb/columnstore/bin/postConfigure
• Access cluster through UM Service
https://mariadb.com/kb/en/library/installing-and-configuring-a-multi-server-columnstore-system-11x/
MariaDB AX + Kubernetes: Possible Future Directions
• Leverage persistent volumes and persistent volume claims
• Cluster formation and config moved into docker images
– MariaDB AX running and waiting to join cluster
– Intelligent entrypoint script for automatic cluster join
• User Module Tier automatic scaling
• Performance Module Tier automatic scaling
• Logic to tie DB Roots and Persistent External Storage
– 24 DB Roots: Instantaneously burst from 1 to 24 PM nodes!
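One possible reading of the last bullet: if the data is already split into 24 DB Roots on external storage, scaling out becomes a matter of reassigning DB Roots to PM nodes rather than moving data. The sketch below illustrates only that bookkeeping; the `assign_dbroots` function and the round-robin policy are hypothetical and are not taken from MariaDB AX.

```python
from typing import Dict, List


def assign_dbroots(dbroot_count: int, pm_count: int) -> Dict[int, List[int]]:
    """Spread DBRoot IDs 1..dbroot_count round-robin over PMs 1..pm_count."""
    if not 1 <= pm_count <= dbroot_count:
        raise ValueError("need between 1 and dbroot_count PM nodes")
    assignment: Dict[int, List[int]] = {pm: [] for pm in range(1, pm_count + 1)}
    for dbroot in range(1, dbroot_count + 1):
        pm = (dbroot - 1) % pm_count + 1
        assignment[pm].append(dbroot)
    return assignment


single = assign_dbroots(24, 1)   # one PM owns all 24 DB Roots
burst = assign_dbroots(24, 24)   # 24 PMs own one DB Root each
print(len(single[1]))
print(burst[24])
```

Any PM count between 1 and 24 works in this model, with at most one DB Root of difference between the busiest and the least busy node.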
Resources
• Kubernetes Documentation
• MariaDB ColumnStore Documentation
• MariaDB AX Datasheet
• IHME Customer Story
• What’s New in MariaDB AX
• 5 Simple Steps to get Started with MariaDB and Tableau
• Extract more Value with MariaDB ColumnStore Analytics
Questions?

M|18 Getting Started with Analytics: MariaDB AX + Kubernetes
