Building High-Performance MySQL Query Systems and Analytic Applications
Robin Schumacher
Agenda
What are we talking about?
Reporting and Business Intelligence DB’s
Data Warehouses/Marts/Analytic DB’s
[Diagram labels: Operational Source Data (OLTP, Files/XML, Log Files); ETL; Staging Area (Staging or ODS); Final ETL; Data Warehouse; Purge/Archive; Warehouse Archive; Reporting, BI, Notification Layer (Ad-Hoc, Dashboards, Reports, Notifications); Users; Data Warehouse and Metadata Management]
Reporting Databases
[Diagram labels: End Users; Application Servers; OLTP Database; Read Shard One; Reporting Database; ETL; Data Archiving; Link; Replication]
Application Sharding / Partitioning
Read Sharding / Partitioning
What are the core rules to follow in order to avoid anxiety over building fast read-intensive, reporting, and analytic databases?
#1 Only Read the Data You Need
#2 Exploit Modern Hardware
#3 Divide and Conquer
#4 Scale both I/O and User Connections
#5 Provide Transparent Expansion and Failover
#6 Load New Data with Minimal Impact
#7 Quickly Troubleshoot Poor Read Performance
Good suggestions, but how can I practically do all these things…?
What is Calpont’s InfiniDB?
InfiniDB is an open source, column-oriented database architected to handle data warehouses, data marts, analytic/BI systems, and other read-intensive applications. It delivers true scale-up (more CPUs/cores, RAM) and massively parallel processing (MPP) scale-out capabilities for MySQL users. Linear performance gains are achieved either by adding more capacity to one box or by using commodity machines in a scale-out configuration.
[Diagram labels: Scale Up; Scale Out]
#1 Only Read the Data You Need
Recommendation: Start using a column-oriented database.
Caveat: If you are reading all (select *) or most of the columns in a table, then a column database may not be right for your application.
Column vs. Row Orientation
A column-oriented architecture looks the same on the surface, but stores data differently from legacy/row-based databases…
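The difference in storage layout can be illustrated with a minimal Python sketch. It is not InfiniDB's storage format; the table, its column names, and the "values touched" counter are hypothetical and exist only to show why an aggregate over one column reads less data in a column layout.

```python
# Hypothetical three-column table; names and values are illustrative only.
rows = [
    {"id": 1, "region": "east", "amount": 10.0},
    {"id": 2, "region": "west", "amount": 25.5},
    {"id": 3, "region": "east", "amount": 4.5},
]

# Row orientation: each record is stored together, so a scan passes over every field.
row_store = [tuple(r.values()) for r in rows]

# Column orientation: each column is stored contiguously and separately.
column_store = {name: [r[name] for r in rows] for name in rows[0]}


def sum_amount_row_store(store):
    """Sum the third field and count every field value the scan passes over."""
    values_touched = 0
    total = 0.0
    for record in store:
        values_touched += len(record)
        total += record[2]
    return total, values_touched


def sum_amount_column_store(store):
    """Sum the 'amount' column; only that column's values are read."""
    column = store["amount"]
    return sum(column), len(column)


if __name__ == "__main__":
    print(sum_amount_row_store(row_store))
    print(sum_amount_column_store(column_store))
```

Both functions return the same total, but the column version touches one value per row instead of one value per field. With a query that selects every column, that advantage disappears, which is the caveat attached to rule #1.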
#2 Exploit Modern Hardware
Recommendation: Use databases/storage engines that scale up (i.e., use available CPUs/cores).
InfiniDB Community – Scale Up
InfiniDB Community edition is a FOSS, multi-threaded database server that is capable of using a machine’s CPUs/cores to process queries.

SSB Query (@100 scale)   InfiniDB 1 core (elapsed s)   InfiniDB 8 cores (elapsed s)   Overall percent reduction with additional cores
Q2.1                     210.21                        44.65                          79%
Q2.2                     151.20                        19.70                          87%
Q2.3                     121.33                        15.94                          87%
Q3.1                     316.79                        55.04                          83%
Q3.2                     164.12                        22.14                          87%
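The percent-reduction column can be recomputed from the two elapsed-time columns. The sketch below assumes the reduction is defined as 1 minus the ratio of 8-core time to 1-core time, rounded to a whole percent; the slide does not state the formula, but this definition reproduces its figures.

```python
# Elapsed times in seconds from the SSB @100 results on this slide:
# query -> (1 core, 8 cores)
elapsed = {
    "Q2.1": (210.21, 44.65),
    "Q2.2": (151.20, 19.70),
    "Q2.3": (121.33, 15.94),
    "Q3.1": (316.79, 55.04),
    "Q3.2": (164.12, 22.14),
}


def percent_reduction(before, after):
    """Percentage of elapsed time saved, rounded to a whole percent (assumed definition)."""
    return round(100 * (1 - after / before))


reductions = {
    query: percent_reduction(one_core, eight_cores)
    for query, (one_core, eight_cores) in elapsed.items()
}

if __name__ == "__main__":
    for query, pct in reductions.items():
        print(f"{query}: {pct}%")
```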
#3 Divide and Conquer
Recommendation: Use scale-out in addition to scale-up.
InfiniDB Enterprise – Scale Up and Out
[Diagram labels: User Connections; User Module 1 … User Module n; Performance Module 1, Performance Module 2 … Performance Module n; Shared Storage (Database files, System Catalog)]
#3 Divide and Conquer

SSB Query @1000   1 PM (elapsed s)   2 PM (elapsed s)   4 PM (elapsed s)   8 PM (elapsed s)   Overall percent reduction from 1 to 8 PMs
Q2.1              531.34             261.35             129.90             68.21              87%
Q2.2              430.25             214.87             106.37             56.41              87%
Q2.3              386.66             192.03             96.03              51.36              87%
Q3.1              848.79             425.25             316.50             134.21             84%
Q3.2              597.97             297.46             148.49             77.74              87%
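One way to read these scale-out results is as speedup and scaling efficiency relative to a single Performance Module (PM). The sketch below uses the elapsed times from this slide. The efficiency measure (speedup divided by PM count, where 1.0 means perfectly linear scaling) is an assumption added for illustration and is not a metric reported in the deck.

```python
# Number of Performance Modules in each measured configuration.
PM_COUNTS = (1, 2, 4, 8)

# Elapsed times in seconds from the SSB @1000 results on this slide,
# ordered to match PM_COUNTS.
elapsed = {
    "Q2.1": (531.34, 261.35, 129.90, 68.21),
    "Q2.2": (430.25, 214.87, 106.37, 56.41),
    "Q2.3": (386.66, 192.03, 96.03, 51.36),
    "Q3.1": (848.79, 425.25, 316.50, 134.21),
    "Q3.2": (597.97, 297.46, 148.49, 77.74),
}


def speedups(times):
    """Speedup of each configuration relative to the 1 PM run."""
    baseline = times[0]
    return [baseline / t for t in times]


def efficiencies(times, counts=PM_COUNTS):
    """Speedup divided by PM count; 1.0 would be perfectly linear scaling."""
    return [s / n for s, n in zip(speedups(times), counts)]


if __name__ == "__main__":
    for query, times in elapsed.items():
        eight_pm_speedup = speedups(times)[-1]
        eight_pm_efficiency = efficiencies(times)[-1]
        print(f"{query}: {eight_pm_speedup:.2f}x on 8 PMs, efficiency {eight_pm_efficiency:.2f}")
```

On these figures most queries stay close to linear at 8 PMs, while Q3.1 falls further below it, which matches its lower percent reduction.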
#4 Scale both I/O and User Connections
Recommendation: Use modular architecture.
[Diagram labels: User Connections; User Module 1 … User Module n; Performance Module 1, Performance Module 2 … Performance Module n; Shared Storage (Database files, System Catalog)]
Add more Performance Modules to scale I/O.
Add more User Modules to scale concurrency.
#5 Provide Transparent Expansion and Failover
Recommendation: Use either replication or MPP.
#5 Provide Transparent Expansion and Failover
[Diagram labels: Sharding Architecture; Browsers; Web/App Servers; shards Cust_id 1-999, Cust_id 1000-1999, Cust_id 2000-2999; MySQL Replication]
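The range-based sharding in this diagram can be sketched as a small routing function in the application tier. The Cust_id ranges come from the slide. The shard names and the `shard_for` helper are hypothetical, since the deck does not show routing code.

```python
import bisect

# Inclusive upper bound of each Cust_id range on the slide:
# 1-999, 1000-1999, 2000-2999.
SHARD_UPPER_BOUNDS = [999, 1999, 2999]

# Hypothetical shard names; the slide does not name the servers.
SHARD_NAMES = ["shard_a", "shard_b", "shard_c"]


def shard_for(cust_id: int) -> str:
    """Return the shard whose Cust_id range contains cust_id."""
    if cust_id < 1 or cust_id > SHARD_UPPER_BOUNDS[-1]:
        raise ValueError(f"cust_id {cust_id} is outside every shard range")
    # bisect_left finds the first range whose upper bound is >= cust_id.
    index = bisect.bisect_left(SHARD_UPPER_BOUNDS, cust_id)
    return SHARD_NAMES[index]


if __name__ == "__main__":
    for cust_id in (1, 999, 1000, 2500):
        print(cust_id, "->", shard_for(cust_id))
```

Adding a shard means extending both lists and, in a real deployment, moving the affected rows. The application therefore has to know about any expansion, which is why this style of sharding is not transparent to it.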
#5 Provide Transparent Expansion and Failover
[Diagram labels: User Connections; User Module 1 … User Module n; Performance Module 1, Performance Module 2 … Performance Module n; Shared Storage (Database files, System Catalog)]
If one Performance Module fails, traffic resumes with the remaining nodes.
User queries can be redirected to other User Modules if one fails.
#6 Load New Data with Minimal Impact
Recommendation: Use a two-step ETL feed with non-blocking load utilities and/or an MVCC database engine.
#6 Load New Data with Minimal Impact
[Diagram labels: Operational Source Data (OLTP, Files/XML, Log Files); ETL; Staging Area (Staging or ODS); High-speed Load Utility; Data Warehouse; Ad-Hoc, Dashboards, Reports, Notifications; Users; Data Warehouse and Metadata Management]
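The two-step feed can be sketched with Python's built-in sqlite3 module standing in for the staging area and the warehouse. This is an illustration of the pattern only and does not use InfiniDB's load utility. The table names, columns, and cleaning rule are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical staging and warehouse tables; staged amounts arrive as raw text.
conn.execute("CREATE TABLE staging_sales (sale_id INTEGER, amount TEXT)")
conn.execute("CREATE TABLE warehouse_sales (sale_id INTEGER PRIMARY KEY, amount REAL)")


def load_staging(connection, raw_rows):
    """Step 1: land raw source rows in staging without touching the warehouse."""
    with connection:  # commits on success, rolls back on error
        connection.executemany("INSERT INTO staging_sales VALUES (?, ?)", raw_rows)


def publish_to_warehouse(connection):
    """Step 2: clean the staged rows and move them to the warehouse in one transaction."""
    with connection:
        connection.execute(
            "INSERT INTO warehouse_sales (sale_id, amount) "
            "SELECT sale_id, CAST(amount AS REAL) FROM staging_sales "
            "WHERE amount IS NOT NULL AND TRIM(amount) <> ''"
        )
        connection.execute("DELETE FROM staging_sales")
    return connection.execute("SELECT COUNT(*) FROM warehouse_sales").fetchone()[0]


load_staging(conn, [(1, "10.50"), (2, ""), (3, "7")])
rows_in_warehouse = publish_to_warehouse(conn)

if __name__ == "__main__":
    print(rows_in_warehouse)
    print(conn.execute("SELECT * FROM warehouse_sales ORDER BY sale_id").fetchall())
```

Keeping the cleaning work in staging means the warehouse only sees one short bulk insert, so readers are exposed to the load for as little time as possible.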
#7 Quickly Troubleshoot Poor Read Performance
Recommendation: Proactively use load testing; reactively use SQL analysis and tracing.
InfiniDB Extent Map – No Indexing Needed

Extent Map
Col1   Ext 1   Min 1       Max 100
       Ext 2   Min 101     Max 200
       Ext 3   Min 201     Max 300
       Ext 4   Min 301     Max 400
Col2   Ext 1   Min 100     Max 10000
       Ext 2   Min 10100   Max 20000
       Ext 3   Min 20100   Max 30000
       Ext 4   Min 30100   Max 40000

If a column WHERE filter of “COL1 BETWEEN 220 AND 250 AND COL2 < 10000” is specified, InfiniDB will eliminate extents 1, 2 and 4 from the first column filter, then, looking at just the matching extents for COL2 (i.e. just extent 3), it will determine that no extents match and return zero rows without doing any I/O at all.
The Extent Map also enables logical range partitioning of data…
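The elimination logic in this example can be reproduced with a small Python sketch over the min/max values from the slide. It is not InfiniDB's implementation. The `Extent` class and helper names are hypothetical, and the sketch assumes that extent n of COL1 and extent n of COL2 cover the same rows, as the slide's example implies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    number: int
    min_value: int
    max_value: int


# Min/max values per extent, as given in the extent map on this slide.
COL1 = [Extent(1, 1, 100), Extent(2, 101, 200), Extent(3, 201, 300), Extent(4, 301, 400)]
COL2 = [
    Extent(1, 100, 10000),
    Extent(2, 10100, 20000),
    Extent(3, 20100, 30000),
    Extent(4, 30100, 40000),
]


def extents_for_between(extents, low, high):
    """Extent numbers whose [min, max] range overlaps [low, high]."""
    return {e.number for e in extents if e.max_value >= low and e.min_value <= high}


def extents_for_less_than(extents, bound):
    """Extent numbers that could hold a value strictly below bound."""
    return {e.number for e in extents if e.min_value < bound}


def candidate_extents():
    """Evaluate COL1 BETWEEN 220 AND 250 AND COL2 < 10000 against the extent map only."""
    col1_hits = extents_for_between(COL1, 220, 250)
    col2_hits = extents_for_less_than(COL2, 10000)
    # A row must satisfy both filters, so only extents in both sets need to be read.
    return col1_hits, col2_hits, col1_hits & col2_hits


if __name__ == "__main__":
    print(candidate_extents())
```

The COL1 filter leaves only extent 3 and the COL2 filter leaves only extent 1, so the intersection is empty and no data blocks need to be read.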
Summary

Recommendation: Only read the data you need
  General technique: Use column database
  InfiniDB: Is column-oriented
Recommendation: Exploit modern hardware
  General technique: Use DB’s/storage engines that are multi-threaded
  InfiniDB: Is multi-threaded and uses multiple CPUs / Cores
Recommendation: Divide and Conquer
  General technique: Spread load via replication or MPP
  InfiniDB: Supports MPP scale out
Recommendation: Scale concurrency and I/O
  General technique: Application partition
  InfiniDB: Modular architecture for scaling both concurrency and I/O
Recommendation: Provide transparent expansion and failover
  General technique: Use replication and load balancers
  InfiniDB: Does transparent failover for I/O and manual for connectivity
Recommendation: Load data with minimal impact
  General technique: Use two-step ETL and bulk load process
  InfiniDB: Has high-speed loader with no blocking and MVCC
Recommendation: Method for troubleshooting poor read performance
  General technique: Use load testing and SQL analysis tools
  InfiniDB: Provides both diagnostic and tracing tools; no major design tuning efforts
Calpont Solutions

Calpont Analytic Database Server Editions
  InfiniDB Community Server: Column-Oriented; Multi-threaded; Terabyte Capable; Single Server
  InfiniDB Enterprise Server: Scale out / Parallel Processing; Automatic Failover
Calpont Analytic Database Solutions
  InfiniDB Enterprise Solution: Monitoring; 24x7 Support; Auto Patch Management; Alerts & SNMP Notifications; Hot Fix Builds; Consultative Help
InfiniDB Community & Enterprise Server Comparison

Core Database Server Features                                              InfiniDB Community   InfiniDB Enterprise
Column-oriented                                                            Yes                  Yes
No indexing necessary                                                      Yes                  Yes
Automatic vertical (column) and logical horizontal partitioning of data    Yes                  Yes
MVCC support – snapshot read (readers don’t block writers)                 Yes                  Yes
Alter Table with online add column capability                              Yes                  Yes
High concurrency supported                                                 Yes                  Yes
Terabyte database capable                                                  Yes                  Yes
Crash-recovery                                                             Yes                  Yes
Multi-threaded engine (queries/writes will use all CPU’s/cores on box)     Yes                  Yes
High-Speed bulk loader w/ no blocking queries while loading                Yes                  Yes
Logical data compression                                                   Yes                  Yes
MySQL front end                                                            Yes                  Yes
Transaction support (ACID compliant)                                       Yes                  Yes
INSERT/UPDATE/DELETE (DML) support                                         Yes                  Yes
Support                                                                    Forums Only          Formal Production Support
Multi-Node, MPP scale out capable w/ failover                              No                   Yes
For More Information
www.infinidb.org
Building High-Performance MySQL Query Systems and Analytic Applications Thanks…!

 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 

Building High-Performance MySQL Query Systems and Analytic Applications

  • 1. Building High-Performance MySQL Query Systems and Analytic Applications Robin Schumacher
  • 5. Data Warehouses/Marts/Analytic DB’s OLTP Files/XML Log Files Operational Source Data Staging or ODS ETL Final ETL Reporting, BI, Notification Layer Ad-Hoc Dashboards Reports Notifications Users Staging Area Data Warehouse Warehouse Archive Purge/Archive Data Warehouse and Metadata Management
  • 6. Reporting Databases OLTP Database Read Shard One Reporting Database Application Servers End Users ETL Data Archiving Link Replication
  • 8. Read Sharding / Partitioning
  • 9. What are the core rules to follow in order to avoid anxiety over building fast read-intensive, reporting, and analytic databases?
  • 17. Good suggestions, but how can I practically do all these things…?
  • 18. What is Calpont’s InfiniDB? InfiniDB is an open source, column-oriented database architected to handle data warehouses, data marts, analytic/BI systems, and other read-intensive applications. It delivers true scale-up (more CPUs/cores, RAM) and massively parallel processing (MPP) scale-out capabilities for MySQL users. Linear performance gains are achieved when adding either more capabilities to one box or using commodity machines in a scale-out configuration. [Diagram: Scale Up; Scale Out]
  • 20. Column vs. Row Orientation A column-oriented architecture looks the same on the surface, but stores data differently than legacy/row-based databases…
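The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration of the general idea, not InfiniDB's storage format; the table, its column names, and the helper functions are hypothetical.

```python
# Hypothetical three-column table used only for illustration.
rows = [
    (1, "Alice", 120.0),
    (2, "Bob", 75.5),
    (3, "Carol", 310.25),
]

# Row-oriented layout: the fields of each record are stored together.
row_store = list(rows)

# Column-oriented layout: the values of each column are stored together.
column_store = {
    "id": [r[0] for r in rows],
    "name": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}


def total_amount_row_store(store):
    # Every full record is visited even though only one field is needed.
    return sum(record[2] for record in store)


def total_amount_column_store(store):
    # Only the "amount" column is visited; "id" and "name" are never read.
    return sum(store["amount"])


print(total_amount_row_store(row_store))
print(total_amount_column_store(column_store))
```

Both functions return the same total; the column layout simply avoids touching the columns the query does not ask for, which is the point of rule #1 (only read the data you need).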
  • 22. InfiniDB Community – Scale Up. InfiniDB Community edition is a FOSS, multi-threaded database server that is capable of using a machine’s CPUs/cores to process queries.

      SSB Query (@100 scale) | InfiniDB 1 core (elapsed time in seconds) | InfiniDB 8 cores (elapsed time in seconds) | Overall percent reduction with additional cores
      Q2.1 | 210.21 | 44.65 | 79%
      Q2.2 | 151.20 | 19.70 | 87%
      Q2.3 | 121.33 | 15.94 | 87%
      Q3.1 | 316.79 | 55.04 | 83%
      Q3.2 | 164.12 | 22.14 | 87%
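The percent-reduction column on slide 22 appears to be derived from the two elapsed times as (1 - eight_core_time / one_core_time) × 100, rounded to a whole percent. A short Python check under that assumption, using the timings from the slide (the function and variable names are illustrative):

```python
# Elapsed times in seconds from slide 22: (1 core, 8 cores).
timings = {
    "Q2.1": (210.21, 44.65),
    "Q2.2": (151.20, 19.70),
    "Q2.3": (121.33, 15.94),
    "Q3.1": (316.79, 55.04),
    "Q3.2": (164.12, 22.14),
}


def percent_reduction(one_core, eight_cores):
    # Share of the single-core elapsed time saved by using eight cores.
    return round((1 - eight_cores / one_core) * 100)


reductions = {
    query: percent_reduction(one_core, eight_cores)
    for query, (one_core, eight_cores) in timings.items()
}

for query, pct in reductions.items():
    print(f"{query}: {pct}%")
```

Under this assumed formula, the computed values agree with the percentages printed on the slide.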
  • 24. InfiniDB Enterprise – Scale Up and Out. [Diagram labels: User Connections; User Module 1 … User Module n; Performance Module 1, Performance Module 2 … Performance Module n; Shared Storage (database files, System Catalog)]
  • 26. #4 Scale both I/O and User Connections. Recommendation: Use modular architecture. Add more Performance Modules to scale I/O. Add more User Modules to scale concurrency. [Diagram labels: User Connections; User Module 1 … User Module n; Performance Module 1, Performance Module 2 … Performance Module n; Shared Storage (database files, System Catalog)]
  • 28. #5 Provide Transparent Expansion and Failover. Sharding Architecture. [Diagram labels: Browsers; Web/App Servers; MySQL Replication; shards for Cust_id 1-999, Cust_id 1000-1999, Cust_id 2000-2999]
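In a sharding architecture like the one on slide 28, the application has to decide which shard holds a given customer. A minimal Python sketch of range-based routing follows; the Cust_id ranges come from the slide, while the shard names and the function are hypothetical.

```python
import bisect

# Inclusive upper bound of each shard's Cust_id range, taken from the slide:
# 1-999, 1000-1999, 2000-2999.
SHARD_UPPER_BOUNDS = [999, 1999, 2999]

# Hypothetical shard identifiers; a real system would map these to hosts.
SHARD_NAMES = ["shard_1", "shard_2", "shard_3"]


def shard_for_customer(cust_id):
    """Return the name of the shard whose range contains cust_id."""
    if cust_id < 1 or cust_id > SHARD_UPPER_BOUNDS[-1]:
        raise ValueError(f"Cust_id {cust_id} is outside every shard range")
    # bisect_left finds the first upper bound that is >= cust_id.
    return SHARD_NAMES[bisect.bisect_left(SHARD_UPPER_BOUNDS, cust_id)]


print(shard_for_customer(1500))
```

Because this routing lives in the application, adding a shard means changing the bounds and moving data, which is why the deck contrasts this approach with transparent expansion.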
  • 29. #5 Provide Transparent Expansion and Failover. If one Performance Module fails, traffic resumes with the remaining nodes. User queries can be redirected to other User Modules if one fails. [Diagram labels: User Connections; User Module 1 … User Module n; Performance Module 1, Performance Module 2 … Performance Module n; Shared Storage (database files, System Catalog)]
  • 31. #6 Load New Data with Minimal Impact OLTP Files/XML Log Files Operational Source Data Staging or ODS ETL High-speed Load Utility Ad-Hoc Dashboards Reports Notifications Users Staging Area Data Warehouse Data Warehouse and Metadata Management
  • 33. InfiniDB Extent Map – No Indexing Needed. If a column WHERE filter of “COL1 BETWEEN 220 AND 250 AND COL2 < 10000” is specified, InfiniDB will eliminate extents 1, 2 and 4 from the first column filter, then, looking at just the matching extents for COL2 (i.e. just extent 3), it will determine that no extents match and return zero rows without doing any I/O at all. The Extent Map also enables logical range partitioning of data…

      Extent | Col1 Min | Col1 Max | Col2 Min | Col2 Max
      Ext 1 | 1 | 100 | 100 | 10000
      Ext 2 | 101 | 200 | 10100 | 20000
      Ext 3 | 201 | 300 | 20100 | 30000
      Ext 4 | 301 | 400 | 30100 | 40000
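The extent elimination described on slide 33 can be sketched in Python. This is a simplified model of the idea, not InfiniDB's implementation: the min/max values are the ones shown on the slide, and the dictionary layout and function names are assumptions made for illustration.

```python
# Min/max value per extent for each column, as shown on slide 33.
EXTENT_MAP = {
    "COL1": {1: (1, 100), 2: (101, 200), 3: (201, 300), 4: (301, 400)},
    "COL2": {1: (100, 10000), 2: (10100, 20000),
             3: (20100, 30000), 4: (30100, 40000)},
}


def extents_between(column, low, high, candidates=None):
    """Extents that may hold a value v with low <= v <= high."""
    extents = EXTENT_MAP[column]
    ids = extents.keys() if candidates is None else candidates
    return {e for e in ids if extents[e][0] <= high and extents[e][1] >= low}


def extents_less_than(column, bound, candidates=None):
    """Extents that may hold a value v with v < bound."""
    extents = EXTENT_MAP[column]
    ids = extents.keys() if candidates is None else candidates
    return {e for e in ids if extents[e][0] < bound}


# WHERE COL1 BETWEEN 220 AND 250 AND COL2 < 10000
after_col1 = extents_between("COL1", 220, 250)
after_col2 = extents_less_than("COL2", 10000, candidates=after_col1)

print(after_col1)  # only extent 3 survives the COL1 filter
print(after_col2)  # no extent survives, so no data blocks need to be read
```

The second filter is evaluated only against extents that survived the first one, so the query is answered from metadata alone, which matches the "zero rows without doing any I/O" outcome on the slide.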
  • 34. Summary

      Recommendation | General Technique | InfiniDB
      Only read the data you need | Use column database | Is column-oriented
      Exploit modern hardware | Use DBs/storage engines that are multi-threaded | Is multi-threaded and uses multiple CPUs/cores
      Divide and conquer | Spread load via replication or MPP | Supports MPP scale out
      Scale concurrency and I/O | Application partition | Modular architecture for scaling both concurrency and I/O
      Provide transparent expansion and failover | Use replication and load balancers | Does transparent failover for I/O and manual for connectivity
      Load data with minimal impact | Use two-step ETL and bulk load process | Has high-speed loader with no blocking and MVCC
      Method for troubleshooting poor read performance | Use load testing and SQL analysis tools | Provides both diagnostic and tracing tools; no major design tuning efforts
  • 35. Calpont Solutions

      Calpont Analytic Database Server Editions:
      InfiniDB Community Server: Column-Oriented; Multi-threaded; Terabyte Capable; Single Server
      InfiniDB Enterprise Server: Scale out / Parallel Processing; Automatic Failover

      Calpont Analytic Database Solutions:
      InfiniDB Enterprise Solution: Monitoring; 24x7 Support; Auto Patch Management; Alerts & SNMP Notifications; Hot Fix Builds; Consultative Help
  • 36. InfiniDB Community & Enterprise Server Comparison

      Core Database Server Features | InfiniDB Community | InfiniDB Enterprise
      Column-oriented | Yes | Yes
      No indexing necessary | Yes | Yes
      Automatic vertical (column) and logical horizontal partitioning of data | Yes | Yes
      MVCC support – snapshot read (readers don’t block writers) | Yes | Yes
      Alter Table with online add column capability | Yes | Yes
      High concurrency supported | Yes | Yes
      Terabyte database capable | Yes | Yes
      Crash-recovery | Yes | Yes
      Multi-threaded engine (queries/writes will use all CPUs/cores on box) | Yes | Yes
      High-speed bulk loader w/ no blocking queries while loading | Yes | Yes
      Logical data compression | Yes | Yes
      MySQL front end | Yes | Yes
      Transaction support (ACID compliant) | Yes | Yes
      INSERT/UPDATE/DELETE (DML) support | Yes | Yes
      Support | Forums Only | Formal Production Support
      Multi-Node, MPP scale out capable w/ failover | No | Yes
  • 38. Building High-Performance MySQL Query Systems and Analytic Applications Thanks…!