MemSQL 201: Advanced Tips & Tricks
Alec Powell, Solutions Engineer, MemSQL
January 2018
alec@memsql.com
Webinar Agenda
Rowstore vs Columnstore
Data Ingestion
Data Sharding & Query Tuning
Memory & Workload Management
Rowstore vs Columnstore
Making the most of MemSQL’s two storage models
Real-Time Pipelines, OLTP, and OLAP
[Diagram labels: Streaming Database; Real-time Pipelines; High Volume Transactions (OLTP); Fast, Scalable SQL Analytics (OLAP); Data Warehouse]
MemSQL Features Multiple Table Types
[Diagram labels: Streaming Database; In-Memory Rowstore; Memory and Disk Columnstore; Data Warehouse]
The Rowstore and Columnstore Span Memory to Disk
[Diagram labels: Streaming Database; In-Memory Rowstore (RAM); Memory and Disk Columnstore (RAM and SSDs); Relational, JSON, Key Value, Geospatial; Data Warehouse]
Both Table Types are Persistent
[Diagram labels: Streaming Database; In-Memory Rowstore (persists to SSD for durability); Memory and Disk Columnstore (SSDs and HDDs); Data Warehouse]
When to use Rowstore and Columnstore
In-Memory Rowstore | Flash, SSD or Disk-based Columnstore
Operational/transactional workloads | Analytical workloads
Single-record insert performance | Batched load performance
Random seek performance | Fast aggregations and table scans
Updates are frequent | Updates are rare
Any types of deletes | Deletes that remove a large number of rows
MemSQL allows joining rowstore and columnstore data in a single query
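The trade-off in the comparison above can be illustrated with a toy Python sketch. It is not how MemSQL stores data internally; the column names (order_id, supplier_id, quantity) and the helper functions are hypothetical and only show why a row layout suits point lookups while a column layout suits scans and aggregations.

```python
# Row layout: one record per row, indexed by key. Good for single-record access.
rows = [
    {"order_id": 1, "supplier_id": 10, "quantity": 5},
    {"order_id": 2, "supplier_id": 11, "quantity": 3},
    {"order_id": 3, "supplier_id": 10, "quantity": 7},
]
row_index = {r["order_id"]: r for r in rows}

# Column layout: one array per column. Good for scanning a single column.
columns = {name: [r[name] for r in rows] for name in rows[0]}


def lookup_order(order_id):
    """Point lookup: touches exactly one record in the row layout."""
    return row_index[order_id]


def total_quantity():
    """Aggregation: reads only the 'quantity' array in the column layout."""
    return sum(columns["quantity"])


print(lookup_order(2))
print(total_quantity())
```

A point lookup in the column layout would have to visit every column array, and an aggregation in the row layout would have to read every full record, which mirrors the rowstore and columnstore guidance above.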
Our star schema
Example Query
SELECT
dim_supplier.supplier_address,
SUM(fact_supply_order.quantity) AS quantity_sold
FROM
fact_supply_order
INNER JOIN dim_product ON fact_supply_order.product_id = dim_product.product_id
INNER JOIN dim_time ON fact_supply_order.time_id = dim_time.time_id
INNER JOIN dim_supplier ON fact_supply_order.supplier_id = dim_supplier.supplier_id
WHERE
dim_time.action_year = 2016
AND dim_supplier.city = 'Topeka'
AND dim_product.product_type = 'Aspirin'
GROUP BY
dim_supplier.supplier_id,
dim_supplier.supplier_address;
Columnstore sort key
memsql> CREATE TABLE fact_supply_order (
-> product_id INT PRIMARY KEY,
-> time_id INT,
-> supplier_id INT,
-> employee_id INT,
-> price DECIMAL(8,2),
-> quantity DECIMAL(8,2),
-> KEY (time_id, product_id, supplier_id)
-> USING CLUSTERED COLUMNSTORE);
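To illustrate why the columnstore sort key matters, here is a minimal Python sketch of segment elimination. It is an assumption-laden toy, not MemSQL's implementation: rows are sorted by the leading key column (time_id), cut into fixed-size segments, and each segment keeps min/max metadata so a range filter can skip segments that cannot match. The segment size, class, and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    """A block of rows plus min/max metadata for the sort-key column."""
    min_key: int
    max_key: int
    rows: List[Tuple[int, int]]  # (time_id, quantity)


def build_segments(rows, segment_size):
    """Sort rows by time_id and cut them into fixed-size segments."""
    ordered = sorted(rows, key=lambda r: r[0])
    segments = []
    for start in range(0, len(ordered), segment_size):
        chunk = ordered[start:start + segment_size]
        segments.append(Segment(chunk[0][0], chunk[-1][0], chunk))
    return segments


def sum_quantity(segments, lo, hi):
    """Sum quantity for lo <= time_id <= hi, skipping segments via metadata."""
    total, scanned, skipped = 0, 0, 0
    for seg in segments:
        if seg.max_key < lo or seg.min_key > hi:
            skipped += 1  # key range cannot overlap the filter, so rows are never read
            continue
        scanned += 1
        total += sum(q for t, q in seg.rows if lo <= t <= hi)
    return total, scanned, skipped


# Hypothetical data: time_id 0..99 with quantity 1 per row, 10 rows per segment.
rows = [(t, 1) for t in range(100)]
segments = build_segments(rows, segment_size=10)
total, scanned, skipped = sum_quantity(segments, lo=20, hi=29)
print(total, scanned, skipped)  # prints: 10 1 9
```

Because the data is sorted on time_id, a filter on that column reads one segment and skips the other nine; a filter on an unsorted column would have to scan every segment.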
Data Ingestion
Real-time data loading with MemSQL Pipelines
MemSQL Pipelines Simplifies Real-Time Data Pipelines
[Diagram labels: Streaming Database; Real-Time Pipelines; Rowstore; Columnstore; Data Warehouse]
Stream into the Rowstore or Columnstore
[Diagram labels: Streaming Database; Real-Time Pipelines streams directly into the Rowstore or the Columnstore; Data Warehouse]
Pipelines Enables Partition-Level Parallelism
[Diagram labels: Leaf 1; Leaf 2; Leaf 3; Leaf 4]
Loading our table using S3 Pipelines
memsql> CREATE PIPELINE orders_pipeline AS
-> LOAD DATA S3 "deloy.test/alec/orders-history"
-> CREDENTIALS '{redacted}'
-> SKIP ALL ERRORS
-> INTO TABLE fact_supply_order;
Query OK, (0.89 sec)
memsql> START PIPELINE orders_pipeline;
Query OK, (0.01 sec)
memsql> SELECT count(*) from fact_supply_order;
Sharding & Query Tuning
Understanding the distributed system
MemSQL has aggregator and leaf nodes
[Diagram labels: Master Aggregator; Aggregator; Leaf (x4)]
Database clients connect to aggregators
[Diagram labels: Database Client; Aggregator (x2); Leaf (x4), each with PARTITIONS]
Leaf nodes store and process data in partitions
[Diagram labels: Aggregator (x2); Leaf (x4), each with PARTITIONS]
Designing a Schema: Shard Keys
▪ Every distributed table has exactly one shard key
• A non-unique key is OK (e.g. SHARD KEY (id, click_id, user_id))
▪ The shard key determines the partition to which a row belongs
▪ If no shard key is specified, the PRIMARY KEY is used
▪ If there is no primary key either, the shard key is empty and rows are distributed randomly
▪ Equality on all shard key columns → single-partition query
▪ Most queries are not like this → they query all partitions
HASH("12345") % NUM_PARTITIONS = 17
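The routing rule on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: CRC32 stands in for MemSQL's actual hash function (so the partition numbers will not match the 17 shown above), the partition count of 32 is arbitrary, and partition_for and partitions_to_query are hypothetical helper names.

```python
import zlib

NUM_PARTITIONS = 32  # hypothetical partition count


def partition_for(shard_key_values, num_partitions=NUM_PARTITIONS):
    """Map a tuple of shard key column values to a partition index."""
    # CRC32 is a deterministic stand-in; the real hash function differs.
    encoded = "|".join(str(v) for v in shard_key_values).encode("utf-8")
    return zlib.crc32(encoded) % num_partitions


def partitions_to_query(equality_filters, shard_key_columns):
    """Return the partitions a query must visit, given its equality filters."""
    if shard_key_columns and all(c in equality_filters for c in shard_key_columns):
        key = tuple(equality_filters[c] for c in shard_key_columns)
        return [partition_for(key)]        # single-partition query
    return list(range(NUM_PARTITIONS))     # otherwise fan out to every partition


single = partitions_to_query({"id": 12345}, ["id"])
fanout = partitions_to_query({"user_id": 7}, ["id"])
print(len(single), len(fanout))  # prints: 1 32
```

Only a query that pins every shard key column with an equality filter can be routed to one partition; a filter on any other column, or an empty shard key, leaves the aggregator no choice but to ask all partitions.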
Fanout Queries
Great for Analytical Queries:
▪ Large aggregations
▪ Parallel processing
Single Partition Queries
Critical for Transactional Queries:
▪ Selecting single rows
▪ High concurrency
[Diagrams: Agg 1 and Agg 2 above Leaf 1 to Leaf 4, shown once for each query type]
Distributed Joins
memsql> select * from A join B where A.color = B.color
▪ Joins that neither match nor filter on the shard key cause network overhead
▪ Reshuffle vs Broadcast operators
• Reshuffle: re-shard the data of the smaller table (or result table) so that it evenly matches the sharding of the large table
• Broadcast: send the entire small table to the other nodes to complete the join
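The two operators can be compared with a small Python simulation of the A/B join on color. It is a sketch under assumptions, not MemSQL's execution engine: four nodes, a toy placement rule (node_for), a large table A already sharded on color, and a small table B sharded round-robin. All names and data are hypothetical.

```python
from collections import defaultdict

NUM_NODES = 4  # hypothetical number of leaf nodes


def node_for(color):
    """Toy placement rule: the node that owns a given join-key value."""
    return sum(color.encode("utf-8")) % NUM_NODES


def local_join(a_rows, b_rows):
    """Hash join of (color, payload) rows that already sit on one node."""
    index = defaultdict(list)
    for color, payload in b_rows:
        index[color].append(payload)
    return [(c, a, b) for c, a in a_rows for b in index.get(c, [])]


def broadcast_join(a_by_node, b_by_node):
    """Send every B row to every other node, then join locally on each node."""
    all_b = [row for rows in b_by_node for row in rows]
    rows_sent = len(all_b) * (NUM_NODES - 1)
    result = [r for a_rows in a_by_node for r in local_join(a_rows, all_b)]
    return sorted(result), rows_sent


def reshuffle_join(a_by_node, b_by_node):
    """Re-shard B on the join column so it lines up with A, then join locally."""
    b_new = [[] for _ in range(NUM_NODES)]
    rows_sent = 0
    for node, rows in enumerate(b_by_node):
        for row in rows:
            target = node_for(row[0])
            rows_sent += int(target != node)  # only rows that move cross the network
            b_new[target].append(row)
    result = [r for a_rows, b_rows in zip(a_by_node, b_new)
              for r in local_join(a_rows, b_rows)]
    return sorted(result), rows_sent


colors = ["red", "green", "blue", "black"]
# A is the large table, already sharded on color; B is small and sharded round-robin.
a_by_node = [[] for _ in range(NUM_NODES)]
for i, color in enumerate(colors * 3):
    a_by_node[node_for(color)].append((color, f"a{i}"))
b_by_node = [[(color, f"b{i}")] for i, color in enumerate(colors)]

bc_rows, bc_sent = broadcast_join(a_by_node, b_by_node)
rs_rows, rs_sent = reshuffle_join(a_by_node, b_by_node)
print(bc_rows == rs_rows, bc_sent, rs_sent)
```

Both strategies return the same joined rows; they differ only in how many rows cross the network, which is the overhead this slide refers to.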
How to eliminate the overhead of distributed joins?
▪ Match on shard key → local join
▪ Reference tables to the rescue
• Each row replicated to all nodes
• Best for small data sizes and a low number of updates
Our star schema
Reference tables
Query tuning: EXPLAIN and PROFILE
▪ EXPLAIN
• Prints the MemSQL optimizer’s query plan.
• All MemSQL operators for the query are shown here: TableScan, IndexSeek, HashJoin, Repartition, Broadcast, etc.
▪ PROFILE
• Runs the query based on the plan, timing each execution step
• SHOW PROFILE; prints query plan execution statistics (memory usage, execution time, rows scanned, segments skipped)
Query
EXPLAIN SELECT
dim_store.store_address,
SUM(fact_sales.quantity) AS quantity_sold
FROM
fact_sales
INNER JOIN dim_product ON fact_sales.product_id = dim_product.product_id
INNER JOIN dim_time ON fact_sales.time_id = dim_time.time_id
INNER JOIN dim_store ON fact_sales.store_id = dim_store.store_id
WHERE
dim_time.action_year = 2016
AND dim_store.city = 'Topeka'
AND dim_product.product_type = 'Aspirin'
GROUP BY
dim_store.store_id,
dim_store.store_address;
ANALYZE and OPTIMIZE
▪ ANALYZE TABLE
• Calculates table statistics
• Recommended after a significant increase or refresh of data
▪ OPTIMIZE TABLE [FULL | FLUSH]
• FULL: Sorts based on primary key (optimal index scans)
• FLUSH (Columnstore only): Flushes in-memory segment to disk
▪ Recommended periodically after large loads
Memory & Workload Management
Monitoring your MemSQL Deployment
Monitoring memory usage
memsql> SHOW STATUS EXTENDED;
memsql> SELECT database_name, table_name, SUM(rows) AS total_rows,
SUM(memory_use)/(1024*1024*1024) AS total_memory_gb,
SUM(memory_use) / SUM(rows) AS bytes_per_row
FROM information_schema.table_statistics
WHERE database_name = "memsql_webinar"
GROUP BY 1, 2 ORDER BY total_memory_gb DESC;
Monitoring workload with Management Views
• Set of tables in the information_schema database that are useful for troubleshooting query performance
• Shows resource usage of recent activities across all nodes in the MemSQL cluster
• Activities are categorized into Query, Database, and System
• Query: Application or person querying MemSQL
• Database: Replication activity, log flusher
• System: Garbage collector, read and execute loops
• Available in versions 5.8 and greater; a global variable must be set:
• read_advanced_counters = 'ON'
• memsql-ops memsql-update-config --set-global --key read_advanced_counters --value 'ON' --all
Management Views Tables
SHOW tables in information_schema like "MV_%";
Management Views Metrics
These metrics are available for each activity on the cluster:
▪ CPU Time
▪ CPU Wait Time
▪ Memory Bytes
▪ Disk Bytes (Read/Write)
▪ Network Bytes (Send/Receive)
▪ Lock Wait Time
▪ Disk Wait Time
▪ Network Wait Time
▪ Failure Time
What is the most frequent activity type on each node?
memsql> select node_id, activity_type, count(*)
from mv_activities_extended activities
inner join mv_nodes nodes on nodes.id = activities.node_id
group by 1, 2 order by 1, 3 DESC;
Which partitions are using the most memory?
memsql> select partition_id, sum(memory_bs)
from mv_activities_extended
where partition_id != "NULL"
group by 1 order by 2 DESC limit 5;
What query activities are using the most CPU?
memsql> select activities.cpu_time_ms, activities.activity_name,
LEFT(query.query_text,20)
from mv_activities activities inner join mv_queries query
on query.activity_name= activities.activity_name
order by cpu_time_ms DESC limit 5;
Thank you
memsql.com
Any other questions?
MemSQL Tech Office Hours
1/31 9am–5pm (PST)
https://calendly.com/alec-powell/30min/01-31-2018
MemSQL 201: Advanced Tips and Tricks Webcast

More Related Content

What's hot

ksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database SystemksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database Systemconfluent
 
A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides Altinity Ltd
 
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfDeep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfAltinity Ltd
 
Getting Started with Confluent Schema Registry
Getting Started with Confluent Schema RegistryGetting Started with Confluent Schema Registry
Getting Started with Confluent Schema Registryconfluent
 
All about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdfAll about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdfAltinity Ltd
 
A Deep Dive into Kafka Controller
A Deep Dive into Kafka ControllerA Deep Dive into Kafka Controller
A Deep Dive into Kafka Controllerconfluent
 
Airflow Clustering and High Availability
Airflow Clustering and High AvailabilityAirflow Clustering and High Availability
Airflow Clustering and High AvailabilityRobert Sanders
 
MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11
MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11
MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11Kenny Gryp
 
Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)
Introduction to ksqlDB and stream processing (Vish Srinivasan  - Confluent)Introduction to ksqlDB and stream processing (Vish Srinivasan  - Confluent)
Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)KafkaZone
 
Meetup: Streaming Data Pipeline Development
Meetup:  Streaming Data Pipeline DevelopmentMeetup:  Streaming Data Pipeline Development
Meetup: Streaming Data Pipeline DevelopmentTimothy Spann
 
A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ...
 A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ... A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ...
A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ...HostedbyConfluent
 
Capture the Streams of Database Changes
Capture the Streams of Database ChangesCapture the Streams of Database Changes
Capture the Streams of Database Changesconfluent
 
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...DataWorks Summit/Hadoop Summit
 
[Outdated] Secrets of Performance Tuning Java on Kubernetes
[Outdated] Secrets of Performance Tuning Java on Kubernetes[Outdated] Secrets of Performance Tuning Java on Kubernetes
[Outdated] Secrets of Performance Tuning Java on KubernetesBruno Borges
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiFlink Forward
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!Guido Schmutz
 
Kafka Connect - debezium
Kafka Connect - debeziumKafka Connect - debezium
Kafka Connect - debeziumKasun Don
 
Tuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxTuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxFlink Forward
 
CDC patterns in Apache Kafka®
CDC patterns in Apache Kafka®CDC patterns in Apache Kafka®
CDC patterns in Apache Kafka®confluent
 

What's hot (20)

ksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database SystemksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database System
 
A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides
 
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfDeep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
 
Getting Started with Confluent Schema Registry
Getting Started with Confluent Schema RegistryGetting Started with Confluent Schema Registry
Getting Started with Confluent Schema Registry
 
All about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdfAll about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdf
 
A Deep Dive into Kafka Controller
A Deep Dive into Kafka ControllerA Deep Dive into Kafka Controller
A Deep Dive into Kafka Controller
 
Airflow Clustering and High Availability
Airflow Clustering and High AvailabilityAirflow Clustering and High Availability
Airflow Clustering and High Availability
 
MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11
MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11
MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11
 
Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)
Introduction to ksqlDB and stream processing (Vish Srinivasan  - Confluent)Introduction to ksqlDB and stream processing (Vish Srinivasan  - Confluent)
Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)
 
Meetup: Streaming Data Pipeline Development
Meetup:  Streaming Data Pipeline DevelopmentMeetup:  Streaming Data Pipeline Development
Meetup: Streaming Data Pipeline Development
 
A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ...
 A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ... A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ...
A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ...
 
Capture the Streams of Database Changes
Capture the Streams of Database ChangesCapture the Streams of Database Changes
Capture the Streams of Database Changes
 
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
 
[Outdated] Secrets of Performance Tuning Java on Kubernetes
[Outdated] Secrets of Performance Tuning Java on Kubernetes[Outdated] Secrets of Performance Tuning Java on Kubernetes
[Outdated] Secrets of Performance Tuning Java on Kubernetes
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
 
Dpdk performance
Dpdk performanceDpdk performance
Dpdk performance
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!
 
Kafka Connect - debezium
Kafka Connect - debeziumKafka Connect - debezium
Kafka Connect - debezium
 
Tuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxTuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptx
 
CDC patterns in Apache Kafka®
CDC patterns in Apache Kafka®CDC patterns in Apache Kafka®
CDC patterns in Apache Kafka®
 

Similar to MemSQL 201: Advanced Tips and Tricks Webcast

Optimizing Your Cloud Applications in RightScale
Optimizing Your Cloud Applications in RightScaleOptimizing Your Cloud Applications in RightScale
Optimizing Your Cloud Applications in RightScaleRightScale
 
SQL Server 2008 Development for Programmers
SQL Server 2008 Development for ProgrammersSQL Server 2008 Development for Programmers
SQL Server 2008 Development for ProgrammersAdam Hutson
 
Modernizing your database with SQL Server 2019
Modernizing your database with SQL Server 2019Modernizing your database with SQL Server 2019
Modernizing your database with SQL Server 2019Antonios Chatzipavlis
 
Getting to Know MySQL Enterprise Monitor
Getting to Know MySQL Enterprise MonitorGetting to Know MySQL Enterprise Monitor
Getting to Know MySQL Enterprise MonitorMark Leith
 
Best Practices for Supercharging Cloud Analytics on Amazon Redshift
Best Practices for Supercharging Cloud Analytics on Amazon RedshiftBest Practices for Supercharging Cloud Analytics on Amazon Redshift
Best Practices for Supercharging Cloud Analytics on Amazon RedshiftSnapLogic
 
SQL Server 2014 Mission Critical Performance - Level 300 Presentation
SQL Server 2014 Mission Critical Performance - Level 300 PresentationSQL Server 2014 Mission Critical Performance - Level 300 Presentation
SQL Server 2014 Mission Critical Performance - Level 300 PresentationDavid J Rosenthal
 
Sql Server Performance Tuning
Sql Server Performance TuningSql Server Performance Tuning
Sql Server Performance TuningBala Subra
 
Aioug vizag oracle12c_new_features
Aioug vizag oracle12c_new_featuresAioug vizag oracle12c_new_features
Aioug vizag oracle12c_new_featuresAiougVizagChapter
 
Presentación Oracle Database Migración consideraciones 10g/11g/12c
Presentación Oracle Database Migración consideraciones 10g/11g/12cPresentación Oracle Database Migración consideraciones 10g/11g/12c
Presentación Oracle Database Migración consideraciones 10g/11g/12cRonald Francisco Vargas Quesada
 
Performance Tuning And Optimization Microsoft SQL Database
Performance Tuning And Optimization Microsoft SQL DatabasePerformance Tuning And Optimization Microsoft SQL Database
Performance Tuning And Optimization Microsoft SQL DatabaseTung Nguyen Thanh
 
Ms sql server architecture
Ms sql server architectureMs sql server architecture
Ms sql server architectureAjeet Singh
 
Sql server performance tuning
Sql server performance tuningSql server performance tuning
Sql server performance tuningJugal Shah
 
Informix partitioning interval_rolling_window_table
Informix partitioning interval_rolling_window_tableInformix partitioning interval_rolling_window_table
Informix partitioning interval_rolling_window_tableKeshav Murthy
 
Geek Sync I Need for Speed: In-Memory Databases in Oracle and SQL Server
Geek Sync I Need for Speed: In-Memory Databases in Oracle and SQL ServerGeek Sync I Need for Speed: In-Memory Databases in Oracle and SQL Server
Geek Sync I Need for Speed: In-Memory Databases in Oracle and SQL ServerIDERA Software
 
Novedades SQL Server 2014
Novedades SQL Server 2014Novedades SQL Server 2014
Novedades SQL Server 2014netmind
 
Expert summit SQL Server 2016
Expert summit   SQL Server 2016Expert summit   SQL Server 2016
Expert summit SQL Server 2016Łukasz Grala
 
R12 d49656 gc10-apps dba 07
R12 d49656 gc10-apps dba 07R12 d49656 gc10-apps dba 07
R12 d49656 gc10-apps dba 07zeesniper
 

Similar to MemSQL 201: Advanced Tips and Tricks Webcast (20)

Optimizing Your Cloud Applications in RightScale
Optimizing Your Cloud Applications in RightScaleOptimizing Your Cloud Applications in RightScale
Optimizing Your Cloud Applications in RightScale
 
SQL Server 2008 Development for Programmers
SQL Server 2008 Development for ProgrammersSQL Server 2008 Development for Programmers
SQL Server 2008 Development for Programmers
 
Modernizing your database with SQL Server 2019
Modernizing your database with SQL Server 2019Modernizing your database with SQL Server 2019
Modernizing your database with SQL Server 2019
 
Getting to Know MySQL Enterprise Monitor
Getting to Know MySQL Enterprise MonitorGetting to Know MySQL Enterprise Monitor
Getting to Know MySQL Enterprise Monitor
 
Best Practices for Supercharging Cloud Analytics on Amazon Redshift
Best Practices for Supercharging Cloud Analytics on Amazon RedshiftBest Practices for Supercharging Cloud Analytics on Amazon Redshift
Best Practices for Supercharging Cloud Analytics on Amazon Redshift
 
Sql Server
Sql ServerSql Server
Sql Server
 
SQL Server 2014 Mission Critical Performance - Level 300 Presentation
SQL Server 2014 Mission Critical Performance - Level 300 PresentationSQL Server 2014 Mission Critical Performance - Level 300 Presentation
SQL Server 2014 Mission Critical Performance - Level 300 Presentation
 
Sql Server Performance Tuning
Sql Server Performance TuningSql Server Performance Tuning
Sql Server Performance Tuning
 
Aioug vizag oracle12c_new_features
Aioug vizag oracle12c_new_featuresAioug vizag oracle12c_new_features
Aioug vizag oracle12c_new_features
 
Presentación Oracle Database Migración consideraciones 10g/11g/12c
Presentación Oracle Database Migración consideraciones 10g/11g/12cPresentación Oracle Database Migración consideraciones 10g/11g/12c
Presentación Oracle Database Migración consideraciones 10g/11g/12c
 
Performance Tuning And Optimization Microsoft SQL Database
Performance Tuning And Optimization Microsoft SQL DatabasePerformance Tuning And Optimization Microsoft SQL Database
Performance Tuning And Optimization Microsoft SQL Database
 
Ms sql server architecture
Ms sql server architectureMs sql server architecture
Ms sql server architecture
 
Sql server performance tuning
Sql server performance tuningSql server performance tuning
Sql server performance tuning
 
Informix partitioning interval_rolling_window_table
Informix partitioning interval_rolling_window_tableInformix partitioning interval_rolling_window_table
Informix partitioning interval_rolling_window_table
 
Geek Sync I Need for Speed: In-Memory Databases in Oracle and SQL Server
Geek Sync I Need for Speed: In-Memory Databases in Oracle and SQL ServerGeek Sync I Need for Speed: In-Memory Databases in Oracle and SQL Server
Geek Sync I Need for Speed: In-Memory Databases in Oracle and SQL Server
 
Novedades SQL Server 2014
Novedades SQL Server 2014Novedades SQL Server 2014
Novedades SQL Server 2014
 
Expert summit SQL Server 2016
Expert summit   SQL Server 2016Expert summit   SQL Server 2016
Expert summit SQL Server 2016
 
Performance Tuning
Performance TuningPerformance Tuning
Performance Tuning
 
R12 d49656 gc10-apps dba 07
R12 d49656 gc10-apps dba 07R12 d49656 gc10-apps dba 07
R12 d49656 gc10-apps dba 07
 
Mysql For Developers
Mysql For DevelopersMysql For Developers
Mysql For Developers
 

More from SingleStore

Five ways database modernization simplifies your data life
Five ways database modernization simplifies your data lifeFive ways database modernization simplifies your data life
Five ways database modernization simplifies your data lifeSingleStore
 
How Kafka and Modern Databases Benefit Apps and Analytics
How Kafka and Modern Databases Benefit Apps and AnalyticsHow Kafka and Modern Databases Benefit Apps and Analytics
How Kafka and Modern Databases Benefit Apps and AnalyticsSingleStore
 
Architecting Data in the AWS Ecosystem
Architecting Data in the AWS EcosystemArchitecting Data in the AWS Ecosystem
Architecting Data in the AWS EcosystemSingleStore
 
Building the Foundation for a Latency-Free Life
Building the Foundation for a Latency-Free LifeBuilding the Foundation for a Latency-Free Life
Building the Foundation for a Latency-Free LifeSingleStore
 
Converging Database Transactions and Analytics
Converging Database Transactions and Analytics Converging Database Transactions and Analytics
Converging Database Transactions and Analytics SingleStore
 
Building a Machine Learning Recommendation Engine in SQL
Building a Machine Learning Recommendation Engine in SQLBuilding a Machine Learning Recommendation Engine in SQL
Building a Machine Learning Recommendation Engine in SQLSingleStore
 
Introduction to MemSQL
Introduction to MemSQLIntroduction to MemSQL
Introduction to MemSQLSingleStore
 
An Engineering Approach to Database Evaluations
An Engineering Approach to Database EvaluationsAn Engineering Approach to Database Evaluations
An Engineering Approach to Database EvaluationsSingleStore
 
Building a Fault Tolerant Distributed Architecture
Building a Fault Tolerant Distributed ArchitectureBuilding a Fault Tolerant Distributed Architecture
Building a Fault Tolerant Distributed ArchitectureSingleStore
 
Stream Processing with Pipelines and Stored Procedures
Stream Processing with Pipelines  and Stored ProceduresStream Processing with Pipelines  and Stored Procedures
Stream Processing with Pipelines and Stored ProceduresSingleStore
 
Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017SingleStore
 
Image Recognition on Streaming Data
Image Recognition  on Streaming DataImage Recognition  on Streaming Data
Image Recognition on Streaming DataSingleStore
 
Spark Summit Dublin 2017 - MemSQL - Real-Time Image Recognition
Spark Summit Dublin 2017 - MemSQL - Real-Time Image RecognitionSpark Summit Dublin 2017 - MemSQL - Real-Time Image Recognition
Spark Summit Dublin 2017 - MemSQL - Real-Time Image RecognitionSingleStore
 
The State of the Data Warehouse in 2017 and Beyond
The State of the Data Warehouse in 2017 and BeyondThe State of the Data Warehouse in 2017 and Beyond
The State of the Data Warehouse in 2017 and BeyondSingleStore
 
How Database Convergence Impacts the Coming Decades of Data Management
How Database Convergence Impacts the Coming Decades of Data ManagementHow Database Convergence Impacts the Coming Decades of Data Management
How Database Convergence Impacts the Coming Decades of Data ManagementSingleStore
 
Teaching Databases to Learn in the World of AI
Teaching Databases to Learn in the World of AITeaching Databases to Learn in the World of AI
Teaching Databases to Learn in the World of AISingleStore
 
Gartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid Cloud
Gartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid CloudGartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid Cloud
Gartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid CloudSingleStore
 
Gartner Catalyst 2017: Image Recognition on Streaming Data
Gartner Catalyst 2017: Image Recognition on Streaming DataGartner Catalyst 2017: Image Recognition on Streaming Data
Gartner Catalyst 2017: Image Recognition on Streaming DataSingleStore
 
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and SparkSpark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and SparkSingleStore
 
Real-Time Analytics at Uber Scale
Real-Time Analytics at Uber ScaleReal-Time Analytics at Uber Scale
Real-Time Analytics at Uber ScaleSingleStore
 

More from SingleStore (20)

Five ways database modernization simplifies your data life
Five ways database modernization simplifies your data lifeFive ways database modernization simplifies your data life
Five ways database modernization simplifies your data life
 
How Kafka and Modern Databases Benefit Apps and Analytics
How Kafka and Modern Databases Benefit Apps and AnalyticsHow Kafka and Modern Databases Benefit Apps and Analytics
How Kafka and Modern Databases Benefit Apps and Analytics
 
Architecting Data in the AWS Ecosystem
Architecting Data in the AWS EcosystemArchitecting Data in the AWS Ecosystem
Architecting Data in the AWS Ecosystem
 
Building the Foundation for a Latency-Free Life
Building the Foundation for a Latency-Free LifeBuilding the Foundation for a Latency-Free Life
Building the Foundation for a Latency-Free Life
 
Converging Database Transactions and Analytics
Converging Database Transactions and Analytics Converging Database Transactions and Analytics
Converging Database Transactions and Analytics
 
Building a Machine Learning Recommendation Engine in SQL
Building a Machine Learning Recommendation Engine in SQLBuilding a Machine Learning Recommendation Engine in SQL
Building a Machine Learning Recommendation Engine in SQL
 
Introduction to MemSQL
Introduction to MemSQLIntroduction to MemSQL
Introduction to MemSQL
 
An Engineering Approach to Database Evaluations
An Engineering Approach to Database EvaluationsAn Engineering Approach to Database Evaluations
An Engineering Approach to Database Evaluations
 
Building a Fault Tolerant Distributed Architecture
Building a Fault Tolerant Distributed ArchitectureBuilding a Fault Tolerant Distributed Architecture
Building a Fault Tolerant Distributed Architecture
 
Stream Processing with Pipelines and Stored Procedures
Stream Processing with Pipelines  and Stored ProceduresStream Processing with Pipelines  and Stored Procedures
Stream Processing with Pipelines and Stored Procedures
 
Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017
 
Image Recognition on Streaming Data
Image Recognition  on Streaming DataImage Recognition  on Streaming Data
Image Recognition on Streaming Data
 
Spark Summit Dublin 2017 - MemSQL - Real-Time Image Recognition
Spark Summit Dublin 2017 - MemSQL - Real-Time Image RecognitionSpark Summit Dublin 2017 - MemSQL - Real-Time Image Recognition
Spark Summit Dublin 2017 - MemSQL - Real-Time Image Recognition
 
The State of the Data Warehouse in 2017 and Beyond
The State of the Data Warehouse in 2017 and BeyondThe State of the Data Warehouse in 2017 and Beyond
The State of the Data Warehouse in 2017 and Beyond
 
How Database Convergence Impacts the Coming Decades of Data Management
How Database Convergence Impacts the Coming Decades of Data ManagementHow Database Convergence Impacts the Coming Decades of Data Management
How Database Convergence Impacts the Coming Decades of Data Management
 
Teaching Databases to Learn in the World of AI
Teaching Databases to Learn in the World of AITeaching Databases to Learn in the World of AI
Teaching Databases to Learn in the World of AI
 
Gartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid Cloud
Gartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid CloudGartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid Cloud
Gartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid Cloud
 
Gartner Catalyst 2017: Image Recognition on Streaming Data
Gartner Catalyst 2017: Image Recognition on Streaming DataGartner Catalyst 2017: Image Recognition on Streaming Data
Gartner Catalyst 2017: Image Recognition on Streaming Data
 
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and SparkSpark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
 
Real-Time Analytics at Uber Scale
Real-Time Analytics at Uber ScaleReal-Time Analytics at Uber Scale
Real-Time Analytics at Uber Scale
 



MemSQL 201: Advanced Tips and Tricks Webcast

  • 1. MemSQL 201: Advanced Tips & Tricks Alec Powell, Solutions Engineer, MemSQL January 2018 alec@memsql.com
  • 2. Webinar Agenda Rowstore vs Columnstore Data Ingestion Data Sharding & Query Tuning Memory & Workload Management
  • 3. Rowstore vs Columnstore Making the most of MemSQL’s two storage models
  • 4. Real-Time Pipelines, OLTP, and OLAP: Streaming Database (Real-time Pipelines); OLTP (High Volume Transactions); OLAP (Fast, Scalable SQL Analytics); Data Warehouse
  • 5. MemSQL Features Multiple Table Types: In-Memory Rowstore; Memory and Disk Columnstore (diagram labels: Streaming Database, Data Warehouse)
  • 6. The Rowstore and Columnstore Span Memory to Disk: In-Memory Rowstore (RAM); Memory and Disk Columnstore (RAM and SSDs); data models: Relational, JSON, Key Value, Geospatial (diagram labels: Streaming Database, Data Warehouse)
  • 7. Both Table Types are Persistent: In-Memory Rowstore (persists to SSD for durability); Memory and Disk Columnstore (SSDs and HDDs) (diagram labels: Streaming Database, Data Warehouse)
  • 8. When to use Rowstore and Columnstore
      In-Memory Rowstore | Flash, SSD or Disk-based Columnstore
      Operational/transactional workloads | Analytical workloads
      Single-record insert performance | Batched load performance
      Random seek performance | Fast aggregations and table scans
      Updates are frequent | Updates are rare
      Any types of deletes | Deletes that remove large # of rows
      MemSQL allows joining rowstore and columnstore data in a single query.
  • 10. Example Query
      SELECT
        dim_supplier.supplier_address,
        SUM(fact_supply_order.quantity) AS quantity_sold
      FROM
        fact_supply_order
        INNER JOIN dim_product ON fact_supply_order.product_id = dim_product.product_id
        INNER JOIN dim_time ON fact_supply_order.time_id = dim_time.time_id
        INNER JOIN dim_supplier ON fact_supply_order.supplier_id = dim_supplier.supplier_id
      WHERE
        dim_time.action_year = 2016
        AND dim_supplier.city = 'Topeka'
        AND dim_product.product_type = 'Aspirin'
      GROUP BY
        dim_supplier.supplier_id,
        dim_supplier.supplier_address;
  • 11. Columnstore sort key
      memsql> CREATE TABLE fact_supply_order (
           ->   product_id INT PRIMARY KEY,
           ->   time_id INT,
           ->   supplier_id INT,
           ->   employee_id INT,
           ->   price DECIMAL(8,2),
           ->   quantity DECIMAL(8,2),
           ->   KEY (time_id, product_id, supplier_id)
           ->   USING CLUSTERED COLUMNSTORE);
  • 12.
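The columnstore sort key on slide 11 matters because data kept in sort-key order can be split into segments whose value ranges barely overlap, so a range filter can skip most segments (slide 28 lists "segments skipped" among the PROFILE statistics). The following is a minimal Python sketch of that idea, not MemSQL's implementation: the `Segment` class, the `rows_per_segment` parameter, and the min/max bookkeeping are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# One row of the (hypothetical) sort key: (time_id, product_id, supplier_id).
Row = Tuple[int, int, int]


@dataclass
class Segment:
    rows: List[Row]
    min_time_id: int  # smallest time_id stored in this segment
    max_time_id: int  # largest time_id stored in this segment


def build_segments(rows: List[Row], rows_per_segment: int) -> List[Segment]:
    """Sort rows by the key, then cut them into fixed-size segments."""
    ordered = sorted(rows)
    segments = []
    for start in range(0, len(ordered), rows_per_segment):
        chunk = ordered[start:start + rows_per_segment]
        # Because the chunk is sorted, its first and last rows hold the range.
        segments.append(Segment(chunk, chunk[0][0], chunk[-1][0]))
    return segments


def scan_time_range(segments: List[Segment], lo: int, hi: int):
    """Return (matching rows, segments skipped) for lo <= time_id <= hi."""
    matches: List[Row] = []
    skipped = 0
    for seg in segments:
        # The min/max metadata proves this segment cannot contain a match.
        if seg.max_time_id < lo or seg.min_time_id > hi:
            skipped += 1
            continue
        matches.extend(r for r in seg.rows if lo <= r[0] <= hi)
    return matches, skipped
```

With 100 rows whose `time_id` runs from 0 to 99 and 10 rows per segment, a filter on `time_id` between 20 and 29 only has to read one of the ten segments. If the rows were not sorted on `time_id`, every segment's range would likely overlap the filter and nothing could be skipped.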
  • 13. Data Ingestion Real-time data loading with MemSQL Pipelines
  • 14. MemSQL Pipelines Simplifies Real-Time Data Pipelines (diagram labels: Streaming Database, Real-Time Pipelines, Rowstore, Columnstore, Data Warehouse)
  • 15. Stream into the Rowstore or Columnstore: Real-Time Pipelines streams directly into the Rowstore or the Columnstore (diagram labels: Streaming Database, Rowstore, Columnstore, Data Warehouse)
  • 16. Pipelines enables partition-level Parallelism Leaf 1 Leaf 2 Leaf 3 Leaf 4
  • 17. Loading our table using S3 Pipelines
      memsql> CREATE PIPELINE orders_pipeline AS
           ->   LOAD DATA S3 "deloy.test/alec/orders-history"
           ->   CREDENTIALS '{redacted}'
           ->   SKIP ALL ERRORS
           ->   INTO TABLE fact_supply_order;
      Query OK, (0.89 sec)
      memsql> START PIPELINE orders_pipeline;
      Query OK, (0.01 sec)
      memsql> SELECT count(*) from fact_supply_order;
  • 18. Sharding & Query Tuning Understanding the distributed system
  • 19. MemSQL has aggregator and leaf nodes (diagram: a Master Aggregator, an Aggregator, and four Leaf nodes)
  • 20. Database clients connect to aggregators (diagram: a Database Client, two Aggregators, and four Leaf nodes, each holding PARTITIONS)
  • 21. Leaf nodes store and process data in partitions (diagram: two Aggregators and four Leaf nodes, each holding PARTITIONS)
  • 22. Designing a Schema: Shard Keys
      ▪ Every distributed table has 1 shard key
          • Non-unique key OK (e.g. SHARD KEY (id, click_id, user_id))
      ▪ Determines the partition to which a row belongs
      ▪ If not specified, PRIMARY KEY is used.
      ▪ If no primary key, it will be empty (i.e. rows are distributed randomly).
      ▪ Equality on all shard key columns → single partition query
      ▪ Most queries are not like this → query all partitions
      Example: HASH("12345") % NUM_PARTITIONS = 17
  • 23. Fanout Queries vs Single Partition Queries
      Fanout Queries: great for analytical queries
          ▪ Large aggregations
          ▪ Parallel processing
      Single Partition Queries: critical for transactional queries
          ▪ Selecting single rows
          ▪ High concurrency
      (each diagram: Agg 1, Agg 2, Leaf 1, Leaf 2, Leaf 3, Leaf 4)
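The routing rule on slides 22 and 23 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not MemSQL's code: `zlib.crc32` stands in for whatever hash function the engine actually uses, and `NUM_PARTITIONS`, `partition_for`, and `partitions_to_query` are hypothetical names. The shard key columns are taken from the example on slide 22.

```python
import zlib
from typing import Dict, List, Sequence

NUM_PARTITIONS = 32  # hypothetical cluster-wide partition count
SHARD_KEY = ("id", "click_id", "user_id")  # example key from slide 22


def partition_for(values: Sequence[object],
                  num_partitions: int = NUM_PARTITIONS) -> int:
    """HASH(shard key values) % NUM_PARTITIONS, with crc32 as a stand-in hash."""
    encoded = "|".join(str(v) for v in values).encode("utf-8")
    return zlib.crc32(encoded) % num_partitions


def partitions_to_query(equality_filters: Dict[str, object],
                        shard_key: Sequence[str] = SHARD_KEY,
                        num_partitions: int = NUM_PARTITIONS) -> List[int]:
    """Decide which partitions a query must touch.

    Equality on every shard key column pins the query to one partition;
    anything less fans out to all partitions.
    """
    if all(col in equality_filters for col in shard_key):
        values = [equality_filters[col] for col in shard_key]
        return [partition_for(values, num_partitions)]
    return list(range(num_partitions))
```

A filter such as `{"id": 1, "click_id": 2, "user_id": 3}` yields a single partition, while `{"id": 1}` alone yields all 32, which is why a shard key should match the equality predicates of the most frequent transactional queries.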
  • 24. Distributed Joins memsql> select * from A join B where A.color = B.color
  • 25. Distributed Joins
      ▪ Queries with joins that do not match or filter on the shard key will cause network overhead
      ▪ Reshuffle vs Broadcast operators
          • Reshuffle: re-shard the data of the smaller table (or result table) to evenly match the large table
          • Broadcast: send the entire small table to the other nodes to complete the join.
  • 26. How to eliminate the overhead of distributed joins?
      ▪ Match on shard key → local join
      ▪ Reference tables to the rescue
          • Each row replicated to all nodes
          • Small data sizes, low # updates
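The difference between the two operators on slide 25 can be made concrete with a toy simulation. This is a rough Python sketch, not MemSQL's executor: partitions are plain lists, one partition stands in for one node, `zlib.crc32` is a stand-in hash, and all function names are hypothetical. It assumes table A is already sharded on the join column while table B is sharded on some other column.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple

RowDict = Dict[str, object]


def bucket(value: object, num_partitions: int) -> int:
    """Stand-in hash: map a value to a partition number."""
    return zlib.crc32(str(value).encode("utf-8")) % num_partitions


def shard(rows: List[RowDict], key: str, num_partitions: int) -> List[List[RowDict]]:
    """Distribute rows across partitions by hashing one column."""
    parts: List[List[RowDict]] = [[] for _ in range(num_partitions)]
    for row in rows:
        parts[bucket(row[key], num_partitions)].append(row)
    return parts


def local_join(a_rows, b_rows, key) -> List[Tuple[object, object]]:
    """Hash join of two row lists that already sit on the same partition."""
    index = defaultdict(list)
    for b in b_rows:
        index[b[key]].append(b)
    return [(a["id"], b["id"]) for a in a_rows for b in index.get(a[key], [])]


def broadcast_join(a_parts, b_parts, key):
    """Copy all of B to every partition, then join locally."""
    whole_b = [row for part in b_parts for row in part]
    # Every B row must be shipped to each partition it does not live on.
    rows_sent = len(whole_b) * (len(a_parts) - 1)
    result = []
    for a_rows in a_parts:
        result.extend(local_join(a_rows, whole_b, key))
    return sorted(result), rows_sent


def reshuffle_join(a_parts, b_parts, key):
    """Re-shard B on the join column so it lines up with A, then join locally."""
    n = len(a_parts)
    reshuffled: List[List[RowDict]] = [[] for _ in range(n)]
    rows_moved = 0
    for src, part in enumerate(b_parts):
        for row in part:
            dst = bucket(row[key], n)
            reshuffled[dst].append(row)
            if dst != src:
                rows_moved += 1
    result = []
    for a_rows, b_rows in zip(a_parts, reshuffled):
        result.extend(local_join(a_rows, b_rows, key))
    return sorted(result), rows_moved
```

Both strategies return the same join result. In this toy model, reshuffling moves each row of B at most once, whereas broadcasting sends every row of B to every other partition, so broadcast only pays off when B is small. When both tables are already sharded on the join column, neither step is needed and the join is purely local.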
  • 28. Query tuning: EXPLAIN and PROFILE
      ▪ EXPLAIN
          • Prints the MemSQL optimizer’s query plan.
          • All MemSQL operators for the query are here: TableScan, IndexSeek, HashJoin, Repartition, Broadcast, etc.
      ▪ PROFILE
          • Runs the query based on plan, timing each execution step
          • SHOW PROFILE; prints output of query plan execution statistics (memory usage, execution time, rows scanned, segments skipped)
  • 29. Query
      EXPLAIN SELECT
        dim_store.store_address,
        SUM(fact_sales.quantity) AS quantity_sold
      FROM
        fact_sales
        INNER JOIN dim_product ON fact_sales.product_id = dim_product.product_id
        INNER JOIN dim_time ON fact_sales.time_id = dim_time.time_id
        INNER JOIN dim_store ON fact_sales.store_id = dim_store.store_id
      WHERE
        dim_time.action_year = 2016
        AND dim_store.city = 'Topeka'
        AND dim_product.product_type = 'Aspirin'
      GROUP BY
        dim_store.store_id,
        dim_store.store_address;
  • 30. ANALYZE and OPTIMIZE
      ▪ ANALYZE TABLE
          • Calculates table statistics
          • Recommended after significant increase/refresh of data
      ▪ OPTIMIZE TABLE [FULL | FLUSH]
          • FULL: Sorts based on primary key (optimal index scans)
          • FLUSH (Columnstore only): Flushes in-memory segment to disk
      ▪ Recommended periodically after large loads
  • 31. Memory & Workload Management Monitoring your MemSQL Deployment
  • 32. Monitoring memory usage
      memsql> SHOW STATUS EXTENDED;
      memsql> SELECT
                database_name,
                table_name,
                SUM(rows) AS total_rows,
                SUM(memory_use)/(1024*1024*1024) AS total_memory_gb,
                SUM(memory_use) / SUM(rows) AS bytes_per_row
              FROM information_schema.table_statistics
              WHERE database_name = "memsql_webinar"
              GROUP BY 1, 2
              ORDER BY total_memory_gb DESC;
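The arithmetic in the slide 32 query can be reproduced outside the database, for example when post-processing exported statistics. The sketch below is a plain Python equivalent under the assumption that each input dict mirrors one row of `information_schema.table_statistics` with the four columns the query uses; the `summarize` function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

GIB = 1024 * 1024 * 1024  # bytes per GiB, the divisor used in the query


def summarize(stats: Iterable[Dict[str, object]]) -> List[Dict[str, object]]:
    """Aggregate per-partition statistics to one summary row per table."""
    totals = defaultdict(lambda: {"rows": 0, "memory_use": 0})
    for s in stats:
        key = (s["database_name"], s["table_name"])  # GROUP BY 1, 2
        totals[key]["rows"] += s["rows"]
        totals[key]["memory_use"] += s["memory_use"]

    summary = []
    for (database_name, table_name), t in totals.items():
        summary.append({
            "database_name": database_name,
            "table_name": table_name,
            "total_rows": t["rows"],
            "total_memory_gb": t["memory_use"] / GIB,
            # Guard against empty tables, where SQL would return NULL.
            "bytes_per_row": t["memory_use"] / t["rows"] if t["rows"] else None,
        })
    # ORDER BY total_memory_gb DESC
    summary.sort(key=lambda r: r["total_memory_gb"], reverse=True)
    return summary
```

A high `bytes_per_row` on a rowstore table is a hint that wide columns or many indexes dominate memory, which is the kind of table worth reviewing first.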
  • 33. Monitoring workload with Management Views
      ▪ Set of tables in information_schema database that are useful for troubleshooting query performance
      ▪ Shows resource usage of recent activities across all nodes in MemSQL cluster
      ▪ Activities are categorized into Query, Database, System
          • Query: Application or Person querying MemSQL
          • Database: Replication Activity, Log Flusher
          • System: Garbage Collector, Read and Execute Loops
      ▪ Available in Versions 5.8 and greater - must set a global variable
          • read_advanced_counters = 'ON'
          • memsql-ops memsql-update-config --set-global --key read_advanced_counters --value 'ON' --all
  • 34. Management Views Tables SHOW tables in information_schema like "MV_%";
  • 35. Management Views Metrics These metrics are available for each activity on the cluster: ▪ CPU Time ▪ CPU Wait Time ▪ Memory Bytes ▪ Disk Bytes (Read/Write) ▪ Network Bytes (Send/Receive) ▪ Lock Wait Time ▪ Disk Wait Time ▪ Network Wait Time ▪ Failure Time
  • 36. What is the most frequent activity type on each node? memsql> select node_id, activity_type, count(*) from mv_activities_extended activities inner join mv_nodes nodes on nodes.id = activities.node_id group by 1, 2 order by 2 DESC;
  • 37. Which partitions are using the most memory? memsql> select partition_id, sum(memory_bs) from mv_activities_extended where partition_id != "NULL" group by 1 order by 2 DESC limit 5;
  • 38. What query activities are using the most CPU? memsql> select activities.cpu_time_ms, activities.activity_name, LEFT(query.query_text,20) from mv_activities activities inner join mv_queries query on query.activity_name= activities.activity_name order by cpu_time_ms DESC limit 5;
  • 40. Any other questions? MemSQL Tech Office Hours 1/31 9am–5pm (PST) https://calendly.com/alec-powell/30min/01-31-2018