Practical MySQL indexing guidelines


FOSDEM, Brussels, February 5th, 2012
Stéphane Combaudon
stephane.combaudon@dailymotion.com
Agenda

   Introduction
   Bad indexes & performance drops
   Guidelines for efficient indexing
   Tools and methods to improve index usage




Introduction



Goals

   Having fun with indexes!!!

   Getting rid of the trial-and-error approach

   Knowing the performance penalty of bad indexes

   Being productive
       Knowing simple rules to design indexes
       Knowing tools that can help
Indexing basics

   Index: data structure to speed up SELECTs
       Think of an index in a book
       In MySQL, key = index
       We'll consider that indexes are trees

   InnoDB's clustered index
       Data is stored with the PK: PK lookups are fast
       Secondary keys hold the PK values
       Designing InnoDB's PKs with care is critical for perf.
Strengths

   An index can filter and/or sort values

   An index can contain all the fields needed for a
    query
       No need to access data anymore


   A leftmost prefix can be used
       Indexes on several columns are useful
       Order of columns in composite keys is important
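
As a rough sketch of the leftmost-prefix rule, the statements below build a composite index and probe it with three queries. The table `prefix_demo`, its column `c` and the index name `ab` are made up for this illustration. The plans described in the comments are what is generally expected, so check them with EXPLAIN on your own data.

```sql
-- Hypothetical demo table (not from the slides); c stands for "other columns".
CREATE TABLE prefix_demo (
  id INT NOT NULL AUTO_INCREMENT,
  a INT NOT NULL DEFAULT 0,
  b INT NOT NULL DEFAULT 0,
  c VARCHAR(100) NOT NULL DEFAULT '',
  PRIMARY KEY (id),
  KEY ab (a, b)
) ENGINE=InnoDB;

-- a is the leftmost column of ab: the index can be used to filter.
EXPLAIN SELECT * FROM prefix_demo WHERE a = 10;

-- Both columns, in index order: the whole index can be used.
EXPLAIN SELECT * FROM prefix_demo WHERE a = 10 AND b = 20;

-- b alone is not a leftmost prefix: ab cannot be used to look up the rows.
EXPLAIN SELECT * FROM prefix_demo WHERE b = 20;
```

Swapping the column order to (b, a) would reverse the situation for the first and third queries, which is why the order of columns matters.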
Limitations

   MySQL generally uses only 1 index per table per query
       Ok, that's not 100% true: index merge can combine several indexes (OR clauses...)
       Think of composite indexes when you can!!

   Can't index full TEXT fields
       You must use a prefix
       Same for BLOBs and long VARCHARs

   Maintaining an index has a cost
       Read speed vs write speed
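
The prefix limitation above can be sketched as follows. The table `article`, its columns and the prefix length of 50 are assumptions chosen for the example, not values from the talk.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE article (
  id INT NOT NULL AUTO_INCREMENT,
  title VARCHAR(255) NOT NULL,
  body TEXT NOT NULL,
  PRIMARY KEY (id),
  -- KEY body_full (body) would be rejected: a TEXT column needs a key length.
  KEY body_prefix (body(50))   -- index only the first 50 characters
) ENGINE=InnoDB;

-- Rough check of how selective a 50-character prefix is
-- (distinct prefixes / total rows; closer to 1 means more selective).
SELECT COUNT(DISTINCT LEFT(body, 50)) / COUNT(*) AS prefix_selectivity
FROM article;
```

A longer prefix is usually more selective but makes the index bigger, so the length is a trade-off to measure on real data.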
Sample table we will use

 CREATE TABLE t (
   id INT NOT NULL AUTO_INCREMENT,
   a INT NOT NULL DEFAULT 0,
   b INT NOT NULL DEFAULT 0,
   [more columns here]
   PRIMARY KEY(id)
 )ENGINE=InnoDB;


   Populated with "many" rows
       Means that queries against t are "slow"

   Replace "many" and "slow" with your own values
Bad indexes &
Performance drops


Adding an index

   3 main consequences:
       Can speed up queries (good)
       Increases the size of your dataset (bad)
       Slows down writes (bad)


   How big is the write slow-down?
       Let's run some simple tests



Write slow-downs, pictured

   [Figure: two charts, "In-memory test" and "On-disk test".
    x-axis: number of indexes (PK + 1 idx, PK + 2 idx, PK + 3 idx);
    y-axis: time to load data]

   Inserting N rows in PK order
   Baseline is 100 with PK only, for both graphs

   For in-memory workloads, adding 2 keys makes perf. 2x worse
   For on-disk workloads, adding 2 keys makes perf. 40x worse!!
So what?

   Removing bad indexes is crucial for perf.
       Especially for write-intensive workloads
       Tools will help us, more on this later


   What if your workload is read-intensive?
       A few hot tables may handle most of the writes
       These tables will be write-intensive



Identifying bad indexes

   Before removing bad indexes, we have to
    identify them!

   What is a bad index?
       Duplicate indexes: always bad
       Redundant indexes: generally bad
       Low-cardinality indexes: depends
       Unused indexes: always bad
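
To make the first two categories concrete, here is a small sketch. It uses the usual definitions (an assumption, since the slide does not spell them out): a duplicate index covers exactly the same columns as another index, and a redundant index is a leftmost prefix of another index. The table and index names are hypothetical.

```sql
-- Hypothetical table with one duplicate and one redundant index.
CREATE TABLE bad_idx_demo (
  id INT NOT NULL AUTO_INCREMENT,
  a INT NOT NULL DEFAULT 0,
  b INT NOT NULL DEFAULT 0,
  PRIMARY KEY (id),
  UNIQUE KEY id_dup (id),   -- duplicate: same column as the primary key
  KEY a (a),                -- redundant: leftmost prefix of ab
  KEY ab (a, b)
) ENGINE=InnoDB;

-- Dropping them saves space and write work; queries filtering on a
-- can still use ab.
ALTER TABLE bad_idx_demo
  DROP INDEX id_dup,
  DROP INDEX a;
```

A redundant index is only "generally" bad: the shorter index is smaller, so in some cases it may still be worth keeping. Time the affected queries before dropping it.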

Guidelines for efficient indexes



Before we start...

   Indexing is not an exact science
       But guessing is not the best way to design indexes

   A few simple rules will help 90% of the time

   Always check your assumptions
       EXPLAIN does not tell you everything
       Time your queries with different index combinations
       SHOW PROFILES is often valuable

   Slow query log is a good place to start!
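
A minimal sketch of how these checks might look in a mysql session, using the sample table t. The query number and the 1-second threshold are assumptions for the example. Note that SHOW PROFILES is deprecated in recent MySQL versions, though still available.

```sql
-- Turn on profiling for the current session.
SET profiling = 1;

-- Run the query you want to time.
SELECT COUNT(*) FROM t WHERE a = 10 AND b = 20;

-- One line per profiled statement, with its duration.
SHOW PROFILES;

-- Per-stage timings; 1 is assumed to be the Query_ID reported by SHOW PROFILES.
SHOW PROFILE FOR QUERY 1;

-- Log statements slower than 1 second (threshold chosen arbitrarily here).
-- The new long_query_time applies to connections opened after the change.
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;
```

Repeating the timed query after adding or dropping an index gives a direct comparison that EXPLAIN alone cannot provide.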
Rule #1: Filter

Q1: SELECT * FROM t WHERE a = 10 AND b = 20
   Without an index, always a full table scan
mysql> EXPLAIN SELECT * FROM t WHERE a = 10 AND b = 20\G
*********** 1. row ***********
           id: 1
  select_type: SIMPLE
        table: t
         type: ALL          <- ALL means table scan
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1000545      <- Estimated # of rows to read
        Extra: Using where  <- Post-filtering needed to discard
                               the non-matching rows
Rule #1: Filter

   Idea: filter as much data as possible by
    focusing on the WHERE clause

   Candidates for Q1:
       key(a), key(b), key(a,b), key(b,a)

   Condition is on both a and b with an AND
       A composite index should be better
       Let's test!
Rule #1: Filter
mysql> EXPLAIN SELECT * ...          mysql> EXPLAIN SELECT * ...
********** 1. row **********         ********** 1. row **********
           [...]                                [...]
          key: a                               key: b
      key_len: 4                           key_len: 4
           [...]                                [...]
         rows: 20                             rows: 67368
Exec time: 0.00s                     Exec time: 0.20s

mysql> EXPLAIN SELECT * ...          mysql> EXPLAIN SELECT * ...
********** 1. row **********         ********** 1. row **********
           [...]                                [...]
          key: ab                              key: ba
      key_len: 8                           key_len: 8
           [...]                                [...]
         rows: 10                             rows: 10
Exec time: 0.00s                     Exec time: 0.00s

   Same perf. for this query
   Other queries will guide us to choose between them
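
One possible way to reproduce this comparison is sketched below. The index names a, b, ab and ba match the key names in the EXPLAIN output above. Creating all four at once and steering the optimizer with FORCE INDEX is an assumption of this sketch, not necessarily how the original test was run.

```sql
-- Create the four candidate indexes on the sample table t.
ALTER TABLE t
  ADD INDEX a (a),
  ADD INDEX b (b),
  ADD INDEX ab (a, b),
  ADD INDEX ba (b, a);

-- FORCE INDEX strongly steers the optimizer toward one candidate at a time,
-- so each plan (and each execution time) can be compared in isolation.
EXPLAIN SELECT * FROM t FORCE INDEX (a)  WHERE a = 10 AND b = 20;
EXPLAIN SELECT * FROM t FORCE INDEX (b)  WHERE a = 10 AND b = 20;
EXPLAIN SELECT * FROM t FORCE INDEX (ab) WHERE a = 10 AND b = 20;
EXPLAIN SELECT * FROM t FORCE INDEX (ba) WHERE a = 10 AND b = 20;
```

Once a winner is chosen, the losing candidates should be dropped, since every extra index slows down writes.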
Rule #2: Sort

Q2: SELECT * FROM t WHERE a = 10 ORDER BY b
   Remember: indexed values are sorted

   An index can avoid costly filesorts
       Think of filesorts performed on on-disk temp tables
       ORDER BY clause must be a leftmost prefix of the
        index

   Caveat: an index scan is fast in itself, but
    retrieving the rows in index order may be slow
       Seq. scan on index but random access on table
Rule #2: Sort

    Let's try key(b) for Q2 vs full table scan

mysql> EXPLAIN SELECT * ...         mysql> EXPLAIN SELECT * ...
********** 1. row **********        ********** 1. row **********
           [...]                               [...]
         type: index                         type: ALL
          key: b                              key: NULL
      key_len: 4                          key_len: NULL
           [...]                               [...]
         rows: 1000638                       rows: 1000638
        Extra: Using where                  Extra: Using where;
                                                   Using filesort

Exec time: 1.52s                    Exec time: 0.37s

    EXPLAIN suggests key(b) is better, but it's wrong!
Rule #2: Sort

   An index is not always the best for sorting

   If possible, try to make one index both filter and sort

   Exception to the leftmost prefix rule:
       Leading columns appearing in the WHERE clause
        as constants can fill the holes in the index
       WHERE a = 10 ORDER BY b: key(a,b) can
        filter and sort
       Not true with WHERE a > 10 ORDER BY b
Rule #2: Sort

     With key(a,b)
mysql> EXPLAIN SELECT * FROM t        mysql> EXPLAIN SELECT * FROM t
WHERE a = 10 ORDER BY b\G             WHERE a > 10 ORDER BY b\G
********** 1. row **********          ********** 1. row **********
           [...]                                 [...]
         type: ref                             type: ALL
          key: ab                               key: NULL
      key_len: 8                            key_len: NULL
           [...]                                 [...]
         rows: 20                              rows: 1000638
        Extra:                                Extra: Using where;
                                                     Using filesort

   The a > 10 plan could have been a range scan,
   depending on the distribution of the values
Rule #3: Cover

Q3: SELECT a,b FROM t WHERE a > 100;

   With key(a), you filter efficiently

   But with key(a,b)
       You filter
       The index holds all the columns you need
       Means you don't need to access data

   key(a,b) is a covering index
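
A quick way to see whether an index covers a query is the Extra column of EXPLAIN. The sketch below assumes the sample table t has an index named ab on (a, b) and has columns beyond id, a and b. FORCE INDEX is used only to keep the two plans comparable.

```sql
-- Only a and b are read, and both live in the index:
-- Extra is expected to show "Using index" (covering index).
EXPLAIN SELECT a, b FROM t FORCE INDEX (ab) WHERE a > 100;

-- SELECT * needs columns that are not in the index, so the rows must be
-- fetched from the table: Extra should no longer report a plain
-- "Using index" (it may show "Using index condition" or "Using where").
EXPLAIN SELECT * FROM t FORCE INDEX (ab) WHERE a > 100;
```

With InnoDB, secondary keys also hold the PK values, so a query selecting only id, a and b would be covered by the same index.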
Rule #3: Cover

   Back to InnoDB's clustered index
       It is always covering
       SELECT by PK is the fastest access with InnoDB
       Take care of your PKs!!


   Remember full table scan + filesort vs index?
       If the index used for sorting is also covering, it will
        outperform the table scan

Rating an index

   An index can give you 3 benefits: filtering,
    sorting, covering

   1-star index: 1 property
   2-star index: 2 properties
   3-star index: 3 properties

   This is my own rating, other systems exist
Range queries and ORDER BY

Q4: SELECT * FROM t WHERE a > 10 AND b = 20 ORDER BY a

mysql> EXPLAIN SELECT * ...\G
********** 1. row **********
           [...]
         type: range
          key: a
         rows: 500319
        Extra: Using where
Exec time: 35.9s
   Key filters and sorts, but filtering is not efficient.
   Getting data is very slow (random access + I/O-bound)

mysql> EXPLAIN SELECT * ...\G
********** 1. row **********
           [...]
         type: ref
possible_keys: a,b,ab,ba
          key: ba
         rows: 64814
        Extra: Using where; Using filesort
Exec time: 0.2s
   Key filters but doesn't sort.
   Filtering is efficient, so getting data, post-filtering
   and post-sorting is not too slow
Joins and ORDER BY

   To sort with an index, all columns in the ORDER BY
    clause must refer to the 1st table
       That is, the 1st table to be joined, not the 1st table
        written in your query!!


   Forcing the join order with SELECT
    STRAIGHT_JOIN is sometimes useful

   Sometimes you can't fulfill this condition
       This can be a good reason to denormalize
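
The join-order point can be illustrated with a small sketch. The tables `video` and `video_comment`, their columns and the index on `created_at` are hypothetical, loosely modeled on the comment table used later in the slides.

```sql
-- Hypothetical tables for illustration only.
CREATE TABLE video (
  video_id INT NOT NULL AUTO_INCREMENT,
  title VARCHAR(255) NOT NULL,
  PRIMARY KEY (video_id)
) ENGINE=InnoDB;

CREATE TABLE video_comment (
  comment_id INT NOT NULL AUTO_INCREMENT,
  video_id INT NOT NULL,
  created_at DATETIME NOT NULL,
  PRIMARY KEY (comment_id),
  KEY created_at (created_at)
) ENGINE=InnoDB;

-- ORDER BY refers only to video_comment. STRAIGHT_JOIN makes the optimizer
-- join the tables in the order they are listed, so video_comment is the
-- 1st table to be joined and its created_at index can be considered
-- for the sort.
EXPLAIN
SELECT STRAIGHT_JOIN c.comment_id, c.created_at, v.title
FROM video_comment AS c
JOIN video AS v ON v.video_id = c.video_id
ORDER BY c.created_at DESC
LIMIT 10;
```

Comparing this plan with the one obtained without STRAIGHT_JOIN shows whether the optimizer would have picked the same order on its own.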
Tools and methods
to improve index usage


Userstats v2

   You need Percona Server or MariaDB 5.2+
mysql> SELECT s.table_name,s.index_name,rows_read
       FROM information_schema.statistics s
       LEFT JOIN information_schema.index_statistics i  -- table added by this feature
       ON (i.table_schema=s.table_schema
           AND i.table_name=s.table_name
           AND i.index_name=s.index_name)
       WHERE s.table_name='comment'
             AND s.table_schema='mydb'
             AND seq_in_index=1;                         -- deals with composite indexes

+------------+----------------------+-----------+
| table_name | index_name           | rows_read |
+------------+----------------------+-----------+
| comment    | PRIMARY              |     50361 |
| comment    | user_id              |      NULL |  <- Useless
| comment    | video_comment_idx    |  18276197 |
| comment    | created_language_idx |      NULL |  <- Useless
+------------+----------------------+-----------+
Workload matters!

   OLAP server
+------------+----------------------+-----------+
| table_name | index_name           | rows_read |
+------------+----------------------+-----------+
| comment    | PRIMARY              |     50361 |
| comment    | user_id              |      NULL |  <- Never used
| comment    | video_comment_idx    |  18276197 |
| comment    | created_language_idx |      NULL |
+------------+----------------------+-----------+

   OLTP server
+------------+----------------------+-----------+
| table_name | index_name           | rows_read |
+------------+----------------------+-----------+
| comment    | PRIMARY              |  12220798 |
| comment    | user_id              |  96674982 |  <- Useful!
| comment    | video_comment_idx    | 365691254 |
| comment    | created_language_idx |    217176 |
+------------+----------------------+-----------+
Pros

   Very easy to use
       Turn on the variable and forget


   Easy to write queries to discover unused
    indexes automatically
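
Such a query might look like the sketch below, which extends the earlier userstats query to every schema. It assumes the statistics are switched on through a variable named `userstat` (the name used by recent Percona Server and MariaDB releases; check your version) and that an index missing from index_statistics has not been read since collection started.

```sql
-- Assumed variable name; older releases may use a different one.
SET GLOBAL userstat = ON;

-- After a representative sample period, list secondary indexes with no
-- recorded reads. The excluded schemas are system schemas.
SELECT DISTINCT s.table_schema, s.table_name, s.index_name
FROM information_schema.statistics AS s
LEFT JOIN information_schema.index_statistics AS i
  ON i.table_schema = s.table_schema
 AND i.table_name = s.table_name
 AND i.index_name = s.index_name
WHERE i.index_name IS NULL
  AND s.index_name <> 'PRIMARY'
  AND s.table_schema NOT IN ('mysql', 'information_schema', 'performance_schema');
```

The result is only a list of candidates: an index used by a monthly report will look unused if the sample period is too short.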




Cons

   Large sample period needed for accurate stats
   Not always obvious whether an index is useful
       Look at created_language_idx in previous slide
   Has some CPU overhead




pt-duplicate-key-checker

   Anything wrong with the keys?
CREATE TABLE comment (
  comment_id int(10) ... AUTO_INCREMENT,
  video_id int(10) ...,
  user_id int(10) ...,
  language char(2) ...,
  [...]
  PRIMARY KEY (comment_id),
  KEY user_id (user_id),
  KEY video_comment_idx (video_id,language,comment_id)
) ENGINE=InnoDB;

$ pt-duplicate-key-checker u=root,h=localhost
[...]
# Key video_comment_idx ends with a prefix of the clustered index
# Key definitions:
#   KEY video_comment_idx (video_id,language,comment_id)
#   PRIMARY KEY (comment_id),
[...]
# To shorten this duplicate clustered index, execute:
ALTER TABLE mydb.comment DROP INDEX video_comment_idx, ADD INDEX video_comment_idx (video_id,language)

   Tool is aware of InnoDB's clustered index!
   The ALTER TABLE statement is the query to remove the index
pt-index-usage

   Helps answer questions not solved by userstats
       Are there any queries with a changing exec plan?
       Is an index necessary for a query?


   Reads a slow query log file or a general log file

   Can give you invaluable information on your
    index usage
       See the man page for more
   Thanks for your attention!

   Any questions?





Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 

Último (20)

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 

Fosdem 2012 practical_indexing

  • 1. Practical MySQL indexing guidelines
      FOSDEM, Brussels, February 5th, 2012
      Stéphane Combaudon (stephane.combaudon@dailymotion.com)
  • 2. Agenda
      • Introduction
      • Bad indexes & performance drops
      • Guidelines for efficient indexing
      • Tools and methods to improve index usage
  • 4. Goals
      • Having fun with indexes!
      • Getting rid of the trial-and-error approach
      • Knowing the performance penalty of bad indexes
      • Being productive
          • Knowing simple rules to design indexes
          • Knowing tools that can help
  • 5. Indexing basics
      • Index: a data structure to speed up SELECTs
          • Think of an index in a book
          • In MySQL, key = index
          • We'll consider that indexes are trees
      • InnoDB's clustered index
          • Data is stored with the PK: PK lookups are fast
          • Secondary keys hold the PK values
          • Designing InnoDB's PKs with care is critical for performance
  • 6. Strengths
      • An index can filter and/or sort values
      • An index can contain all the fields needed for a query
          • No need to access the data anymore
      • A leftmost prefix can be used
          • Indexes on several columns are useful
          • Order of columns in composite keys is important
  • 7. Limitations
      • MySQL only uses 1 index per table per query
          • OK, that's not 100% true (OR clauses...)
          • Think of composite indexes when you can!
      • Can't index full TEXT fields
          • You must use a prefix
          • Same for BLOBs and long VARCHARs
      • Maintaining an index has a cost
          • Read speed vs write speed
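The prefix requirement for TEXT columns can be sketched as follows. The `article` table, its columns and the prefix length of 50 are hypothetical and only serve as an illustration; the right length depends on your data, and comparing selectivities is one common way to pick it.

```sql
-- Hypothetical table, not taken from the slides
CREATE TABLE article (
  id INT NOT NULL AUTO_INCREMENT,
  title VARCHAR(255) NOT NULL,
  body TEXT NOT NULL,
  PRIMARY KEY (id),
  -- A TEXT column can only be indexed on a prefix; 50 is an arbitrary choice
  KEY body_prefix (body(50))
) ENGINE=InnoDB;

-- Compare how selective the 50-character prefix is versus the full column.
-- The closer prefix_selectivity is to full_selectivity, the less is lost
-- by indexing only the prefix.
SELECT COUNT(DISTINCT LEFT(body, 50)) / COUNT(*) AS prefix_selectivity,
       COUNT(DISTINCT body) / COUNT(*)          AS full_selectivity
FROM article;
```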
  • 8. Sample table we will use

        CREATE TABLE t (
          id INT NOT NULL AUTO_INCREMENT,
          a INT NOT NULL DEFAULT 0,
          b INT NOT NULL DEFAULT 0,
          [more columns here]
          PRIMARY KEY(id)
        )ENGINE=InnoDB;

      • Populated with "many" rows
          • Means that queries against t are "slow"
          • Replace "many" and "slow" with your own values
  • 10. Adding an index
      • 3 main consequences:
          • Can speed up queries (good)
          • Increases the size of your dataset (bad)
          • Slows down writes (bad)
      • How big is the write slow-down?
          • Let's run some simple tests
  • 11. Write slow-downs, pictured
      [Figure: two bar charts, "In-memory test" and "On-disk test". x-axis: number of indexes (PK + 1 idx, PK + 2 idx, PK + 3 idx); y-axis: time to load data.]
      • Inserting N rows in PK order
      • Baseline is 100 with PK only, for both graphs
      • For in-memory workloads, adding 2 keys makes perf. 2x worse
      • For on-disk workloads, adding 2 keys makes perf. 40x worse!
  • 12. So what?
      • Removing bad indexes is crucial for performance
          • Especially for write-intensive workloads
          • Tools will help us, more on this later
      • What if your workload is read-intensive?
          • A few hot tables may handle most of the writes
          • These tables will be write-intensive
  • 13. Identifying bad indexes
      • Before removing bad indexes, we have to identify them!
      • What is a bad index?
          • Duplicate indexes: always bad
          • Redundant indexes: generally bad
          • Low-cardinality indexes: depends
          • Unused indexes: always bad
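As an illustration of the first two categories, here is a minimal sketch on the sample table t. The index names `a_1`, `a_2` and `ab` are made up for this example, and the sketch assumes t exists with no other index on these columns.

```sql
-- Index names below are hypothetical; t is the sample table from the slides
ALTER TABLE t
  ADD KEY a_1 (a),
  ADD KEY a_2 (a),      -- duplicate of a_1: same column, same order
  ADD KEY ab (a, b);    -- makes a_1 redundant: (a) is a leftmost prefix of (a, b)

-- Lists every index with its columns and estimated cardinality,
-- which also helps to spot low-cardinality indexes
SHOW INDEX FROM t;

-- The duplicate can always go; the redundant one usually can,
-- unless some query measurably benefits from the smaller index
ALTER TABLE t DROP KEY a_2, DROP KEY a_1;
```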
  • 15. Before we start...
      • Indexing is not an exact science
          • But guessing is not the best way to design indexes
          • A few simple rules will help 90% of the time
      • Always check your assumptions
          • EXPLAIN does not tell you everything
          • Time your queries with different index combinations
          • SHOW PROFILES is often valuable
          • The slow query log is a good place to start!
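One way to time a query with different index combinations is to combine index hints with session profiling. This is only a sketch: it assumes the sample table t has an index named `ab` on (a, b), and it relies on `SHOW PROFILES`, which is available in the MySQL versions of that era but deprecated in later releases.

```sql
-- Enable profiling for the current session only
SET profiling = 1;

-- Same query, once without and once with the (assumed) composite index ab
SELECT * FROM t IGNORE INDEX (ab) WHERE a = 10 AND b = 20;
SELECT * FROM t FORCE INDEX (ab)  WHERE a = 10 AND b = 20;

-- One line per profiled statement, with its Query_ID and duration
SHOW PROFILES;

-- Per-stage breakdown; replace 2 with a Query_ID reported by SHOW PROFILES
SHOW PROFILE FOR QUERY 2;
```

Run each variant several times, since the first execution may be dominated by cold caches rather than by the index choice.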
  • 16. Rule #1: Filter
      Q1: SELECT * FROM t WHERE a = 10 AND b = 20
      • Without an index, always a full table scan

        mysql> EXPLAIN SELECT * FROM t WHERE a = 10 AND b = 20\G
        *********** 1. row ***********
                   id: 1
          select_type: SIMPLE
                table: t
                 type: ALL            <- ALL means table scan
        possible_keys: NULL
                  key: NULL
              key_len: NULL
                  ref: NULL
                 rows: 1000545        <- estimated # of rows to read
                Extra: Using where    <- post-filtering needed to discard the non-matching rows
  • 17. Rule #1: Filter
      • Idea: filter as much data as possible by focusing on the WHERE clause
      • Candidates for Q1:
          • key(a), key(b), key(a,b), key(b,a)
      • Condition is on both a and b with an AND
          • A composite index should be better
          • Let's test!
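The four candidates can be created and compared one by one, for instance as below. The index names `a`, `b`, `ab` and `ba` follow the names visible in the EXPLAIN output of the slides; using `FORCE INDEX` to test each candidate in isolation is an assumption about how such a comparison could be run, not necessarily how the author ran it.

```sql
-- Create the four candidate indexes on the sample table t
-- (skip any that already exist, otherwise MySQL reports a duplicate key name)
ALTER TABLE t
  ADD KEY a  (a),
  ADD KEY b  (b),
  ADD KEY ab (a, b),
  ADD KEY ba (b, a);

-- Check the plan with each candidate in turn
-- (\G is the mysql client's vertical output terminator)
EXPLAIN SELECT * FROM t FORCE INDEX (a)  WHERE a = 10 AND b = 20\G
EXPLAIN SELECT * FROM t FORCE INDEX (b)  WHERE a = 10 AND b = 20\G
EXPLAIN SELECT * FROM t FORCE INDEX (ab) WHERE a = 10 AND b = 20\G
EXPLAIN SELECT * FROM t FORCE INDEX (ba) WHERE a = 10 AND b = 20\G
```

Compare the `key_len` and `rows` columns across the four plans, then time the real queries, since EXPLAIN only gives estimates.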
  • 18. Rule #1: Filter
      EXPLAIN SELECT * ... for Q1 with each candidate index:

        Index       key   key_len   rows     Exec time
        key(a)      a     4         20       0.00s
        key(b)      b     4         67368    0.20s
        key(a,b)    ab    8         10       0.00s
        key(b,a)    ba    8         10       0.00s

      • key(a,b) and key(b,a) give the same perf. for this query
      • Other queries will guide us to choose between them
  • 19. Rule #2: Sort
      Q2: SELECT * FROM t WHERE a = 10 ORDER BY b
      • Remember: indexed values are sorted
      • An index can avoid costly filesorts
          • Think of filesorts performed on on-disk temp tables
      • The ORDER BY clause must be a leftmost prefix of the index
      • Caveat: an index scan is fast in itself, but retrieving the rows in index order may be slow
          • Sequential scan on the index but random access on the table
  • 20. Rule #2: Sort
      • Let's try key(b) for Q2 vs a full table scan (EXPLAIN SELECT * ...)

                      With key(b)     Full table scan
        type          index           ALL
        key           b               NULL
        key_len       4               NULL
        rows          1000638         1000638
        Extra         Using where     Using where; Using filesort
        Exec time     1.52s           0.37s

      • EXPLAIN suggests key(b) is better, but it's wrong!
  • 21. Rule #2: Sort
      • An index is not always the best for sorting
      • If possible, try to sort and filter
      • Exception to the leftmost prefix rule:
          • Leading columns appearing in the WHERE clause as constants can fill the holes in the index
          • WHERE a = 10 ORDER BY b: key(a,b) can filter and sort
          • Not true with WHERE a > 10 ORDER BY b
  • 22. Rule #2: Sort
      • With key(a,b), EXPLAIN SELECT * FROM t WHERE ... ORDER BY b\G

                      WHERE a = 10    WHERE a > 10
        type          ref             ALL
        key           ab              NULL
        key_len       8               NULL
        rows          20              1000638
        Extra                         Using where; Using filesort

      • The a > 10 case could have been a range scan
      • Depends on the distribution of the values
  • 23. Rule #3: Cover
      Q3: SELECT a,b FROM t WHERE a > 100;
      • With key(a), you filter efficiently
      • But with key(a,b)
          • You filter
          • The index holds all the columns you need
          • Means you don't need to access the data
          • key(a,b) is a covering index
  • 24. Rule #3: Cover
      • Back to InnoDB's clustered index
          • It is always covering
          • SELECT by PK is the fastest access with InnoDB
          • Take care of your PKs!
      • Remember full table scan + filesort vs index?
          • If the index used for sorting is also covering, it will outperform the table scan
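A quick way to check whether an index covers a query is to look for "Using index" in the Extra column of EXPLAIN. The sketch below assumes the sample table t and an index named `ab` on (a, b), as in the earlier slides.

```sql
-- Skip this statement if the index already exists
ALTER TABLE t ADD KEY ab (a, b);

-- Q3: both selected columns are in the index,
-- so Extra should include "Using index" (covering index)
EXPLAIN SELECT a, b FROM t WHERE a > 100\G

-- Selecting columns outside the index means the rows must be fetched again,
-- and "Using index" should no longer appear
EXPLAIN SELECT * FROM t WHERE a > 100\G

-- Filter, sort and cover at once: a constant on a, ORDER BY b, only a and b selected
EXPLAIN SELECT a, b FROM t WHERE a = 10 ORDER BY b\G
```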
  • 25. Rating an index
      • An index can give you 3 benefits: filtering, sorting, covering
          • 1-star index: 1 property
          • 2-star index: 2 properties
          • 3-star index: 3 properties
      • This is my own rating, other systems exist
  • 26. Range queries and ORDER BY
      Q4: SELECT * FROM t WHERE a > 10 AND b = 20 ORDER BY a

        mysql> EXPLAIN SELECT * ...\G
        ********** 1. row **********
                   [...]
                 type: range
                  key: a
                 rows: 500319
                Extra: Using where
        Exec time: 35.9s

      • Key filters and sorts, but filtering is not efficient
      • Getting data is very slow (random access + I/O-bound)

        mysql> EXPLAIN SELECT * ...\G
        ********** 1. row **********
                   [...]
                 type: ref
        possible_keys: a,b,ab,ba
                  key: ba
                 rows: 64814
                Extra: Using where; Using filesort
        Exec time: 0.2s

      • Key filters but doesn't sort
      • Filtering is efficient, so getting data, post-filtering and post-sorting is not too slow
  • 27. Joins and ORDER BY
      • All columns in the ORDER BY clause must refer to the 1st table
          • That is, the 1st table to be joined, not the 1st table written in your query!
          • Forcing the join order with SELECT STRAIGHT_JOIN is sometimes useful
      • Sometimes you can't fulfill this condition
          • This can be a good reason to denormalize
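A minimal sketch of forcing the join order so that the ORDER BY column belongs to the first table read. The `video` table, its columns and the index `user_created` are hypothetical; only the `comment` table name comes from the later slides.

```sql
-- Hypothetical schema: video(video_id, user_id, created_at, ...) and
-- comment(comment_id, video_id, ...); the index name is made up
ALTER TABLE video ADD KEY user_created (user_id, created_at);

-- STRAIGHT_JOIN makes MySQL read the tables in the order they are listed,
-- so video is the 1st table joined and the index on (user_id, created_at)
-- can both filter and sort
SELECT STRAIGHT_JOIN v.video_id, c.comment_id
FROM video v
JOIN comment c ON c.video_id = v.video_id
WHERE v.user_id = 42
ORDER BY v.created_at;
```

Without the hint the optimizer is free to start from `comment`, in which case sorting on `v.created_at` would typically need a temporary table and a filesort.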
  • 28. Tools and methods to improve index usage
  • 29. Userstats v2
      • You need Percona Server or MariaDB 5.2+

        mysql> SELECT s.table_name,s.index_name,rows_read
               FROM information_schema.statistics s
               LEFT JOIN information_schema.index_statistics i
               ON (i.table_schema=s.table_schema
                   AND i.table_name=s.table_name
                   AND i.index_name=s.index_name)
               WHERE s.table_name='comment'
                     AND s.table_schema='mydb'
                     AND seq_in_index=1;
        +------------+----------------------+-----------+
        | table_name | index_name           | rows_read |
        +------------+----------------------+-----------+
        | comment    | PRIMARY              |     50361 |
        | comment    | user_id              |      NULL |   <- useless
        | comment    | video_comment_idx    |  18276197 |
        | comment    | created_language_idx |      NULL |   <- useless
        +------------+----------------------+-----------+

      • information_schema.index_statistics is the table added by this feature
      • seq_in_index=1 deals with composite indexes
  • 30. Workload matters!
      • OLAP server

        +------------+----------------------+-----------+
        | table_name | index_name           | rows_read |
        +------------+----------------------+-----------+
        | comment    | PRIMARY              |     50361 |
        | comment    | user_id              |      NULL |   <- never used
        | comment    | video_comment_idx    |  18276197 |
        | comment    | created_language_idx |      NULL |
        +------------+----------------------+-----------+

      • OLTP server

        +------------+----------------------+-----------+
        | table_name | index_name           | rows_read |
        +------------+----------------------+-----------+
        | comment    | PRIMARY              |  12220798 |
        | comment    | user_id              |  96674982 |   <- useful!
        | comment    | video_comment_idx    | 365691254 |
        | comment    | created_language_idx |    217176 |
        +------------+----------------------+-----------+
  • 31. Pros
      • Very easy to use
          • Turn on the variable and forget
      • Easy to write queries to discover unused indexes automatically
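Both points can be sketched as follows. The variable name `userstat` is an assumption that holds for MariaDB 5.2+ and recent Percona Server releases; older Percona Server builds used a different name, so check the documentation for your version. The schema name `mydb` is the one used in the slides.

```sql
-- Assumed variable name; verify it against your server version
SET GLOBAL userstat = ON;

-- After a representative sample period, list secondary indexes
-- that have no entry at all in index_statistics (never read)
SELECT s.table_schema, s.table_name, s.index_name
FROM information_schema.statistics s
LEFT JOIN information_schema.index_statistics i
  ON  i.table_schema = s.table_schema
  AND i.table_name   = s.table_name
  AND i.index_name   = s.index_name
WHERE s.table_schema = 'mydb'
  AND s.seq_in_index = 1            -- one row per index, even for composite ones
  AND s.index_name <> 'PRIMARY'
  AND i.index_name IS NULL;
```

Run it on every server that handles a different workload before dropping anything, since an index unused on one server may be heavily used on another.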
  • 32. Cons
      • A long sample period is needed for accurate stats
      • Not always obvious whether an index is useful
          • Look at created_language_idx in the previous slide
      • Has some CPU overhead
  • 33. pt-duplicate-key-checker
      • Anything wrong with the keys?

        CREATE TABLE comment (
          comment_id int(10) ... AUTO_INCREMENT,
          video_id int(10) ...,
          user_id int(10) ...,
          language char(2) ...,
          [...]
          PRIMARY KEY (comment_id),
          KEY user_id (user_id),
          KEY video_comment_idx (video_id,language,comment_id)
        ) ENGINE=InnoDB;

      • The tool is aware of InnoDB's clustered index!

        $ pt-duplicate-key-checker u=root,h=localhost
        [...]
        # Key video_comment_idx ends with a prefix of the clustered index
        # Key definitions:
        #   KEY video_comment_idx (video_id,language,comment_id)
        #   PRIMARY KEY (comment_id),
        [...]
        # To shorten this duplicate clustered index, execute:
        ALTER TABLE mydb.comment DROP INDEX video_comment_idx, ADD INDEX video_comment_idx (video_id,language)

      • The last line of the output is the query to remove the index
  • 34. pt-index-usage
      • Helps answer questions not solved by userstats
          • Are there any queries with a changing execution plan?
          • Is an index necessary for a query?
      • Reads a slow log file or a general log file
      • Can give you invaluable information on your index usage
      • See the man page for more
  • 35. Thanks for your attention!
      • Any questions?