Explain that “Explain”
The Road to Understanding
“You Should Not Fight with the Database, the Database is your Friend”
put together by Fabrizio Parrella
PARTS ARE QUOTED OR COPIED OR REF FROM:
http://devzone.zend.com/1436/the-zendcon-sessions-episode-17-sql-query-tuning-the-legend-of-drunken-query-master/
http://www.slideshare.net/phpcodemonkey/mysql-explain-explained
get to know your friend
➲ Recognize the strengths and also the weaknesses of
your database
➲ No database is perfect -- deal with it, you're not
perfect either
➲ Think of both big things and small things
BIG: Architecture, surrounding servers, caching
SMALL: SQL coding, join rewrites, server config
becoming friends
➲ Understand storage engine abilities and weaknesses
➲ Understand how the query cache and important
buffers work
➲ Understand optimizer's limitations
➲ Understand what should and should not be done at
the application level
➲ If you understand the above, you'll start to see the
database as a friend and not an enemy
the schema
➲ Basic foundation of performance
➲ Everything else depends on it
➲ Choose your data types wisely
➲ “Divide et Impera” the schema through partitioning
A divide and conquer (D&C) algorithm works by recursively breaking down a problem into two or more
sub-problems of the same (or related) type, until these become simple enough to be solved directly. The
solutions to the sub-problems are then combined to give a solution to the original problem.
http://en.wikipedia.org/wiki/Divide_and_conquer_algorithm
size does matter !!
smaller, smaller, SMALLER
The more records you can fit into a single page of
memory/disk, the faster your seeks and scans will be
➲ Do you really need that BIGINT?
➲ Use INT UNSIGNED for IPv4 addresses
➲ Use VARCHAR carefully
Converted to CHAR when used in a temporary table
➲ Use TEXT sparingly
Consider separate tables
➲ Use BLOBs very sparingly
Use the filesystem for what it was intended
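To put rough numbers on the "fit more records into a page" argument, here is a minimal Python sketch. It assumes a 16 KB page (the InnoDB default) and ignores all per-row and per-page overhead, so the figures are only illustrative. `rows_per_page` is a hypothetical helper, not a MySQL function.

```python
PAGE_SIZE = 16 * 1024  # assumed page size in bytes (InnoDB default)

def rows_per_page(row_bytes, page_size=PAGE_SIZE):
    """Upper bound on rows per page, ignoring all storage overhead."""
    return page_size // row_bytes

# An IPv4 address stored as INT UNSIGNED (4 bytes)
# versus CHAR(15) in a single-byte character set (15 bytes).
int_rows = rows_per_page(4)
char_rows = rows_per_page(15)
print(int_rows, char_rows)
```

Under these assumptions the 4-byte column fits almost four times as many values into the same page.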
real life example... handling IPv4 addresses
CREATE TABLE Sessions (
session_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
ip_address INT UNSIGNED NOT NULL, -- Compare to CHAR(15)...
session_data TEXT NOT NULL,
PRIMARY KEY (session_id),
INDEX (ip_address)
) ENGINE=InnoDB;
-- Insert a new dummy record
INSERT INTO Sessions VALUES
(NULL, INET_ATON('192.168.0.2'), 'some session data');
SELECT
session_id,
ip_address as ip_raw,
INET_NTOA(ip_address) as ip,
session_data
FROM Sessions
WHERE
ip_address BETWEEN INET_ATON('192.168.0.1') AND INET_ATON('192.168.0.255');
+------------+------------+-------------+-------------------+
| session_id | ip_raw | ip | session_data |
+------------+------------+-------------+-------------------+
| 1 | 3232235522 | 192.168.0.2 | some session data |
+------------+------------+-------------+-------------------+
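If the conversion has to happen in the application rather than in SQL, the same mapping can be sketched in Python with the standard `ipaddress` module. The helper names `inet_aton` and `inet_ntoa` are chosen here only to mirror the MySQL functions; they are not part of any library.

```python
import ipaddress

def inet_aton(dotted):
    """Dotted-quad IPv4 string -> unsigned 32-bit integer."""
    return int(ipaddress.IPv4Address(dotted))

def inet_ntoa(number):
    """Unsigned 32-bit integer -> dotted-quad IPv4 string."""
    return str(ipaddress.IPv4Address(number))

raw = inet_aton("192.168.0.2")
# Bounds for the BETWEEN range used in the query above.
low, high = inet_aton("192.168.0.1"), inet_aton("192.168.0.255")
print(raw, inet_ntoa(raw), low <= raw <= high)
```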
SETs and ENUMs
➲ Often sign of poor schema design
➲ Changing the definition will most likely require a full
rebuild of the table
➲ Search functions like FIND_IN_SET() are inefficient
compared to index operations on a join
normalization, taking it too far
DateDate ?
http://thedailywtf.com/forums/thread/75982.aspx
vertical partitioning
➲ Never mix frequently and infrequently
accessed fields in a single table
➲ Splitting tables allows main records to consume the buffer
pages without the extra data taking up space in memory
➲ Do you need FULLTEXT on your text columns? (Before MySQL 5.6.4, only MyISAM supports FULLTEXT indexes)
CREATE TABLE Users (
user_id INT NOT NULL AUTO_INCREMENT,
email VARCHAR(80) NOT NULL,
display_name VARCHAR(50) NOT NULL,
password CHAR(41) NOT NULL,
first_name VARCHAR(25) NOT NULL,
last_name VARCHAR(25) NOT NULL,
address VARCHAR(80) NOT NULL,
city VARCHAR(30) NOT NULL,
province CHAR(2) NOT NULL,
postcode CHAR(7) NOT NULL,
interests TEXT NULL,
bio TEXT NULL,
signature TEXT NULL,
skills TEXT NULL,
PRIMARY KEY (user_id),
UNIQUE INDEX (email)
) ENGINE=InnoDB;
CREATE TABLE Users (
user_id INT NOT NULL AUTO_INCREMENT,
email VARCHAR(80) NOT NULL,
display_name VARCHAR(50) NOT NULL,
password CHAR(41) NOT NULL,
PRIMARY KEY (user_id),
UNIQUE INDEX (email)
) ENGINE=InnoDB;
CREATE TABLE UserExtra (
user_id INT NOT NULL,
first_name VARCHAR(25) NOT NULL,
last_name VARCHAR(25) NOT NULL,
address VARCHAR(80) NOT NULL,
city VARCHAR(30) NOT NULL,
province CHAR(2) NOT NULL,
postcode CHAR(7) NOT NULL,
interests TEXT NULL,
bio TEXT NULL,
signature TEXT NULL,
skills TEXT NULL,
PRIMARY KEY (user_id),
FULLTEXT KEY (interests, skills)
) ENGINE=MyISAM;
understand MySQL query cache
➲ You must understand your application's read/write
patterns
➲ Internal query cache design is a compromise between
CPU usage and read performance
➲ Stores the MYSQL_RESULT of a SELECT along with a hash
of the SELECT SQL statement
➲ Any modification to any table involved in the SELECT
invalidates the stored result
➲ Write applications to be aware of the query cache
Use SELECT SQL_NO_CACHE
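The two rules above (result stored under a hash of the statement text, any write to an involved table invalidates it) can be illustrated with a toy model. This is only a sketch of the idea, not how MySQL implements it: `ToyQueryCache` and its methods are hypothetical, and the table list is passed in by hand. Note that the query cache was removed in MySQL 8.0, so this only applies to older servers.

```python
import hashlib

class ToyQueryCache:
    """Toy model: results keyed by a hash of the exact SQL text."""

    def __init__(self):
        self._results = {}  # hash of SQL text -> (tables involved, result)

    @staticmethod
    def _key(sql):
        return hashlib.sha1(sql.encode("utf-8")).hexdigest()

    def get(self, sql):
        entry = self._results.get(self._key(sql))
        return None if entry is None else entry[1]

    def put(self, sql, tables, result):
        self._results[self._key(sql)] = (frozenset(tables), result)

    def invalidate(self, table):
        # Any modification to a table drops every result that read from it.
        self._results = {k: v for k, v in self._results.items()
                         if table not in v[0]}

cache = ToyQueryCache()
cache.put("SELECT * FROM actor", ["actor"], [("PENELOPE",)])
hit = cache.get("SELECT * FROM actor")    # same bytes: served from cache
miss = cache.get("select * from actor")   # different bytes: not found
cache.invalidate("actor")                 # simulate a write to actor
after_write = cache.get("SELECT * FROM actor")
```

The `miss` case is one more reason for the consistent SQL style argued for in the next slides: statements that differ only in case or spacing hash differently.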
coding like a master
➲ Be consistent (for crying out loud)
➲ Use ANSI SQL coding style (vs. Theta)
➲ Stop thinking in terms of iterators, for loops, while
loops, etc
➲ Instead, think in terms of sets
➲ Break complex SQL statements (or business requests)
into smaller, manageable chunks
Consistency, consistency, CONSISTENCY !!
➲ Tabs and Spacing
➲ Upper and Lower Case
➲ Keywords, function names
Nothing pisses off the query master like inconsistent SQL code!
SELECT
a.first_name,
a.last_name,
COUNT(*) as num_rentals
FROM actor a
INNER JOIN film f ON a.actor_id = f.actor_id
GROUP BY a.actor_id
ORDER BY
num_rentals DESC,
a.last_name,
a.first_name
LIMIT 10;
vs.
select first_name, a.last_name,
count(*) AS num_rentals
FROM actor a join film on a.actor_id = film.actor_id
group by a.actor_id order by
num_rentals DESC, a.last_name, a.first_name
LIMIT 10;
➲ Aliases
➲ Consider your teammates
➲ Like your code, SQL is meant to be read, not written
guidelines
➲ Beware of join hints
“force index” can get “out of date”
➲ Just because it can be done in a single SQL
statement doesn't mean it should
➲ ALWAYS test and benchmark your solution
ANSI vs. THETA
SELECT
a.first_name,
a.last_name,
COUNT(*) as num_rentals
FROM actor a
INNER JOIN film_actor fa ON a.actor_id = fa.actor_id
INNER JOIN film f ON fa.film_id = f.film_id
INNER JOIN inventory i ON f.film_id = i.film_id
INNER JOIN rental r ON r.inventory_id = i.inventory_id
GROUP BY a.actor_id
ORDER BY
num_rentals DESC,
a.last_name,
a.first_name
LIMIT 10;
SELECT
a.first_name,
a.last_name,
COUNT(*) as num_rentals
FROM
actor a,
film f,
film_actor fa,
inventory i,
rental r
WHERE
a.actor_id = fa.actor_id
AND fa.film_id = f.film_id
AND f.film_id = i.film_id
AND r.inventory_id = i.inventory_id
GROUP BY a.actor_id
ORDER BY
num_rentals DESC,
a.last_name,
a.first_name
LIMIT 10;
ANSI STYLE
Explicitly declare JOIN conditions
using the ON clause
THETA STYLE
Implicitly declare JOIN conditions
in the WHERE clause
why ANSI style kicks THETA style's A55
➲ MySQL THETA style only supports INNER and CROSS
joins
But MySQL ANSI style supports INNER, CROSS, LEFT, RIGHT,
and NATURAL joins
Mixing and matching both styles can lead to hard-to-read
SQL code
➲ It is extremely easy to miss a join condition with
THETA style
Especially when joining many tables
Forgetting a join condition will produce a Cartesian product (NOT
GOOD !!!)
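How quickly a forgotten join condition blows up can be seen with a small Python sketch. The two lists are made-up sample rows, not data from the slides.

```python
from itertools import product

# Hypothetical sample rows: (actor_id, name) and (actor_id, film_id).
actors = [(1, "PENELOPE"), (2, "NICK"), (3, "ED")]
film_actor = [(1, 10), (1, 11), (2, 10), (3, 12)]

# With the join condition: only pairs sharing the same actor_id survive.
joined = [(a, fa) for a, fa in product(actors, film_actor) if a[0] == fa[0]]

# Without the join condition: every actor is paired with every film_actor row.
cartesian = list(product(actors, film_actor))

print(len(joined), len(cartesian))
```

Three rows times four rows already gives twelve combinations; with real table sizes the product grows far beyond what any query should touch.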
WITHOUT THE STRENGTH OF
EXPLAIN
YOU WILL GET LOST IN THE FIELDS
OF MISUNDERSTANDING
how to test our SQL
EXPLAIN the basics
➲ Provides the execution plan chosen by the MySQL
optimizer
➲ Simply prepend the word EXPLAIN to your
SELECT statement
➲ Each row represents a set of information for each
table used in the SELECT
EXPLAIN the columns
➲ select_type - type of “set” the data in this row
contains (SIMPLE, DERIVED, SUBQUERY, etc.)
➲ table - alias (or full table name if no alias) of the table
or derived table from which the data in this set
comes
➲ type - “access strategy” used to grab the data in this
set (ALL, CONST, REF, etc...)
➲ possible_keys - keys available to optimizer for query
➲ key - key chosen by the optimizer
➲ key_len – number of bytes used from the key
➲ ref - shows the column used in join relations
➲ rows - estimate of the number of rows in this set
➲ Extra - information the optimizer chooses to give you
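When EXPLAIN output is captured as text (for example in a log), the vertical format is easy to turn into dictionaries keyed by the column names listed above. The following is a minimal sketch; `parse_vertical` is a hypothetical helper and the sample text is a trimmed plan in the style used later in this deck.

```python
def parse_vertical(text):
    """Parse vertical (\\G style) EXPLAIN output into a list of dicts."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("*") and "row" in line:
            rows.append({})          # a "N. row" separator starts a new record
            continue
        if ":" in line and rows:
            name, _, value = line.partition(":")
            rows[-1][name.strip()] = value.strip() or None
    return rows

sample = """
*************************** 1. row ***************************
        table: attendees
possible_keys: NULL
          key: NULL
         rows: 14052
"""

plan = parse_vertical(sample)
# Tables for which the optimizer chose no key at all.
full_scans = [row["table"] for row in plan if row.get("key") == "NULL"]
print(plan, full_scans)
```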
EXPLAIN the output
EXPLAIN
SELECT
f.film_id,
f.title,
c.name
FROM film f
INNER JOIN film_category fc ON f.film_id = fc.film_id
INNER JOIN category c ON fc.category_id = c.category_id
WHERE f.title LIKE 'T%'\G
*************************** 1. row ***************************
select_type: SIMPLE
table: c
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 16
Extra:
*************************** 2. row ***************************
select_type: SIMPLE
table: fc
type: ref
possible_keys: PRIMARY, fk_film_category_category
key: fk_film_category_category
key_len: 1
ref: c.category_id
rows: 1
Extra: using index
*************************** 3. row ***************************
select_type: SIMPLE
table: f
type: eq_ref
possible_keys: PRIMARY, idx_title
key: PRIMARY
key_len: 2
ref: fc.film_id
rows: 1
Extra: using where
rows: estimated row count
possible_keys / key: available indexes and the chosen one
Extra “using index”: a covering index was used
EXPLAIN a real world example
CREATE TABLE `attendees` (
`attendee_id` int(11) NOT NULL,
`lastname` varchar(50) NOT NULL,
`conference_id` int(11) NOT NULL,
`registration_status` tinyint(4) NOT NULL,
PRIMARY KEY (`attendee_id`)
) ENGINE=InnoDB;
EXPLAIN
SELECT *
FROM attendees
WHERE
conference_id = 123
AND registration_status > 0
//Let's only show the important parts for now
*************************** 1. row ***************************
table: attendees
possible_keys: NULL
key: NULL
rows: 14052
CREATE TABLE `conferences` (
`conference_id` int(11) NOT NULL,
`location_id` int(11) NOT NULL,
`topic_id` int(11) NOT NULL,
`date` date NOT NULL,
PRIMARY KEY (`conference_id`)
) ENGINE=InnoDB;
➲ The three most important columns returned by EXPLAIN
possible_keys: all the indexes which MySQL could have used,
based on a series of very quick lookups and calculations
key: the chosen key
rows: estimate of the scanned rows
EXPLAIN a real world example
➲ Interpreting the result:
No suitable indexes for this query
MySQL has to do a full scan of the table
Full table scans are almost always the slowest
Full table scans are usually an indication that an index is
needed
EXPLAIN a real world example
➲ MySQL has two indexes to choose from
➲ “reg” is not “sufficiently unique”
the spread of the values can also be a factor (e.g. when 99% of
rows contain the same value)
➲ Index “uniqueness” is called cardinality
➲ There is room for a further performance increase
EXPLAIN
SELECT *
FROM attendees
WHERE
conference_id = 123
AND registration_status > 0
//Let's only show the important parts for now
*************************** 1. row ***************************
table: attendees
possible_keys: conf, reg
key: conf
rows: 331
ALTER TABLE attendees
ADD INDEX conf (conference_id),
ADD INDEX reg (registration_status);
EXPLAIN a real world example
➲ “reg_conf_index” is a much better choice
➲ Other keys are still available, just not as effective
EXPLAIN
SELECT *
FROM attendees
WHERE
conference_id = 123
AND registration_status > 0
//Let's only show the important parts for now
*************************** 1. row ***************************
table: attendees
possible_keys: reg, conf, reg_conf_index
key: reg_conf_index
rows: 204
ALTER TABLE attendees
ADD INDEX reg_conf_index (registration_status, conference_id);
EXPLAIN a real world example
➲ It seems that even without the “reg” index everything is
working just as expected
EXPLAIN
SELECT *
FROM attendees
WHERE
registration_status = 2
//Let's only show the important parts for now
*************************** 1. row ***************************
table: attendees
possible_keys: reg_conf_index
key: reg_conf_index
rows: 372
ALTER TABLE attendees
DROP INDEX reg,
DROP INDEX conf;
EXPLAIN a real world example
➲ Without the “conf” index we are back to square one
➲ The order in which the columns are defined in a composite index
affects whether the index is available to a query
➲ Potential workaround
SELECT * FROM attendees WHERE conference_id = 123 AND
registration_status > 0;
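The column-order rule can be sketched as a "leftmost prefix" check. This is a simplified model that treats every predicate as an equality test and ignores everything else the optimizer considers; `usable_key_parts` is a hypothetical helper, and the index definition is the (registration_status, conference_id) composite from the earlier slide.

```python
def usable_key_parts(index_columns, equality_columns):
    """Return the leading index columns that the WHERE clause can use.

    Walks the index left to right and stops at the first column
    that has no predicate, which is the leftmost-prefix rule.
    """
    usable = []
    for column in index_columns:
        if column not in equality_columns:
            break
        usable.append(column)
    return usable

reg_conf_index = ["registration_status", "conference_id"]

only_conf = usable_key_parts(reg_conf_index, {"conference_id"})
only_reg = usable_key_parts(reg_conf_index, {"registration_status"})
both = usable_key_parts(reg_conf_index, {"conference_id", "registration_status"})
print(only_conf, only_reg, both)
```

A filter on conference_id alone cannot use the index because the first column is skipped, which is why the workaround adds a condition on the leading column.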
EXPLAIN
SELECT *
FROM attendees
WHERE
conference_id = 123
//Let's only show the important parts for now
*************************** 1. row ***************************
table: attendees
possible_keys: NULL
key: NULL
rows: 14502
ALTER TABLE attendees
DROP INDEX reg,
DROP INDEX conf;
EXPLAIN a real world example
➲ Great, MySQL is using the index on “lastname”
EXPLAIN
SELECT *
FROM attendees
WHERE
lastname LIKE 'parr%'
//Let's only show the important parts for now
*************************** 1. row ***************************
table: attendees
possible_keys: lastname
key: lastname
rows: 234
ALTER TABLE attendees
ADD INDEX lastname (lastname);
EXPLAIN a real world example
➲ MySQL doesn't even try to use an index: the leading wildcard rules it out!
EXPLAIN
SELECT *
FROM attendees
WHERE
lastname LIKE '%arr%'
//Let's only show the important parts for now
*************************** 1. row ***************************
table: attendees
possible_keys: NULL
key: NULL
rows: 14052
ALTER TABLE attendees
ADD INDEX lastname (lastname);
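Why a prefix pattern can use the index while a leading wildcard cannot is easier to see if the index is pictured as a sorted list. The sketch below is only an analogy in Python, not MySQL's B-tree code; the names are made up, and the `"\uffff"` upper bound is a shortcut that assumes ordinary text.

```python
import bisect

# A sorted list stands in for the index on lastname (hypothetical values).
index = sorted(["smith", "parrish", "carr", "parrella", "barr", "parr"])

def prefix_lookup(sorted_names, prefix):
    """LIKE 'prefix%': two binary searches bound one contiguous slice."""
    lo = bisect.bisect_left(sorted_names, prefix)
    hi = bisect.bisect_left(sorted_names, prefix + "\uffff")
    return sorted_names[lo:hi]

def substring_scan(names, fragment):
    """LIKE '%fragment%': matches are scattered, so every entry is checked."""
    return [name for name in names if fragment in name]

by_prefix = prefix_lookup(index, "parr")
by_substring = substring_scan(index, "arr")
print(by_prefix, by_substring)
```

Matches for `'%arr%'` can sit anywhere in the sort order, so there is no slice to jump to and the whole list has to be read.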
EXPLAIN a real world example (pre MySQL 5.1)
➲ MySQL doesn't use an index because of the OR
➲ MySQL performs a full table scan
➲ Workaround, use “UNION”
➲ Workaround, add a composite INDEX
ALTER TABLE conferences
ADD INDEX location_topic (location_id, topic_id);
EXPLAIN
SELECT *
FROM conferences
WHERE
location_id = 2
OR topic_id IN (4,6,1)
//Let's only show the important parts for now
*************************** 1. row ***************************
table: conferences
possible_keys: location_id, topic_id
key: NULL
rows: 5043
ALTER TABLE conferences
ADD INDEX location_id (location_id),
ADD INDEX topic_id (topic_id);
EXPLAIN a real world example
➲ Looks like we need an index on “conference_id” on attendees
➲ How many rows are estimated in total?
EXPLAIN
SELECT *
FROM conferences c
INNER JOIN attendees a USING (conference_id)
WHERE
c.location_id = 2
AND c.topic_id IN (4,6,1)
AND a.registration_status > 1
//Let's only show the important parts for now
*************************** 1. row ***************************
table: c
possible_keys: conference_topic
key: conference_topic
rows: 15
*************************** 2. row ***************************
table: a
possible_keys: NULL
key: NULL
rows: 14502
15 x 14502
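The multiplication works because, for a join, each row from one table may have to be compared against the rows estimated for the next. A quick Python check of the arithmetic, using the two estimates from the plan above (the dictionary is just a convenient container, keyed by table alias):

```python
import math

# "rows" estimates from the EXPLAIN output, keyed by table alias.
row_estimates = {"c": 15, "a": 14502}

# Rough upper bound on the row combinations the join has to examine.
total = math.prod(row_estimates.values())
print(total)
```

That is over two hundred thousand row combinations for a query that should only touch a few hundred rows, which is what the missing index on attendees.conference_id costs.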
EXPLAIN the “type”
EXPLAIN the type
➲ CONST: SELECT * FROM table WHERE field = 'value';
The field needs to be indexed with a unique, non-nullable key
If the key is non-unique or nullable, the type will be “ref”
It also refers to when a table with a single row is referenced in the SELECT
Can be propagated across multiple joined columns:
EXPLAIN
SELECT r.*
FROM rental r
INNER JOIN customer c ON r.customer_id = c.customer_id
WHERE r.rental_id = 13\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: r
type: const
possible_keys: PRIMARY,idx_fk_customer_id
key: PRIMARY
key_len: 4
ref: const
rows: 1
Extra:
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: c
type: const
possible_keys: PRIMARY
key: PRIMARY
key_len: 2
ref: const /* Here is where the propagation occurs...*/
rows: 1
Extra:
2 rows in set (0.00 sec)
EXPLAIN the type
➲ RANGE: SELECT * FROM table WHERE field BETWEEN 'value' AND
'value';
The field needs to be indexed
If too many records are estimated, the index won't be used
EXPLAIN
SELECT *
FROM rental
WHERE rental_date BETWEEN '2005-06-14' AND '2005-06-16'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: rental
type: range
possible_keys: rental_date
key: rental_date
key_len: 8
ref: NULL
rows: 364
Extra: Using where
1 row in set (0.00 sec)
EXPLAIN the type
➲ ALL: SELECT * FROM table WHERE field BETWEEN 'value' AND
'far away from starting value';
No WHERE condition (duh)
No index on the field in the WHERE condition
Poor selectivity on the indexed field
Too many records meet the WHERE condition
SEEK: jumps to random places to fetch the data, and repeats for each
piece of data needed
SCAN: jumps to the start and reads the data sequentially
For large amounts of data, SCAN operations tend to be more efficient than
multiple SEEK operations
Using SELECT * FROM
EXPLAIN
SELECT *
FROM rental
WHERE rental_date BETWEEN '2001-01-14' AND '2012-12-31'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: rental
type: ALL
possible_keys: rental_date /* large range forces full scan */
key: NULL
key_len: NULL
ref: NULL
rows: 16298
Extra: Using where
1 row in set (0.00 sec)
EXPLAIN the type
➲ INDEX_MERGE: SELECT * FROM table WHERE field = 'value'
AND field1 = 'value';
Introduced with the MySQL 5.0 optimizer
Allows the optimizer to use more than one index to satisfy a join condition
Prior to MySQL 5.0, only one index per table could be used
In case of OR conditions, MySQL < 5.0 would use a full table scan
EXPLAIN
SELECT *
FROM rental
WHERE
rental_id IN (10,11,12)
OR rental_date = '2006-02-01'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: rental
type: index_merge
possible_keys: PRIMARY,rental_date
key: rental_date,PRIMARY
key_len: 8,4
ref: NULL
rows: 4
Extra: Using sort_union(rental_date,PRIMARY); Using where
1 row in set (0.02 sec)
EXPLAIN the “Extra”
EXPLAIN the Extra
➲ “Extra” shows additional operations invoked to get your result set
➲ Some common values are (more are discussed in the MySQL
manual):
Using where
Using temporary
Using filesort
Using index
EXPLAIN
SELECT *
FROM rental
WHERE
rental_id IN (10,11,12)
OR rental_date = '2006-02-01'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: rental
type: index_merge
possible_keys: PRIMARY,rental_date
key: rental_date,PRIMARY
key_len: 8,4
ref: NULL
rows: 4
Extra: Using sort_union(rental_date,PRIMARY); Using where
1 row in set (0.02 sec)
EXPLAIN the Extra
➲ Using filesort: AVOID
➲ Avoid because
Doesn't use an index
Involves a full scan
Uses a generic algorithm (one size fits all)
Uses the filesystem (BAD !!)
Gets slower with more data
➲ It's not all that bad
Sometimes unavoidable - ORDER BY RAND()
Acceptable provided you get to your result as quickly as possible, and
keep it predictably small
EXPLAIN
SELECT *
FROM attendees
WHERE
conference_id = 123
ORDER BY lastname
*************************** 1. row ***************************
table: attendees
possible_keys: conference_id
key: conference_id
rows: 331
Extra: Using filesort
EXPLAIN the Extra
➲ Using index: GOOD
➲ Celebrate because
MySQL got your results just by consulting the index
MySQL didn't need to look at the table to get the results (opening the table is
expensive)
Fastest way to get your data
➲ Particularly useful...
When you are interested in a single column or id
When you are interested in COUNT(), SUM(), AVG(), etc. of a field
EXPLAIN SELECT AVG(age) FROM attendees WHERE conference_id = 123
*************************** 1. row ***************************
table: attendees
possible_keys: conference_id
key: conference_id
rows: 331
Extra:
ALTER TABLE attendees ADD INDEX conf_age (conference_id, age);
EXPLAIN SELECT AVG(age) FROM attendees WHERE conference_id = 123
*************************** 1. row ***************************
table: attendees
possible_keys: conference_id, conf_age
key: conf_age
rows: 331
Extra: Using index
Without the composite index: nothing is actually wrong with this query, it just could be quicker
With “Using index”: aside from caching, this is the fastest way to get your data
INDEXES... your schema's phone book
➲ Speed up SELECTs, but slow down modifications
➲ Make sure you have indexes on columns used in
WHERE, ON, and GROUP BY clauses
➲ Always ensure that JOIN conditions are indexed AND
have identical data types
➲ Good keys:
Selectivity:
fraction of distinct values
= distinct values / number of rows
unique or primary keys are always 1
Low selectivity:
Maybe you can put it in a multi-column index
Prefix? Suffix? It depends on your application
indexed columns and functions don't mix
A full table scan is used because a function (LEFT) is operating on
the title column.
Let's fix this...
EXPLAIN
SELECT *
FROM film
WHERE
LEFT(title, 2) = 'Pa'
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: film
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 951
Extra: Using where
EXPLAIN
SELECT *
FROM film
WHERE
title LIKE 'Pa%'
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: film
type: range
possible_keys: idx_title
key: idx_title
key_len: 767
ref: NULL
rows: 15
Extra: Using where
let's fix multiple issues with a SELECT query
First, we are operating on an indexed column (order_created) with a function – let's fix that:
SELECT * FROM orders WHERE TO_DAYS(CURRENT_DATE()) - TO_DAYS(order_created) <= 7;
Even though we removed the function from the column in the WHERE expression, we still have a
non-deterministic function in the statement, which prevents this query from being placed in
the query cache – let's fix that:
SELECT * FROM orders WHERE order_created >= CURRENT_DATE() - INTERVAL 7 DAY;
We replaced the function with a constant; however, we are specifying SELECT * instead
of the actual fields that we need.
What if there is a TEXT field in the table that we don't need to see? Having it included in
the result means a larger result set, which may not fit in the query cache and may force a
disk-based temporary table – let's fix that:
SELECT * FROM orders WHERE order_created >= '2013-01-13' - INTERVAL 7 DAY;
SELECT
order_id,
customer_id,
order_total,
order_created
FROM orders
WHERE order_created >= '2013-01-13' - INTERVAL 7 DAY;
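Replacing CURRENT_DATE() with a constant means the application has to supply the date. A minimal Python sketch of that step follows; `orders_since_query` is a hypothetical helper, the column names follow the slide, and the `%s` placeholder assumes a driver that uses the "format" parameter style, as the common Python MySQL drivers do.

```python
from datetime import date, timedelta

def orders_since_query(today, days=7):
    """Build the query with the cutoff date computed on the application side."""
    cutoff = today - timedelta(days=days)
    sql = ("SELECT order_id, customer_id, order_total, order_created "
           "FROM orders WHERE order_created >= %s")
    return sql, (cutoff.isoformat(),)

# Same reference date as the slide: 2013-01-13 minus 7 days.
sql, params = orders_since_query(date(2013, 1, 13))
print(sql, params)
```

Because the cutoff only changes once a day, every request on the same day sends the same statement, and the indexed column is compared against a plain constant.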
good indexes vs. bad indexes
Don't forget that MySQL string indexes allow only 1000 bytes (333 characters
using 3-byte UTF-8).
Let's say you have 11,000,000 records in a table called “USERS” with
the following fields:
➲ user, firstname, lastname, gender, email, age, country_id
Our application performs searches on the following fields:
➲ user
➲ firstname, lastname, gender
➲ email
It is obvious to create indexes on user and email, especially if they are
unique, but what about the other fields?
➲ “gender” can be M or F, so selectivity is very low: 2/11,000,000 ≈ 0.
Best would be to remove the index on gender if you have it
➲ “firstname”/“lastname” depend on the uniqueness of the values
stored.
Use SELECT DISTINCT to calculate the selectivity
if it is above 15% keep it
below 15% you might want to create a composite INDEX
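The selectivity arithmetic and the 15% rule of thumb above can be written down as a short Python sketch. Both helpers are hypothetical, and the distinct count used for lastname is an invented figure purely for illustration.

```python
def selectivity(distinct_values, total_rows):
    """Selectivity = distinct values / number of rows (1.0 for unique keys)."""
    return distinct_values / total_rows

def worth_single_column_index(distinct_values, total_rows, threshold=0.15):
    """Rule of thumb from the slide: keep the index at or above the threshold."""
    return selectivity(distinct_values, total_rows) >= threshold

TOTAL_ROWS = 11_000_000

gender_ok = worth_single_column_index(2, TOTAL_ROWS)            # M or F
lastname_ok = worth_single_column_index(2_200_000, TOTAL_ROWS)  # assumed count
unique_sel = selectivity(TOTAL_ROWS, TOTAL_ROWS)                # unique column
print(gender_ok, lastname_ok, unique_sel)
```

In practice the distinct count would come from the database itself; the threshold is the slide's guideline, not a hard limit enforced by MySQL.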
removing crappy or redundant indexes
SELECT
t.TABLE_SCHEMA AS `db`,
t.TABLE_NAME AS `table`,
s.INDEX_NAME AS `index name`,
s.COLUMN_NAME AS `field name`,
s.SEQ_IN_INDEX `seq in index`,
s2.max_columns AS `# cols`,
s.CARDINALITY AS `card`,
t.TABLE_ROWS AS `est rows`,
ROUND(((s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) * 100), 2) AS `sel %`
FROM
INFORMATION_SCHEMA.STATISTICS s
INNER JOIN INFORMATION_SCHEMA.TABLES t ON s.TABLE_SCHEMA = t.TABLE_SCHEMA AND s.TABLE_NAME = t.TABLE_NAME
INNER JOIN (
SELECT
TABLE_SCHEMA,
TABLE_NAME,
INDEX_NAME,
MAX(SEQ_IN_INDEX) AS max_columns
FROM INFORMATION_SCHEMA.STATISTICS
WHERE TABLE_SCHEMA != 'mysql'
GROUP BY
TABLE_SCHEMA,
TABLE_NAME,
INDEX_NAME
) AS s2 ON s.TABLE_SCHEMA = s2.TABLE_SCHEMA AND s.TABLE_NAME = s2.TABLE_NAME AND s.INDEX_NAME = s2.INDEX_NAME
WHERE
t.TABLE_SCHEMA != 'mysql' /* Filter out the mysql system DB */
AND t.TABLE_ROWS > 10 /* Only tables with some rows */
AND s.CARDINALITY IS NOT NULL /* Need at least one non-NULL value in the field */
AND (s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) < 1.00 /* unique indexes are perfect anyway */
ORDER BY
`sel %`, /* DESC for best non-unique indexes */
s.TABLE_SCHEMA,
s.TABLE_NAME
LIMIT 100;

 
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEOClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEOAltinity Ltd
 
Mysqlppt
MysqlpptMysqlppt
MysqlpptReka
 
My SQL Skills Killed the Server
My SQL Skills Killed the ServerMy SQL Skills Killed the Server
My SQL Skills Killed the ServerdevObjective
 
Sydney Oracle Meetup - execution plans
Sydney Oracle Meetup - execution plansSydney Oracle Meetup - execution plans
Sydney Oracle Meetup - execution planspaulguerin
 
ABAP Programming Overview
ABAP Programming OverviewABAP Programming Overview
ABAP Programming Overviewsapdocs. info
 
Chapter 1abapprogrammingoverview-091205081953-phpapp01
Chapter 1abapprogrammingoverview-091205081953-phpapp01Chapter 1abapprogrammingoverview-091205081953-phpapp01
Chapter 1abapprogrammingoverview-091205081953-phpapp01tabish
 
chapter-1abapprogrammingoverview-091205081953-phpapp01
chapter-1abapprogrammingoverview-091205081953-phpapp01chapter-1abapprogrammingoverview-091205081953-phpapp01
chapter-1abapprogrammingoverview-091205081953-phpapp01tabish
 
Chapter 1 Abap Programming Overview
Chapter 1 Abap Programming OverviewChapter 1 Abap Programming Overview
Chapter 1 Abap Programming OverviewAshish Kumar
 
Abapprogrammingoverview 090715081305-phpapp02
Abapprogrammingoverview 090715081305-phpapp02Abapprogrammingoverview 090715081305-phpapp02
Abapprogrammingoverview 090715081305-phpapp02tabish
 
Abapprogrammingoverview 090715081305-phpapp02
Abapprogrammingoverview 090715081305-phpapp02Abapprogrammingoverview 090715081305-phpapp02
Abapprogrammingoverview 090715081305-phpapp02wingsrai
 
PHP applications/environments monitoring: APM & Pinba
PHP applications/environments monitoring: APM & PinbaPHP applications/environments monitoring: APM & Pinba
PHP applications/environments monitoring: APM & PinbaPatrick Allaert
 
MySQL Scaling Presentation
MySQL Scaling PresentationMySQL Scaling Presentation
MySQL Scaling PresentationTommy Falgout
 
Sql basics
Sql basicsSql basics
Sql basicsKumar
 

Similar a Explain that explain (20)

15 Ways to Kill Your Mysql Application Performance
15 Ways to Kill Your Mysql Application Performance15 Ways to Kill Your Mysql Application Performance
15 Ways to Kill Your Mysql Application Performance
 
Optimizing MySQL Queries
Optimizing MySQL QueriesOptimizing MySQL Queries
Optimizing MySQL Queries
 
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
 
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEOClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
 
Mysqlppt
MysqlpptMysqlppt
Mysqlppt
 
Optimizando MySQL
Optimizando MySQLOptimizando MySQL
Optimizando MySQL
 
PHP tips by a MYSQL DBA
PHP tips by a MYSQL DBAPHP tips by a MYSQL DBA
PHP tips by a MYSQL DBA
 
My SQL Skills Killed the Server
My SQL Skills Killed the ServerMy SQL Skills Killed the Server
My SQL Skills Killed the Server
 
Sql killedserver
Sql killedserverSql killedserver
Sql killedserver
 
Sydney Oracle Meetup - execution plans
Sydney Oracle Meetup - execution plansSydney Oracle Meetup - execution plans
Sydney Oracle Meetup - execution plans
 
Sql for dbaspresentation
Sql for dbaspresentationSql for dbaspresentation
Sql for dbaspresentation
 
ABAP Programming Overview
ABAP Programming OverviewABAP Programming Overview
ABAP Programming Overview
 
Chapter 1abapprogrammingoverview-091205081953-phpapp01
Chapter 1abapprogrammingoverview-091205081953-phpapp01Chapter 1abapprogrammingoverview-091205081953-phpapp01
Chapter 1abapprogrammingoverview-091205081953-phpapp01
 
chapter-1abapprogrammingoverview-091205081953-phpapp01
chapter-1abapprogrammingoverview-091205081953-phpapp01chapter-1abapprogrammingoverview-091205081953-phpapp01
chapter-1abapprogrammingoverview-091205081953-phpapp01
 
Chapter 1 Abap Programming Overview
Chapter 1 Abap Programming OverviewChapter 1 Abap Programming Overview
Chapter 1 Abap Programming Overview
 
Abapprogrammingoverview 090715081305-phpapp02
Abapprogrammingoverview 090715081305-phpapp02Abapprogrammingoverview 090715081305-phpapp02
Abapprogrammingoverview 090715081305-phpapp02
 
Abapprogrammingoverview 090715081305-phpapp02
Abapprogrammingoverview 090715081305-phpapp02Abapprogrammingoverview 090715081305-phpapp02
Abapprogrammingoverview 090715081305-phpapp02
 
PHP applications/environments monitoring: APM & Pinba
PHP applications/environments monitoring: APM & PinbaPHP applications/environments monitoring: APM & Pinba
PHP applications/environments monitoring: APM & Pinba
 
MySQL Scaling Presentation
MySQL Scaling PresentationMySQL Scaling Presentation
MySQL Scaling Presentation
 
Sql basics
Sql basicsSql basics
Sql basics
 

Último

A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructureitnewsafrica
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024TopCSSGallery
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 

Último (20)

A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 

Explain that explain

  • 1. Explain that “Explain”: The Road to Understanding
      “You Should Not Fight with the Database, the Database is your Friend”
      put together by Fabrizio Parrella
      PARTS ARE QUOTED, COPIED, OR REFERENCED FROM:
      http://devzone.zend.com/1436/the-zendcon-sessions-episode-17-sql-query-tuning-the-legend-of-drunken-query-master/
      http://www.slideshare.net/phpcodemonkey/mysql-explain-explained
  • 2. get to know your friend
      ➲ Recognize the strengths and also the weaknesses of your database
      ➲ No database is perfect -- deal with it, you're not perfect either
      ➲ Think of both big things and small things
          BIG: Architecture, surrounding servers, caching
          SMALL: SQL coding, join rewrites, server config
  • 3. becoming friends
      ➲ Understand storage engine abilities and weaknesses
      ➲ Understand how the query cache and important buffers work
      ➲ Understand the optimizer's limitations
      ➲ Understand what should and should not be done at the application level
      ➲ If you understand the above, you'll start to see the database as a friend and not an enemy
  • 4. the schema
      ➲ Basic foundation of performance
      ➲ Everything else depends on it
      ➲ Choose your data types wisely
      ➲ “Divide et Impera” the schema through partitioning
      A divide and conquer (D&C) algorithm works by recursively breaking down a problem into two or more sub-problems of the same (or related) type, until these become simple enough to be solved directly. The solutions to the sub-problems are then combined to give a solution to the original problem.
      http://en.wikipedia.org/wiki/Divide_and_conquer_algorithm
  • 5. size does matter !!
      smaller, smaller, SMALLER
      The more records you can fit into a single page of memory/disk, the faster your seeks and scans will be
      ➲ Do you really need that BIGINT?
      ➲ Use INT UNSIGNED for IPv4 addresses
      ➲ Use VARCHAR carefully
          Converted to CHAR when used in a temporary table
      ➲ Use TEXT sparingly
          Consider separate tables
      ➲ Use BLOBs very sparingly
          Use the filesystem for what it was intended
  • 6. real life example... handling IPv4 addresses
        CREATE TABLE Sessions (
          session_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
          ip_address INT UNSIGNED NOT NULL, -- Compare to CHAR(15)...
          session_data TEXT NOT NULL,
          PRIMARY KEY (session_id),
          INDEX (ip_address)
        ) ENGINE=InnoDB;

        -- Insert a new dummy record
        INSERT INTO Sessions VALUES (NULL, INET_ATON('192.168.0.2'), 'some session data');

        SELECT session_id, ip_address AS ip_raw, INET_NTOA(ip_address) AS ip, session_data
        FROM Sessions
        WHERE ip_address BETWEEN INET_ATON('192.168.0.1') AND INET_ATON('192.168.0.255');

        +------------+------------+-------------+-------------------+
        | session_id | ip_raw     | ip          | session_data      |
        +------------+------------+-------------+-------------------+
        |          1 | 3232235522 | 192.168.0.2 | some session data |
        +------------+------------+-------------+-------------------+
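For readers who want to check the arithmetic outside the database, here is a minimal Python sketch of the dotted-quad-to-integer conversion that INET_ATON() and INET_NTOA() perform on this slide. The helper names `inet_aton_int` and `inet_ntoa_str` are made up for this illustration; only the standard `ipaddress` module is used.

```python
import ipaddress

def inet_aton_int(dotted: str) -> int:
    """Return the 32-bit unsigned integer for a dotted-quad IPv4 address."""
    return int(ipaddress.IPv4Address(dotted))

def inet_ntoa_str(value: int) -> str:
    """Return the dotted-quad string for a 32-bit unsigned integer."""
    return str(ipaddress.IPv4Address(value))

raw = inet_aton_int("192.168.0.2")
print(raw)                 # 3232235522, the ip_raw value shown on the slide
print(inet_ntoa_str(raw))  # 192.168.0.2

# A range check on integers mirrors the BETWEEN clause in the query.
low, high = inet_aton_int("192.168.0.1"), inet_aton_int("192.168.0.255")
print(low <= raw <= high)  # True
```

Because the address is a plain integer, the range test is an ordinary numeric comparison, which is what lets the index on ip_address serve the BETWEEN query.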
  • 7. SETs and ENUMs
      ➲ Often a sign of poor schema design
      ➲ Changing the definition will most likely require a full rebuild of the table
      ➲ Search functions like FIND_IN_SET() are inefficient compared to index operations on a join
  • 8. normalization, taking it too far
      DateDate ?
      http://thedailywtf.com/forums/thread/75982.aspx
  • 9. vertical partitioning
      ➲ Never mix frequently and infrequently accessed fields in a single table
      ➲ Splitting tables allows main records to consume the buffer pages without the extra data taking up space in memory
      ➲ Do you need FULLTEXT on your text columns? (Before MySQL 5.6.4, only MyISAM supports it)
      Before, one wide table:
        CREATE TABLE Users (
          user_id INT NOT NULL AUTO_INCREMENT,
          email VARCHAR(80) NOT NULL,
          display_name VARCHAR(50) NOT NULL,
          password CHAR(41) NOT NULL,
          first_name VARCHAR(25) NOT NULL,
          last_name VARCHAR(25) NOT NULL,
          address VARCHAR(80) NOT NULL,
          city VARCHAR(30) NOT NULL,
          province CHAR(2) NOT NULL,
          postcode CHAR(7) NOT NULL,
          interests TEXT NULL,
          bio TEXT NULL,
          signature TEXT NULL,
          skills TEXT NULL,
          PRIMARY KEY (user_id),
          UNIQUE INDEX (email)
        ) ENGINE=InnoDB;
      After, split into two tables:
        CREATE TABLE Users (
          user_id INT NOT NULL AUTO_INCREMENT,
          email VARCHAR(80) NOT NULL,
          display_name VARCHAR(50) NOT NULL,
          password CHAR(41) NOT NULL,
          PRIMARY KEY (user_id),
          UNIQUE INDEX (email)
        ) ENGINE=InnoDB;

        CREATE TABLE UserExtra (
          user_id INT NOT NULL,
          first_name VARCHAR(25) NOT NULL,
          last_name VARCHAR(25) NOT NULL,
          address VARCHAR(80) NOT NULL,
          city VARCHAR(30) NOT NULL,
          province CHAR(2) NOT NULL,
          postcode CHAR(7) NOT NULL,
          interests TEXT NULL,
          bio TEXT NULL,
          signature TEXT NULL,
          skills TEXT NULL,
          PRIMARY KEY (user_id),
          FULLTEXT KEY (interests, skills)
        ) ENGINE=MyISAM;
  • 10. understand MySQL query cache
      ➲ You must understand your application's read/write patterns
      ➲ Internal query cache design is a compromise between CPU usage and read performance
      ➲ Stores the MYSQL_RESULT of a SELECT along with a hash of the SELECT SQL statement
      ➲ Any modification to any table involved in the SELECT invalidates the stored result
      ➲ Write applications to be aware of the query cache
          Use SELECT SQL_NO_CACHE
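The two rules on this slide (results keyed by a hash of the statement text, and eviction on any write to an involved table) can be illustrated with a toy model. This is only a sketch of the idea, not how the MySQL server implements its cache; the class `ToyQueryCache` and its methods are hypothetical names invented for this example.

```python
import hashlib

class ToyQueryCache:
    """Toy model: results keyed by a hash of the exact SQL text."""

    def __init__(self):
        self._results = {}  # sql hash -> cached result
        self._tables = {}   # sql hash -> set of tables the SELECT reads

    @staticmethod
    def _key(sql):
        return hashlib.sha1(sql.encode("utf-8")).hexdigest()

    def get(self, sql):
        return self._results.get(self._key(sql))

    def put(self, sql, tables, result):
        key = self._key(sql)
        self._results[key] = result
        self._tables[key] = set(tables)

    def invalidate(self, table):
        """Drop every cached result that reads from the modified table."""
        stale = [k for k, used in self._tables.items() if table in used]
        for k in stale:
            del self._results[k]
            del self._tables[k]

cache = ToyQueryCache()
cache.put("SELECT * FROM Users", ["Users"], [("alice",)])
print(cache.get("SELECT * FROM Users"))  # [('alice',)]
print(cache.get("select * from Users"))  # None: different text, different hash
cache.invalidate("Users")                # simulates an INSERT/UPDATE on Users
print(cache.get("SELECT * FROM Users"))  # None: the write evicted the entry
```

In this model a table that is written often keeps evicting its cached SELECTs, which is why the slide asks you to know your read/write patterns first.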
  • 11. coding like a master
      ➲ Be consistent (for crying out loud)
      ➲ Use ANSI SQL coding style (vs. Theta)
      ➲ Stop thinking in terms of iterators, for loops, while loops, etc.
      ➲ Instead, think in terms of sets
      ➲ Break complex SQL statements (or business requests) into smaller, manageable chunks
  • 12. Consistency, consistency, CONSISTENCY !!
      ➲ Tabs and Spacing
      ➲ Upper and Lower Case
      ➲ Keywords, function names
      ➲ Aliases
      ➲ Consider your teammates
      ➲ Like your code, SQL is meant to be read, not written
      Nothing pisses off the query master like inconsistent SQL code!
        SELECT a.first_name, a.last_name, COUNT(*) as num_rentals
        FROM actor a
        INNER JOIN film f ON a.actor_id = f.actor_id
        GROUP BY a.actor_id
        ORDER BY num_rentals DESC, a.last_name, a.first_name
        LIMIT 10;
      vs.
        select first_name, a.last_name, count(*) AS num_rentals
        FROM actor a join film on a.actor_id = film.actor_id
        group by a.actor_id order by num_rentals DESC, a.last_name, a.first_name
        LIMIT 10;
  • 13. guidelines
      ➲ Beware of join hints
          “FORCE INDEX” can get “out of date”
      ➲ Just because it can be done in a single SQL statement doesn't mean it should
      ➲ ALWAYS test and benchmark your solution
  • 14. ANSI vs. THETA
      ANSI STYLE: explicitly declare JOIN conditions using the ON clause
        SELECT a.first_name, a.last_name, COUNT(*) as num_rentals
        FROM actor a
        INNER JOIN film_actor fa ON a.actor_id = fa.actor_id
        INNER JOIN film f ON fa.film_id = f.film_id
        INNER JOIN inventory i ON f.film_id = i.film_id
        INNER JOIN rental r ON r.inventory_id = i.inventory_id
        GROUP BY a.actor_id
        ORDER BY num_rentals DESC, a.last_name, a.first_name
        LIMIT 10;
      THETA STYLE: implicitly declare JOIN conditions in the WHERE clause
        SELECT a.first_name, a.last_name, COUNT(*) as num_rentals
        FROM actor a, film f, film_actor fa, inventory i, rental r
        WHERE a.actor_id = fa.actor_id
        AND fa.film_id = f.film_id
        AND f.film_id = i.film_id
        AND r.inventory_id = i.inventory_id
        GROUP BY a.actor_id
        ORDER BY num_rentals DESC, a.last_name, a.first_name
        LIMIT 10;
  • 15. why ANSI style kicks THETA style's A55
      ➲ MySQL THETA style only supports INNER and CROSS joins
          But MySQL ANSI style supports INNER, CROSS, LEFT, RIGHT, and NATURAL joins
          Mixing and matching both styles can lead to hard-to-read SQL code
      ➲ It is extremely easy to miss a join condition with THETA style
          Especially when joining many tables
          Forgetting a join condition will produce a Cartesian product (NOT GOOD !!!)
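To make the Cartesian-product warning concrete, here is a small sketch that runs the same theta-style query with and without its join condition. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table contents are invented sample data; only the row counts matter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER);
    INSERT INTO actor VALUES (1, 'GUINESS'), (2, 'WAHLBERG'), (3, 'CHASE');
    INSERT INTO film_actor VALUES (1, 10), (1, 11), (2, 10), (3, 12);
""")

# Theta style with the join condition present: one row per film_actor row.
with_condition = conn.execute(
    "SELECT COUNT(*) FROM actor a, film_actor fa WHERE a.actor_id = fa.actor_id"
).fetchone()[0]

# Same query with the condition forgotten: every actor pairs with every row.
without_condition = conn.execute(
    "SELECT COUNT(*) FROM actor a, film_actor fa"
).fetchone()[0]

print(with_condition, without_condition)  # 4 12
```

With 3 actors and 4 film_actor rows the damage is 12 rows instead of 4; on real tables the product grows with the size of every table listed in FROM.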
  • 16. WITHOUT THE STRENGTH OF EXPLAIN YOU WILL GET LOST IN THE FIELDS OF MISUNDERSTANDING
      how to test our SQL
  • 17. EXPLAIN the basics
      ➲ Provides the execution plan chosen by the MySQL optimizer
      ➲ Simply prepend the word EXPLAIN to your SELECT statement
      ➲ Each row represents a set of information for each table used in the SELECT
  • 18. EXPLAIN the columns
      ➲ select_type - type of “set” the data in this row contains (SIMPLE, DERIVED, SUBQUERY, etc.)
      ➲ table - alias (or full table name if no alias) of the table or derived table from which the data in this set comes
      ➲ type - “access strategy” used to grab the data in this set (ALL, const, ref, etc.)
      ➲ possible_keys - keys available to the optimizer for the query
      ➲ key - key chosen by the optimizer
      ➲ key_len - number of bytes used from the chosen key
      ➲ ref - shows the column used in join relations
      ➲ rows - estimate of the number of rows in this set
      ➲ Extra - information the optimizer chooses to give you
  • 19. EXPLAIN the output
        EXPLAIN SELECT f.film_id, f.title, c.name
        FROM film f
        INNER JOIN film_category fc ON f.film_id = fc.film_id
        INNER JOIN category c ON fc.category_id = c.category_id
        WHERE f.title LIKE 'T%' \G
        *************************** 1. row ***************************
          select_type: SIMPLE
                table: c
                 type: ALL
        possible_keys: PRIMARY
                  key: NULL
              key_len: NULL
                  ref: NULL
                 rows: 16
                Extra:
        *************************** 2. row ***************************
          select_type: SIMPLE
                table: fc
                 type: ref
        possible_keys: PRIMARY, fk_film_category_category
                  key: fk_film_category_category
              key_len: 1
                  ref: c.category_id
                 rows: 1
                Extra: Using index
        *************************** 3. row ***************************
          select_type: SIMPLE
                table: f
                 type: eq_ref
        possible_keys: PRIMARY, idx_title
                  key: PRIMARY
              key_len: 2
                  ref: fc.film_id
                 rows: 1
                Extra: Using where
      rows: estimated row count
      possible_keys and key: the available indexes and the chosen one
      Extra: “Using index” means a covering index was used
  • 20. EXPLAIN a real world example
        CREATE TABLE `attendees` (
          `attendee_id` int(11) NOT NULL,
          `lastname` varchar(50) NOT NULL,
          `conference_id` int(11) NOT NULL,
          `registration_status` tinyint(4) NOT NULL,
          PRIMARY KEY (`attendee_id`)
        ) ENGINE=InnoDB;

        CREATE TABLE `conferences` (
          `conference_id` int(11) NOT NULL,
          `location_id` int(11) NOT NULL,
          `topic_id` int(11) NOT NULL,
          `date` date NOT NULL,
          PRIMARY KEY (`conference_id`)
        ) ENGINE=InnoDB;

        EXPLAIN SELECT * FROM attendees
        WHERE conference_id = 123 AND registration_status > 0;
        -- Let's only show the important parts for now
        *************************** 1. row ***************************
                table: attendees
        possible_keys: NULL
                  key: NULL
                 rows: 14052
      ➲ The three most important columns returned by EXPLAIN
          possible_keys: all the indexes which MySQL could have used, based on a series of very quick lookups and calculations
          key: the chosen key
          rows: estimate of the number of rows scanned
  • 21. EXPLAIN a real world example
      (attendees and conferences tables as defined on slide 20)
        EXPLAIN SELECT * FROM attendees
        WHERE conference_id = 123 AND registration_status > 0;
        -- Let's only show the important parts for now
        *************************** 1. row ***************************
                table: attendees
        possible_keys: NULL
                  key: NULL
                 rows: 14052
      ➲ Interpreting the result:
          No suitable indexes for this query
          MySQL has to do a full scan of the table
          Full table scans are almost always the slowest
          Full table scans are usually an indication that an index is needed
  • 22. EXPLAIN a real world example
      (attendees and conferences tables as defined on slide 20)
        ALTER TABLE attendees
          ADD INDEX conf (conference_id),
          ADD INDEX reg (registration_status);

        EXPLAIN SELECT * FROM attendees
        WHERE conference_id = 123 AND registration_status > 0;
        -- Let's only show the important parts for now
        *************************** 1. row ***************************
                table: attendees
        possible_keys: conf, reg
                  key: conf
                 rows: 331
      ➲ MySQL has two indexes to choose from
      ➲ “reg” is not “sufficiently unique”
          The spread of the values can also be a factor (e.g. when 99% of rows contain the same value)
      ➲ Index “uniqueness” is called cardinality
      ➲ There is room for a performance increase
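The notion of an index being “sufficiently unique” can be put into numbers. The sketch below computes cardinality (distinct values) and a simple selectivity ratio for two made-up columns; the sample values and the helper names are hypothetical, and MySQL itself works from sampled estimates rather than an exact count like this.

```python
def cardinality(values):
    """Number of distinct values in the column."""
    return len(set(values))

def selectivity(values):
    """Distinct values divided by total rows; closer to 1.0 filters better."""
    return cardinality(values) / len(values)

# Hypothetical sample: 10 rows, a status flag with 2 values, 5 conferences.
registration_status = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
conference_id = [101, 102, 103, 104, 105, 101, 102, 103, 104, 105]

print(cardinality(registration_status), selectivity(registration_status))  # 2 0.2
print(cardinality(conference_id), selectivity(conference_id))              # 5 0.5
```

The status column also shows the “spread” problem from the slide: nine of its ten rows share one value, so filtering on that value removes almost nothing.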
  • 23. EXPLAIN a real world example
      (attendees and conferences tables as defined on slide 20)
        ALTER TABLE attendees
          ADD INDEX reg_conf_index (registration_status, conference_id);

        EXPLAIN SELECT * FROM attendees
        WHERE conference_id = 123 AND registration_status > 0;
        -- Let's only show the important parts for now
        *************************** 1. row ***************************
                table: attendees
        possible_keys: reg, conf, reg_conf_index
                  key: reg_conf_index
                 rows: 204
      ➲ “reg_conf_index” is a much better choice
      ➲ Other keys are still available, just not as effective
  • 24. EXPLAIN a real world example
      (attendees and conferences tables as defined on slide 20)
        ALTER TABLE attendees
          DROP INDEX reg,
          DROP INDEX conf;

        EXPLAIN SELECT * FROM attendees
        WHERE registration_status = 2;
        -- Let's only show the important parts for now
        *************************** 1. row ***************************
                table: attendees
        possible_keys: reg_conf_index
                  key: reg_conf_index
                 rows: 372
      ➲ Even without the “reg” index, everything seems to work just as expected
  • 25. EXPLAIN a real world example
      (attendees and conferences tables as defined on slide 20)
        ALTER TABLE attendees
          DROP INDEX reg,
          DROP INDEX conf;

        EXPLAIN SELECT * FROM attendees
        WHERE conference_id = 123;
        -- Let's only show the important parts for now
        *************************** 1. row ***************************
                table: attendees
        possible_keys: NULL
                  key: NULL
                 rows: 14502
      ➲ Without the “conf” index we are back to square one
      ➲ The order in which the columns are defined in a composite index affects whether the index is available to a query
      ➲ Potential workaround:
        SELECT * FROM attendees
        WHERE conference_id = 123 AND registration_status > 0;
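The column-order rule can be tried without a MySQL server. The sketch below uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN as a stand-in for MySQL's EXPLAIN, so the plan text looks different, but the leftmost-prefix behaviour being illustrated is the same idea. The `plan` helper is a name invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attendees (
        attendee_id INTEGER PRIMARY KEY,
        lastname TEXT NOT NULL,
        conference_id INTEGER NOT NULL,
        registration_status INTEGER NOT NULL
    );
    CREATE INDEX reg_conf_index
        ON attendees (registration_status, conference_id);
""")

def plan(sql):
    """Return the human-readable plan steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]

# Filter on the leading column of the composite index.
leading = plan("SELECT * FROM attendees WHERE registration_status = 2")
# Filter only on the second column of the composite index.
trailing = plan("SELECT * FROM attendees WHERE conference_id = 123")

print(leading)
print(trailing)
```

The first plan should name reg_conf_index, while the second should be a plain table scan; the exact wording of each step varies between SQLite versions.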
  • 26. EXPLAIN a real world example
    ALTER TABLE attendees ADD INDEX lastname (lastname);
    EXPLAIN SELECT * FROM attendees WHERE lastname LIKE 'parr%'\G
    -- Let's only show the important parts for now
    *************************** 1. row ***************************
    table: attendees
    possible_keys: lastname
    key: lastname
    rows: 234
    ➲ Great, MySQL is using the index on “lastname”
  • 27. EXPLAIN a real world example
    ALTER TABLE attendees ADD INDEX lastname (lastname);
    EXPLAIN SELECT * FROM attendees WHERE lastname LIKE '%arr%'\G
    -- Let's only show the important parts for now
    *************************** 1. row ***************************
    table: attendees
    possible_keys: NULL
    key: NULL
    rows: 14052
    ➲ MySQL doesn't even try to use an index! A pattern with a leading wildcard cannot be matched against the sorted start of the indexed values
  • 28. EXPLAIN a real world example (pre MySQL 5.1)
    ALTER TABLE conferences ADD INDEX location_id (location_id), ADD INDEX topic_id (topic_id);
    EXPLAIN SELECT * FROM conferences WHERE location_id = 2 OR topic_id IN (4,6,1)\G
    -- Let's only show the important parts for now
    *************************** 1. row ***************************
    table: conferences
    possible_keys: location_id, topic_id
    key: NULL
    rows: 5043
    ➲ MySQL doesn't use an index because of the OR
    ➲ MySQL performs a full table scan
    ➲ Workaround: use “UNION”
    ➲ Workaround: add a composite INDEX
    ALTER TABLE conferences ADD INDEX location_topic (location_id, topic_id);
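The “UNION” workaround is named on the slide but not shown. A minimal sketch of the rewrite follows, again using Python's sqlite3 only as a convenient stand-in for MySQL. The four sample rows are made up for illustration. The point is that each branch of the UNION filters on a single indexed column, and UNION (without ALL) removes rows matched by both branches, so the result equals the OR query.

```python
import sqlite3

# Assumption: an in-memory SQLite table mirroring the slides' schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE conferences ("
    " conference_id INTEGER PRIMARY KEY,"
    " location_id INTEGER NOT NULL,"
    " topic_id INTEGER NOT NULL,"
    " date TEXT NOT NULL)"
)
conn.execute("CREATE INDEX location_id ON conferences (location_id)")
conn.execute("CREATE INDEX topic_id ON conferences (topic_id)")

# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO conferences VALUES (?, ?, ?, ?)",
    [
        (1, 2, 4, "2013-01-10"),  # matches both conditions
        (2, 2, 9, "2013-02-11"),  # matches location_id only
        (3, 5, 6, "2013-03-12"),  # matches topic_id only
        (4, 7, 8, "2013-04-13"),  # matches neither
    ],
)

or_query = (
    "SELECT * FROM conferences"
    " WHERE location_id = 2 OR topic_id IN (4, 6, 1)"
)
# Each branch can use its own single-column index.
union_query = (
    "SELECT * FROM conferences WHERE location_id = 2"
    " UNION "
    "SELECT * FROM conferences WHERE topic_id IN (4, 6, 1)"
)

or_rows = sorted(conn.execute(or_query).fetchall())
union_rows = sorted(conn.execute(union_query).fetchall())
print(or_rows == union_rows)
```

Whether the UNION form is actually faster depends on the server version and the data, so it is worth checking both variants with EXPLAIN.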
  • 29. EXPLAIN a real world example
    EXPLAIN SELECT * FROM conferences c INNER JOIN attendees a USING (conference_id) WHERE c.location_id = 2 AND c.topic_id IN (4,6,1) AND a.registration_status > 1\G
    -- Let's only show the important parts for now
    *************************** 1. row ***************************
    table: c
    possible_keys: conference_topic
    key: conference_topic
    rows: 15
    *************************** 2. row ***************************
    table: a
    possible_keys: NULL
    key: NULL
    rows: 14502
    ➲ Looks like we need an index on “conference_id” on attendees
    ➲ How many total rows are estimated? 15 x 14502
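The “15 x 14502” estimate is simply the product of the rows values that EXPLAIN reports for each table in the join. A small sketch of that arithmetic (the function name is made up for this example):

```python
from math import prod

def estimated_row_combinations(rows_per_table):
    """Multiply the per-table `rows` estimates from an EXPLAIN output."""
    return prod(rows_per_table)

# `rows` values taken from the EXPLAIN output above (tables c and a).
explain_rows = [15, 14502]
total = estimated_row_combinations(explain_rows)
print(total)  # 217530
```

This is an upper-bound estimate of the row combinations MySQL expects to examine, which is why the unindexed 14502-row side of the join is the one to fix.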
  • 31. EXPLAIN the type
    ➲ CONST: SELECT * FROM table WHERE field = 'value';
      The field needs to be indexed with a unique, non-nullable key
      If the key is non-unique or nullable, the type will be “ref”
      Applies when the table has at most one matching row, which is read once and then treated as a constant
      Can be propagated across multiple joined columns:
    EXPLAIN SELECT r.* FROM rental r INNER JOIN customer c ON r.customer_id = c.customer_id WHERE r.rental_id = 13\G
    *************************** 1. row ***************************
    id: 1
    select_type: SIMPLE
    table: r
    type: const
    possible_keys: PRIMARY,idx_fk_customer_id
    key: PRIMARY
    key_len: 4
    ref: const
    rows: 1
    Extra:
    *************************** 2. row ***************************
    id: 1
    select_type: SIMPLE
    table: c
    type: const
    possible_keys: PRIMARY
    key: PRIMARY
    key_len: 2
    ref: const /* Here is where the propagation occurs...*/
    rows: 1
    Extra:
    2 rows in set (0.00 sec)
  • 32. EXPLAIN the type
    ➲ RANGE: SELECT * FROM table WHERE field BETWEEN 'value' AND 'value';
      The field needs to be indexed
      If too many records are estimated, the index won't be used
    EXPLAIN SELECT * FROM rental WHERE rental_date BETWEEN '2005-06-14' AND '2005-06-16'\G
    *************************** 1. row ***************************
    id: 1
    select_type: SIMPLE
    table: rental
    type: range
    possible_keys: rental_date
    key: rental_date
    key_len: 8
    ref: NULL
    rows: 364
    Extra: Using where
    1 row in set (0.00 sec)
  • 33. EXPLAIN the type
    ➲ ALL: SELECT * FROM table WHERE field BETWEEN 'value' AND 'far away from starting value';
      No WHERE condition (duh)
      No index on the field in the WHERE condition
      Poor selectivity on the indexed field
      Too many records meet the WHERE condition
      SEEK: jumps to random places to fetch the data, and repeats for each piece of data needed
      SCAN: jumps to the start and reads the data sequentially
      For large amounts of data, SCAN operations tend to be more efficient than multiple SEEK operations
      Using SELECT * FROM
    EXPLAIN SELECT * FROM rental WHERE rental_date BETWEEN '2001-01-14' AND '2012-12-31'\G
    *************************** 1. row ***************************
    id: 1
    select_type: SIMPLE
    table: rental
    type: ALL
    possible_keys: rental_date /* large range forces a full scan */
    key: NULL
    key_len: NULL
    ref: NULL
    rows: 16298
    Extra: Using where
    1 row in set (0.00 sec)
  • 34. EXPLAIN the type
    ➲ INDEX_MERGE: SELECT * FROM table WHERE field = 'value' AND field1 = 'value';
      Introduced with the optimizer in MySQL 5.0
      Allows the optimizer to use more than one index to satisfy a join condition
      Prior to MySQL 5.0, only one index per table could be used
      In case of OR conditions, MySQL < 5.0 would use a full table scan
    EXPLAIN SELECT * FROM rental WHERE rental_id IN (10,11,12) OR rental_date = '2006-02-01'\G
    *************************** 1. row ***************************
    id: 1
    select_type: SIMPLE
    table: rental
    type: index_merge
    possible_keys: PRIMARY,rental_date
    key: rental_date,PRIMARY
    key_len: 8,4
    ref: NULL
    rows: 4
    Extra: Using sort_union(rental_date,PRIMARY); Using where
    1 row in set (0.02 sec)
  • 36. EXPLAIN the Extra
    ➲ “Extra” shows additional operations invoked to get your result set
    ➲ Some common values are (more are discussed in the MySQL manual):
      Using where
      Using temporary
      Using filesort
      Using index
    EXPLAIN SELECT * FROM rental WHERE rental_id IN (10,11,12) OR rental_date = '2006-02-01'\G
    *************************** 1. row ***************************
    id: 1
    select_type: SIMPLE
    table: rental
    type: index_merge
    possible_keys: PRIMARY,rental_date
    key: rental_date,PRIMARY
    key_len: 8,4
    ref: NULL
    rows: 4
    Extra: Using sort_union(rental_date,PRIMARY); Using where
    1 row in set (0.02 sec)
  • 37. EXPLAIN the Extra
    ➲ Using filesort: AVOID
    ➲ Avoid because
      Doesn't use an index for the sort
      Involves a full scan
      Uses a generic algorithm (one fits all)
      May use the filesystem when the sort does not fit in memory (BAD !!)
      Gets slower with more data
    ➲ It's not all that bad
      Sometimes unavoidable: ORDER BY RAND()
      Acceptable provided you get to your result as quickly as possible, and keep it predictably small
    EXPLAIN SELECT * FROM attendees WHERE conference_id = 123 ORDER BY lastname\G
    *************************** 1. row ***************************
    table: attendees
    possible_keys: conference_id
    key: conference_id
    rows: 331
    Extra: Using filesort
  • 38. EXPLAIN the Extra
    ➲ Using index: GOOD
    ➲ Celebrate because
      MySQL got your results just by consulting the index
      MySQL didn't need to look at the table to get the results (opening a table is expensive)
      Fastest way to get your data
    ➲ Particularly useful...
      When you are interested in a single column or id
      When you are interested in COUNT(), SUM(), AVG(), etc. of a field
    EXPLAIN SELECT AVG(age) FROM attendees WHERE conference_id = 123\G
    *************************** 1. row ***************************
    table: attendees
    possible_keys: conference_id
    key: conference_id
    rows: 331
    Extra:
    Nothing is actually wrong with this query, it just could be quicker
    ALTER TABLE attendees ADD INDEX conf_age (conference_id, age);
    EXPLAIN SELECT AVG(age) FROM attendees WHERE conference_id = 123\G
    *************************** 1. row ***************************
    table: attendees
    possible_keys: conference_id, conf_age
    key: conf_age
    rows: 331
    Extra: Using index
    Aside from caching, this is the fastest way to get your data
  • 39. INDEXES... your schema's phone book
    ➲ Speed up SELECTs, but slow down modifications
    ➲ Make sure you have indexes on columns used in WHERE, ON, and GROUP BY clauses
    ➲ Always ensure that JOIN conditions are indexed AND have identical data types
    ➲ Good keys:
      Selectivity: % of distinct values = distinct values / number of rows
      Unique or primary keys always have a selectivity of 1
      Low selectivity: maybe you can put the column in a multi-column index
      Prefix? Suffix? It depends on your application
  • 40. indexed columns and functions don't mix
    EXPLAIN SELECT * FROM attendees WHERE LEFT(lastname, 2) = 'Pa'\G
    *************************** 1. row ***************************
    id: 1
    select_type: SIMPLE
    table: film
    type: ALL
    possible_keys: NULL
    key: NULL
    key_len: NULL
    ref: NULL
    rows: 951
    Extra: Using where
    A full table scan is used because a function (LEFT) is operating on the lastname column. Let's fix this...
    EXPLAIN SELECT * FROM attendees WHERE lastname LIKE 'Pa%'\G
    *************************** 1. row ***************************
    id: 1
    select_type: SIMPLE
    table: film
    type: range
    possible_keys: idx_title
    key: idx_title
    key_len: 767
    ref: NULL
    rows: 15
    Extra: Using where
  • 41. let's fix multiple issues with a SELECT query
    SELECT * FROM orders WHERE TO_DAYS(CURRENT_DATE()) - TO_DAYS(order_created) <= 7;
    First, we are operating on an indexed column (order_created) with a function. Let's fix that:
    SELECT * FROM orders WHERE order_created >= CURRENT_DATE() - INTERVAL 7 DAY;
    Even though we removed the function from the indexed column, we still have a non-deterministic function (CURRENT_DATE) in the statement, which prevents this query from being placed in the query cache. Let's fix that:
    SELECT * FROM orders WHERE order_created >= '2013-01-13' - INTERVAL 7 DAY;
    We replaced the function with a constant; however, we are specifying SELECT * instead of the actual fields that we need. What if there is a TEXT field in the table that we don't need to see? Having it included in the result means a larger result set, which may not fit in the query cache and may force a disk-based temporary table. Let's fix that:
    SELECT order_id, customer_id, order_total, order_created FROM orders WHERE order_created >= '2013-01-13' - INTERVAL 7 DAY;
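Replacing CURRENT_DATE() with a constant means the application has to compute the date itself. A minimal sketch of that step, assuming the query text is built in Python; the function name `recent_orders_query` is hypothetical, and the table and column names follow the slide:

```python
from datetime import date, timedelta

def recent_orders_query(today, days=7):
    """Build the orders query with a literal cutoff date.

    The cutoff is computed in the application, so the SQL text contains
    no non-deterministic function such as CURRENT_DATE().
    """
    cutoff = today - timedelta(days=days)
    # The literal comes from a date object, never from user input.
    return (
        "SELECT order_id, customer_id, order_total "
        "FROM orders "
        f"WHERE order_created >= '{cutoff.isoformat()}'"
    )

sql = recent_orders_query(date(2013, 1, 13))
print(sql)
```

Every request issued on the same day then sends byte-for-byte identical SQL, which is what the query cache needs in order to match it.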
  • 42. good indexes vs. bad indexes
    Don't forget that MySQL string index keys are limited to 1000 bytes (333 characters using UTF-8).
    Let's say you have 11,000,000 records in a table called “USERS” with the following fields:
    ➲ user, firstname, lastname, gender, email, age, country_id
    Our application performs searches on the following fields:
    ➲ user
    ➲ firstname, lastname, gender
    ➲ email
    It is obvious to create indexes on user and email, especially if they are unique, but what about the other fields?
    ➲ “gender” can be M or F, so selectivity is very low: 2/11,000,000 ≈ 0. Best would be to remove the index on gender if you have it
    ➲ “firstname”/“lastname” depend on the uniqueness of the values stored. Use SELECT DISTINCT to calculate the selectivity:
      if it is above 15%, keep the index
      if it is below 15%, you might want to create a composite INDEX
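The selectivity rule of thumb above can be written down as a few lines of code. This is only a sketch: the function names are made up, the 15% threshold is the slide's, and the sample values are invented for illustration.

```python
def selectivity(values):
    """Selectivity = distinct values / number of rows (0.0 for no rows)."""
    values = list(values)
    if not values:
        return 0.0
    return len(set(values)) / len(values)

def index_advice(values, threshold=0.15):
    """Apply the slide's rule of thumb to one column's values."""
    if selectivity(values) >= threshold:
        return "keep a single-column index"
    return "consider a composite index"

# Made-up sample columns, for illustration only.
genders = ["M", "F"] * 50                       # 2 distinct values in 100 rows
lastnames = [f"name{i}" for i in range(40)] * 2  # 40 distinct values in 80 rows

print(selectivity(genders), index_advice(genders))
print(selectivity(lastnames), index_advice(lastnames))
```

On a real table the same ratio would come from comparing a count of distinct values with the total row count, rather than from loading the column into memory.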
  • 43. removing crappy or redundant indexes
    SELECT t.TABLE_SCHEMA AS `db`,
           t.TABLE_NAME AS `table`,
           s.INDEX_NAME AS `index name`,
           s.COLUMN_NAME AS `field name`,
           s.SEQ_IN_INDEX AS `seq in index`,
           s2.max_columns AS `# cols`,
           s.CARDINALITY AS `card`,
           t.TABLE_ROWS AS `est rows`,
           ROUND(((s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) * 100), 2) AS `sel %`
    FROM INFORMATION_SCHEMA.STATISTICS s
    INNER JOIN INFORMATION_SCHEMA.TABLES t
            ON s.TABLE_SCHEMA = t.TABLE_SCHEMA
           AND s.TABLE_NAME = t.TABLE_NAME
    INNER JOIN (
        SELECT TABLE_SCHEMA, TABLE_NAME, INDEX_NAME, MAX(SEQ_IN_INDEX) AS max_columns
        FROM INFORMATION_SCHEMA.STATISTICS
        WHERE TABLE_SCHEMA != 'mysql'
        GROUP BY TABLE_SCHEMA, TABLE_NAME, INDEX_NAME
    ) AS s2
            ON s.TABLE_SCHEMA = s2.TABLE_SCHEMA
           AND s.TABLE_NAME = s2.TABLE_NAME
           AND s.INDEX_NAME = s2.INDEX_NAME
    WHERE t.TABLE_SCHEMA != 'mysql' /* Filter out the mysql system DB */
      AND t.TABLE_ROWS > 10 /* Only tables with some rows */
      AND s.CARDINALITY IS NOT NULL /* Need at least one non-NULL value in the field */
      AND (s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) < 1.00 /* unique indexes are perfect anyway */
    ORDER BY `sel %`, /* DESC for best non-unique indexes */
             s.TABLE_SCHEMA, s.TABLE_NAME
    LIMIT 100;