After a brief introduction to the indexes available in PostgreSQL, the new features in PostgreSQL 10 are shown: parallel index scans, hash index persistence (WAL logging), BRIN autosummarization, and unbalanced (SP-GiST) index support for inet data.
1. PGDay.IT 2017 - 11th edition
Milan, October 13th, 2017
Giuseppe Broccolo
g.broccolo.7@gmail.com
Viralize.com
Indexes in PostgreSQL (10)
2. The outline
• Indexes in PostgreSQL
• What’s new in v10:
– Parallelism
– Hash indexing
– New SP-GiST support (inet data)
– Summarization of BRINs
3. ~$ whoami
Giuseppe Broccolo
- data engineer at
- member of
@giubro
gbroccolo7
gbroccolo
gemini__81
g.broccolo.7@gmail.com
5. PostgreSQL indexes
• AKA Access Methods (AMs)
– allow concurrent changes (MVCC compliant)
– persist the information (WAL)
– speed up access to data:
• links to data blocks (visiting them can sometimes be avoided)
• Index blocks live in shared buffers, as well as data blocks
[Figure: 8kB blocks held in shared buffers, with changes logged to the WAL]
8. The default AMs – the trees
[Figure: a balanced tree and an unbalanced tree]
• binary structure, hierarchically sorted
– nodes (values, links to the pointed nodes, etc.)
– pointing depends on hierarchical criteria
– allow skipping orders of magnitude of values
• N ~ O(a^n) → n ~ O(log N)
• balanced structures speed up point searches
• unbalanced ones are noticeably faster for range searches
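As a rough worked example of N ~ O(a^n) → n ~ O(log N), the sketch below counts how many levels a tree needs in order to address N values when every node has `a` children. The function name and the fan-out values are illustrative assumptions, not PostgreSQL internals.

```python
def tree_levels(n_values: int, fanout: int) -> int:
    """Return the smallest n such that fanout**n >= n_values."""
    if n_values < 1 or fanout < 2:
        raise ValueError("need n_values >= 1 and fanout >= 2")
    levels = 0
    capacity = 1  # number of values addressable with `levels` levels
    while capacity < n_values:
        capacity *= fanout
        levels += 1
    return levels
```

With 10 million values, `tree_levels(10_000_000, 2)` gives 24 levels, while a hypothetical fan-out of 100 needs only 4: the depth grows with the logarithm of N.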
9. The default AMs – the hashes
• binary maps (k: v)
– k: the hash of the search key, identifying a bucket
– v: the address where the key is stored
– just one kind of search: =
– search complexity: ~O(1)
– like trees, their size is comparable with the indexed dataset: ~O(N)
[Figure: the search key is hashed into k, which maps to the value; plot of complexity vs. N comparing ~O(log N) with ~O(1)]
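The (k: v) idea can be sketched with a toy in-memory structure: the hash of the search key selects a bucket, and the bucket stores the addresses of the matching rows. The class name, the bucket count and the (block, offset) address format are assumptions made for illustration; PostgreSQL's on-disk hash index is considerably more elaborate.

```python
class ToyHashIndex:
    """Toy hash index: hash(key) selects a bucket, the bucket stores row addresses."""

    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        # k: the hash of the search key picks one bucket
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, address):
        # v: the address (here a hypothetical (block, offset) pair) of the row
        self._bucket(key).append((key, address))

    def lookup(self, key):
        # Only equality searches are possible: one bucket is read and
        # colliding keys are filtered out, so the cost does not depend on N.
        return [addr for k, addr in self._bucket(key) if k == key]
```

Range conditions such as `<` or `>` cannot be answered this way, because hashing destroys the ordering of the keys.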
10. The default AMs – the BRINs
• Block Range Indexes:
– Á. Herrera, S. Riggs, H. Linnakangas (PG 9.5)
– Range: summarization of adjacent-on-disk blocks
– search complexity: ~O(N/K), K ~ 10 to 100
• really small indexes, faster creation
– size: ~O(N/K’), K’ ~ 1000 to 10000
• can be used for low-selectivity queries
• low performance for “dynamic” data
[Figure: 8kB heap blocks grouped into ranges 0 to 7; the summarization of range X covers its blocks (blk n. xxxxx, blk n. yyyyy, blk n. zzzzz, ...)]
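A minimal sketch of the summarization idea, assuming a min/max summary per block range (as done by BRIN's minmax operator classes). `pages_per_range` mirrors the name of the real BRIN storage parameter; the function names and the list-of-lists representation of heap blocks are hypothetical.

```python
def summarize(blocks, pages_per_range):
    """Build one (first_block, min, max) summary per range of adjacent blocks.

    `blocks` is a list of non-empty lists, one list of values per heap block.
    """
    summary = []
    for start in range(0, len(blocks), pages_per_range):
        values = [v for block in blocks[start:start + pages_per_range] for v in block]
        summary.append((start, min(values), max(values)))
    return summary


def candidate_ranges(summary, value):
    """Return the first block of every range that may contain `value`."""
    return [start for start, lo, hi in summary if lo <= value <= hi]
```

For example, `summarize([[1, 2], [3, 4], [5, 6], [7, 8]], 2)` yields `[(0, 1, 4), (2, 5, 8)]`, and a search for 6 only has to visit the range starting at block 2. The index stores one small entry per range rather than one per row, which is why it is so compact; if the values were scattered randomly across blocks, almost every range would qualify and the index would not help.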
11. The default AMs
• B-tree, GIN, GiST, SP-GiST, Hash, BRIN
• new user-defined access methods can be added
– fully supported since 9.6 (thanks to postgrespro & 2ndQuadrant)
• CREATE ACCESS METHOD
[Figure: the tree AMs classified as sortable or generalized, and as balanced or unbalanced]
12. Extend AMs to datatypes: the OpClasses
• access methods use operator classes (opclass)
CREATE INDEX idx_name
ON table
USING method (column opclass_name)
WITH (opt = value);
• operator classes define:
– operators for the needed types
– support functions depending on the access method
• can be extended to specific datatypes
• CREATE OPERATOR CLASS opclass_name
FOR TYPE datatype
USING method AS
OPERATOR strategy_number operator_name,
[...],
FUNCTION support_number func1(arg_type),
[...];
13. Execution plans
• IndexScan: needs to inspect data pages to check row visibility
• IndexOnlyScan: reads just the index pages, using the visibility map (PG 9.2)
• BitmapIndexScan + BitmapHeapScan:
1) reduce the # of accesses using a bitmap
2) used by BRIN to inspect block ranges
[Figure: plot of complexity vs. N, ~O(log N)]
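How a bitmap reduces the number of heap accesses can be sketched as follows: the matches found in the index are first grouped by heap block, then each block is visited once, in physical order. This is a simplified illustration with hypothetical names, using (block, offset) pairs as tuple identifiers; the real executor works with compressed, possibly lossy bitmaps.

```python
from collections import defaultdict


def build_bitmap(matching_tids):
    """Group matching tuple ids (block, offset) by heap block.

    Returns a list of (block, sorted offsets), ordered by block number,
    so that every heap block is read exactly once.
    """
    bitmap = defaultdict(list)
    for block, offset in matching_tids:
        bitmap[block].append(offset)
    return [(block, sorted(bitmap[block])) for block in sorted(bitmap)]
```

Four matches spread over two blocks, such as `[(7, 1), (2, 3), (7, 4), (2, 1)]`, turn into two block reads: `[(2, [1, 3]), (7, [1, 4])]`.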
14. What’s new in PG 10?
15. Parallelization in index scans
• parallelization is not new in PG (introduced in 9.6), see G. Ciolli's talk later
• new in PG 10:
– parallel B-tree index scans
– parallel BitmapHeapScan (different areas of the heap are processed by parallel workers)
– R. Syed, A. Kapila, R. Haas, R. Sabih, D. Kumar, J. Rouhaud
16. Parallelization in Index Scans
• for B-tree
– Workers inspect leaf pages in parallel
• for bitmap heap scan
– Workers inspect heap chunks in parallel
[Figure: in both cases, workers #1 ... #N feed a gather node]
17. Parallelization in Index Scans
• The parameters:
– max_parallel_workers (included in max_worker_processes)
– max_parallel_workers_per_gather (included in max_parallel_workers)
– min_parallel_index_scan_size (512kB)
• heuristic for the # of workers: index size > 512kB * 3^(# workers)
– parallel_setup_cost (1000.0)
– parallel_tuple_cost (0.1)
– force_parallel_mode (false)
• tune them based on the underlying HW!
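The worker heuristic can be sketched as below, assuming that the planner starts with one worker once the index reaches min_parallel_index_scan_size and adds one more each time the index size triples, capped by max_parallel_workers_per_gather. The function name and the kB-based arithmetic are illustrative assumptions; the real planner works on page counts and also takes the heap size and other limits into account.

```python
def planned_index_workers(index_size_kb, min_scan_size_kb=512, max_per_gather=2):
    """Estimate the number of parallel workers for an index of the given size."""
    if index_size_kb < min_scan_size_kb:
        return 0  # index too small: no parallel scan is considered
    workers = 1
    threshold = min_scan_size_kb
    # one extra worker every time the size crosses 3x the previous threshold
    while index_size_kb >= threshold * 3:
        workers += 1
        threshold *= 3
    return min(workers, max_per_gather)
```

Under these assumptions, an index of about 300MB with max_parallel_workers_per_gather set to 8 gets `planned_index_workers(300 * 1024, max_per_gather=8)`, that is 6 workers.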
18. When is parallelization used?
• Ex. IndexOnlyScan on B-tree
• table/B-tree size: ~O(300MB)
=# CREATE TABLE test AS
=# SELECT i FROM generate_series(1,10000000) t(i);
CREATE
=# CREATE INDEX btree_idx ON test USING btree (i);
CREATE
19. When parallelization is disabled:
• Ex. IndexOnlyScan on B-tree:
=# EXPLAIN ANALYZE SELECT * FROM test WHERE i=5;
QUERY PLAN
----------------------------------------------------------
Index Only Scan using btree_idx on test
(cost=0.43..8.45 rows=1 width=4)
(actual time=0.433..0.434 rows=1 loops=1)
Index Cond: (i = 5)
Heap Fetches: 1
Planning time: 0.525 ms
Execution time: 0.461 ms
(5 rows)
20. When is parallelization used?
• Set up parallel execution:
=# SET max_parallel_workers TO 8;
SET
=# SET max_parallel_workers_per_gather TO 8; -- up to 6 workers
SET
• The plan does not change!! Force parallelization...
=# SET force_parallel_mode TO true;
SET
21. When is parallelization used?
• Ex. IndexOnlyScan on B-tree
=# EXPLAIN ANALYZE SELECT * FROM test WHERE i=5;
QUERY PLAN
----------------------------------------------------------
Gather (cost=1000.43..1008.45 rows=1 width=4)
(actual time=2.523..2.579 rows=1 loops=1)
Workers Planned: 6
Workers Launched: 6
Single Copy: true
-> Index Only Scan using btree_idx on test
(cost=0.43..8.45 rows=1 width=4)
(actual time=0.030..0.032 rows=1 loops=1)
Index Cond: (i = 5)
Heap Fetches: 0
Planning time: 0.063 ms
Execution time: 3.934 ms
(9 rows)
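The cost figures in the forced plan are consistent with the Gather node simply adding parallel_setup_cost (1000.0) on top of the Index Only Scan costs (0.43..8.45). The sketch below reproduces that arithmetic; it is an illustration inferred from the numbers shown in these plans, not the planner's full cost model, and the function name is hypothetical.

```python
def forced_gather_cost(child_startup, child_total, parallel_setup_cost=1000.0):
    """Return (startup, total) cost of a Gather placed on top of a child plan."""
    return (child_startup + parallel_setup_cost,
            child_total + parallel_setup_cost)
```

`forced_gather_cost(0.43, 8.45)` gives roughly (1000.43, 1008.45): for a single-row lookup the setup cost alone is more than a hundred times the cost of the plain index scan, which is why the planner does not choose the parallel plan on its own.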
22. When is parallelization used?
• try to “trick” the planner with lower tuple costs:
=# SET force_parallel_mode TO false;
SET
=# SET parallel_tuple_cost TO 0.01;
SET
• the same plan is obtained, and it is still disadvantageous!
– the cost parameters are (almost) always fine
– parallelization costs are sustainable only in case of (really) big data
23. Hash indexes are now logged!
[Figure: 8kB index blocks and the WAL]
• The hash AM did not define how index changes had to be logged into the WAL:
– Hashes lived just in shared buffers – not crash-safe!
– Hashes could not be physically replicated
• The hash AM now includes WAL logging (R. Haas, G. Ghosh, A. Kapila, A. Sharma)
24. Hash indexes are now logged!
• Ex. physical replication, with a hash index already existing before the first base backup:
hot standby
=# \d hash_example
Table "public.hash_example"
 Column |  Type   | Modifiers
--------+---------+-----------
 i      | integer |
Indexes:
    "hash_idx" hash (i)
master
=# \d hash_example
Table "public.hash_example"
 Column |  Type   | Modifiers
--------+---------+-----------
 i      | integer |
Indexes:
    "hash_idx" hash (i)
[Figure: WAL shipped from the master to the hot standby]
25. Hash indexes are now logged!
• pre PostgreSQL 10:
master
=# explain analyze select * from
=# hash_example where i = 123;
QUERY PLAN
-----------------------------------------
Index Scan using hash_idx on hash_example
(cost=0.00..8.02 rows=1 width=21)
(actual time=1.526..1.529 rows=1 loops=1)
[...]
hot standby
=# explain analyze select * from
=# hash_example where i = 123;
ERROR: could not read block 0 in file
"base/16402/458955269": read only 0 of
8192 bytes
=# SET enable_indexscan TO false;
SET
[Figure: WAL shipped from the master to the hot standby]
26. SP-GiST support for inet
• Unbalanced indexes perform better in case of inclusion searches:
– Ex. Quad-tree
[Figure: quad-tree search of a bbox with the && operator]
• E. Hasegeli extended the use case to IPv4/IPv6 addresses (inet, 7 bytes/19 bytes):
– defined the OpClass that interfaces inet with the SP-GiST AM
• inet_ops → && >> >>= > >= <> << <<= < <= =
– important improvement in the SP-GiST AM: the # of child nodes is limited
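The meaning of the && operator on inet values (one network contains, or is equal to, the other) can be illustrated outside the database with Python's standard `ipaddress` module. This is only a sketch of the predicate, with a hypothetical function name; it says nothing about how SP-GiST evaluates it.

```python
import ipaddress


def inet_overlaps(a: str, b: str) -> bool:
    """Return True when one of the two networks contains (or equals) the other."""
    # strict=False accepts values with host bits set, e.g. '10.1.2.3/8'
    net_a = ipaddress.ip_network(a, strict=False)
    net_b = ipaddress.ip_network(b, strict=False)
    if net_a.version != net_b.version:
        return False  # an IPv4 network never overlaps an IPv6 one
    return net_a.overlaps(net_b)
```

Two CIDR networks are either nested or disjoint, which is why this kind of inclusion search maps well onto a tree that partitions the address space by prefix.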
27. SP-GiST support for inet
• Ex.
=# CREATE TABLE network_a AS SELECT ((random() * 255)::int::text || '.' ||
=# (random() * 255)::int::text || '.' ||
=# (random() * 255)::int::text || '.' ||
=# (random() * 255)::int::text || '/' ||
=# (random() * 32)::int::text)::inet as addr
=# FROM generate_series(1, 1000);
CREATE
=# CREATE INDEX gist_idx_a ON network_a USING gist (addr inet_ops);
CREATE
=# CREATE INDEX spgist_idx_a ON network_a USING spgist (addr inet_ops);
CREATE
=# CREATE TABLE network_b AS (
=# SELECT * FROM network_a ORDER BY random() LIMIT 100);
CREATE
28. SP-GiST support for inet
• Ex. no indexes
=# EXPLAIN ANALYZE SELECT * FROM network_a a JOIN network_b b ON b.addr && a.addr;
QUERY PLAN
-----------------------------------------------------------------------------------
Nested Loop (cost=0.00..15032.50 rows=78724 width=14)
(actual time=0.017..185.134 rows=94973 loops=1)
Join Filter: (a.addr && b.addr)
Rows Removed by Join Filter: 905027
-> Seq Scan on network_a a (cost=0.00..15.00 rows=1000 width=7)
(actual time=0.008..0.187 rows=1000 loops=1)
-> Materialize (cost=0.00..20.00 rows=1000 width=7)
(actual time=0.000..0.061 rows=1000 loops=1000)
-> Seq Scan on network_b b (cost=0.00..15.00 rows=1000 width=7)
(actual time=0.005..0.083 rows=1000 loops=1)
Planning time: 0.522 ms
Execution time: 190.120 ms
(8 rows)
29. SP-GiST support for inet
• Ex. GiST index
=# EXPLAIN ANALYZE SELECT * FROM network_a a JOIN network_b b ON b.addr && a.addr;
QUERY PLAN
-----------------------------------------------------------------------------------
Nested Loop (cost=0.14..631.40 rows=13600 width=39)
(actual time=0.048..112.023 rows=94973 loops=1)
-> Seq Scan on network_b b (cost=0.00..23.60 rows=1360 width=32)
(actual time=0.016..0.153 rows=1000 loops=1)
-> Index Only Scan using gist_idx_a on network_a a
(cost=0.14..0.35 rows=10 width=7)
(actual time=0.018..0.093 rows=95 loops=1000)
Index Cond: (addr && a.addr)
Heap Fetches: 94973
Planning time: 0.111 ms
Execution time: 119.433 ms
(7 rows)
30. SP-GiST support for inet
• Ex. SP-GiST index
=# EXPLAIN ANALYZE SELECT * FROM network_a a JOIN network_b b ON b.addr && a.addr;
QUERY PLAN
-----------------------------------------------------------------------------------
Nested Loop (cost=0.14..667.40 rows=13600 width=39)
(actual time=0.034..58.196 rows=94973 loops=1)
-> Seq Scan on network_b b (cost=0.00..23.60 rows=1360 width=32)
(actual time=0.009..0.105 rows=1000 loops=1)
-> Index Only Scan using spgist_idx_a on network_a a
(cost=0.14..0.37 rows=10 width=7)
(actual time=0.008..0.042 rows=95 loops=1000)
Index Cond: (addr && a.addr)
Heap Fetches: 94973
Planning time: 0.109 ms
Execution time: 63.562 ms
(7 rows)
31. BRIN summarization for new INSERTs
• pre PG 10: perform VACUUM, or call brin_summarize_new_values()
• NOW (Á. Herrera):
– the autovacuum daemon is now able to summarize new data in the present ranges:
• CREATE INDEX ON table USING brin (column) WITH (autosummarize=on);
– It is possible to summarize/desummarize the single range containing a given block (bigint):
• brin_summarize_range / brin_desummarize_range
• BRINs are (still) not able to “shrink” summarized data
– if you update/delete boundary data, you need to REINDEX
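Why deleting boundary data leaves a stale summary can be seen in a toy model of a single block range, assuming a min/max summary that is only ever widened by inserts. The class and method names are hypothetical; `resummarize` stands in for rebuilding the summary (for instance via REINDEX).

```python
class ToyRangeSummary:
    """Toy min/max summary of one block range."""

    def __init__(self):
        self.values = []
        self.lo = None
        self.hi = None

    def insert(self, value):
        self.values.append(value)
        # inserts can only widen the summarized interval
        self.lo = value if self.lo is None else min(self.lo, value)
        self.hi = value if self.hi is None else max(self.hi, value)

    def delete(self, value):
        # the row goes away, but the summary is left untouched
        self.values.remove(value)

    def resummarize(self):
        # recompute the interval from the rows that are still there
        self.lo = min(self.values) if self.values else None
        self.hi = max(self.values) if self.values else None
```

After inserting 10, 20 and 1000 and then deleting 1000, the summary still reads (10, 1000), so a search for 500 would needlessly visit this range; only after `resummarize()` does it shrink back to (10, 20).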
32. Other features about indexes
• Improve hash index performance
(A. Kapila, M. Cy, A. Sharma)
• Improve accuracy in determining if a BRIN index scan is beneficial
(D. Rowley, E. Hasegeli)
• Allow faster GiST INSERTs/UPDATEs by reusing index space efficiently
(A. Borodin)
• Reduce page locking during vacuuming of GIN indexes
(A. Borodin)
33. The future of indexes in PostgreSQL
• Allow compression/decompression AM functions in SP-GiST
OpClasses (good for PostGIS!)
• CREATE GLOBAL INDEX
34. Conclusions
• PostgreSQL has a long tradition of index development
• different types for different goals
• an eye to the future