Ace it with ACID: PostgreSQL transactions for fun and profit
Oleksii Kliukin
@hintbits
2018-07-12T09:30:00+02:00
1
WE LOVE FASHION
2
WE OFFER A SUCCESSFUL AND CURATED ASSORTMENT
> 300,000
articles from
~ 2,000
international brands
17 private
labels
HIGHLY
EXPERIENCED
category management
> 500
designers
& stylists
LOCALIZATION
of the assortment
CURATED
SHOPPING
with Zalon
3
WE DRESS CODE
4
“...we should talk about
transactional systems rather
than storage when we want to talk
about RDBMS and other
transaction technologies.”
https://masteringpostgresql.com
5
Atomicity
Consistency
Isolation
Durability
6
Every command in PostgreSQL runs as part of a transaction, an implicit one if you do not open it yourself.
Use a transaction block, BEGIN … COMMIT, to start a transaction explicitly (usually to group multiple statements).
7
FIND MY BIKE
8
CREATE SCHEMA findmybike;
SET SEARCH_PATH to findmybike, public;
CREATE TABLE bike (
b_id BIGSERIAL PRIMARY KEY,
b_owner BIGINT NOT NULL REFERENCES rider (r_id),
b_description TEXT,
b_photo_path TEXT
);
CREATE TABLE bike_location (
bl_bike_id BIGSERIAL REFERENCES bike (b_id),
bl_location POINT NOT NULL,
bl_last_seen TIMESTAMP WITH TIME ZONE NOT NULL
);
CREATE TABLE rider (
r_id BIGSERIAL PRIMARY KEY,
r_name TEXT NOT NULL,
r_email TEXT UNIQUE NOT NULL,
r_backup_email TEXT UNIQUE NOT NULL,
r_password TEXT NOT NULL
);
9
psql -qh localhost -U owner -d demo -f pgday.amsterdam/findmybike.sql
psql:pgday.amsterdam/findmybike.sql:10: ERROR: relation "rider" does not exist
psql -q -h localhost -U owner -d demo -c "\dt+ findmybike.*"
List of relations
Schema | Name | Type | Owner | Size | Description
------------+-------+-------+-------+------------+-------------
findmybike | rider | table | owner | 8192 bytes |
(1 row)
Non-atomic changes
10
psql -qh localhost -U owner -d demo -1f pgday.amsterdam/findmybike.sql
psql:pgday.amsterdam/findmybike.sql:10: ERROR: relation "rider" does not exist
psql:pgday.amsterdam/findmybike.sql:16: ERROR: current transaction is aborted, commands ignored until end of transaction block
psql -q -h localhost -U owner -d demo -c "\dt+ findmybike.*"
List of relations
Schema | Name | Type | Owner | Size | Description
--------+------+------+-------+------+-------------
(0 rows)
Atomic changes and rollbacks
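The difference between the two psql runs above can be tried out without a PostgreSQL server. The sketch below is only an analogy: it uses Python's built-in sqlite3 module rather than PostgreSQL, and the `run_script` helper and the three-statement script are made up for illustration. Without a wrapping transaction, statements that ran before or after the failing one stay in place; with one transaction around the whole script, the first error leads to a rollback and nothing is left behind, similar in spirit to `psql -1`.

```python
import sqlite3

# Hypothetical script: the second statement fails because rider does not exist yet.
SCRIPT = [
    "CREATE TABLE bike (b_id INTEGER PRIMARY KEY, b_owner INTEGER NOT NULL)",
    "INSERT INTO rider (r_id) VALUES (1)",
    "CREATE TABLE rider (r_id INTEGER PRIMARY KEY)",
]


def run_script(statements, single_transaction):
    """Run the statements and return the names of the tables that exist afterwards."""
    # isolation_level=None: sqlite3 opens no implicit transactions, we control them.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    if single_transaction:
        conn.execute("BEGIN")
    failed = False
    for sql in statements:
        try:
            conn.execute(sql)
        except sqlite3.Error:
            failed = True
            if single_transaction:
                break  # the rest of the script is skipped
    if single_transaction:
        conn.execute("ROLLBACK" if failed else "COMMIT")
    tables = sorted(
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    )
    conn.close()
    return tables


non_atomic = run_script(SCRIPT, single_transaction=False)
atomic = run_script(SCRIPT, single_transaction=True)
print("without a wrapping transaction:", non_atomic)
print("with a single transaction:", atomic)
```

Note that SQLite, unlike PostgreSQL, does not check foreign key targets at CREATE TABLE time, which is why the failing statement here is an INSERT rather than the CREATE TABLE from the slides.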
11
MVCC and rollbacks
Row version 1: xmin: 100, xmax: 104, bl_bike_id: 1, bl_location: (52.526555, 13.408593)
XID: 104: UPDATE bike_location SET bl_location = point(52.527159, 13.396823) WHERE bl_bike_id = 1
Row version 2: xmin: 104, xmax: 0, bl_bike_id: 1, bl_location: (52.527159, 13.396823)
COMMIT LOG
XID: 104, rollback
12
MVCC and commits
Row version 1: xmin: 100, xmax: 104, bl_bike_id: 1, bl_location: (52.526555, 13.408593)
XID: 104: UPDATE bike_location SET bl_location = point(52.527159, 13.396823) WHERE bl_bike_id = 1
Row version 2: xmin: 104, xmax: 0, bl_bike_id: 1, bl_location: (52.527159, 13.396823)
COMMIT LOG
XID: 104, commit
13
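The two MVCC slides above differ only in the commit log entry for XID 104. A minimal Python sketch of that idea follows. It is a deliberately simplified model, not PostgreSQL's actual visibility code: it ignores snapshots, in-progress transactions and hint bits, the `RowVersion` class and `is_visible` function are hypothetical names, and it assumes XID 100 committed, which the slides do not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowVersion:
    xmin: int        # XID that created this row version
    xmax: int        # XID that replaced or deleted it, 0 if none
    location: tuple  # bl_location value stored in this version


def is_visible(version, commit_log):
    """Simplified rule: creator committed and no committed transaction removed it."""
    created = commit_log.get(version.xmin) == "commit"
    removed = version.xmax != 0 and commit_log.get(version.xmax) == "commit"
    return created and not removed


# The two row versions from the slides.
VERSIONS = [
    RowVersion(xmin=100, xmax=104, location=(52.526555, 13.408593)),
    RowVersion(xmin=104, xmax=0, location=(52.527159, 13.396823)),
]


def visible_locations(commit_log):
    return [v.location for v in VERSIONS if is_visible(v, commit_log)]


# Assumption: XID 100 committed. Only the status of XID 104 changes.
after_rollback = visible_locations({100: "commit", 104: "rollback"})
after_commit = visible_locations({100: "commit", 104: "commit"})
print("XID 104 rolled back:", after_rollback)
print("XID 104 committed:", after_commit)
```

In this model a rollback needs no undo step: both row versions stay where they are, and the commit log alone decides which one readers see.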
pg_dump -h localhost -U robot_backup -d demo -f dump.sql
select pid, state, xact_start, query from pg_stat_activity where usename='robot_backup';
-[ RECORD 1 ]--------------------------------------------------------------------------------
pid | 13719
state | active
xact_start | 2018-07-10 13:22:38.07543+02
query | COPY findmybike.bike_location (bl_bike_id, bl_location, bl_last_seen) TO stdout;
select pid, state, xact_start, query from pg_stat_activity where usename='robot_backup';
-[ RECORD 1 ]------------------------------------------------------------------------------------
pid | 13719
state | active
xact_start | 2018-07-10 13:22:38.07543+02
query | COPY findmybike.rider (r_id, r_name, r_email, r_backup_email, r_password) TO stdout;
Database dump: single transaction
14
pg_dump -h localhost -d demo -U robot_backup -Fd -j3 -f backup_directory
Database dump: multi-process
15
[Diagram: COORDINATOR with EXPORTED SNAPSHOT (xmin: 123, xmax: 135, xip_list: 123, 126, 127); WORKER 1, WORKER 2, WORKER 3; TXID 123, TXID 126, TXID 127]
SYNCHRONIZED SNAPSHOTS
16
BEGIN;
DELETE FROM bike_location
USING bike, rider
WHERE bl_bike_id = b_id AND b_owner = r_id AND r_email = 'nolongervalid@example.com';
DELETE FROM bike
USING rider
WHERE b_owner = r_id AND r_email = 'nolongervalid@example.com';
DELETE FROM rider
WHERE r_email = 'nolongervalid@example.com';
COMMIT;
-- there is a better way of executing those statements at once using a CTE
Atomic data changes
17
BEGIN;
CREATE EXTENSION pg_trgm;
CREATE INDEX bike_b_description_trgm_idx
ON bike USING gin(b_description gin_trgm_ops);
EXPLAIN ANALYZE SELECT * FROM bike
WHERE b_description LIKE '%orange%';
ROLLBACK;
Transactional DDL: testing indexes
18
BEGIN;
ALTER TABLE rider ADD COLUMN r_phone TEXT;
CREATE TABLE bike_parking (
bp_id INTEGER PRIMARY KEY,
bp_location POINT NOT NULL,
bp_name TEXT NOT NULL,
bp_description TEXT
);
ALTER TABLE bike ADD COLUMN b_parking_id INTEGER REFERENCES bike_parking (bp_id);
INSERT INTO versioning.changes (name, description, last_modified, modified_by)
VALUES ('FINDMYBIKE-42','Add bike parking and rider phone', now(), current_user);
ROLLBACK; -- or COMMIT
Transactional DDL: tracking changes
19
EXPLAIN ANALYZE dangerous queries
BEGIN;
EXPLAIN ANALYZE WITH
rider_to_delete AS (
DELETE FROM rider
WHERE r_email = 'rider_1@example.com'
RETURNING r_id
),
bike_to_delete AS (
DELETE FROM bike
USING rider_to_delete rtd
WHERE b_owner = rtd.r_id
RETURNING b_id
)
DELETE FROM bike_location
USING bike_to_delete btd
WHERE bl_bike_id = btd.b_id;
ROLLBACK;
20
EXPLAIN ANALYZE dangerous queries
Delete on bike_location (cost=636622.13..636630.17 rows=1 width=38) (actual time=35338.529..35338.529 rows=0 loops=1)
  CTE rider_to_delete
    -> Delete on rider (cost=0.56..8.58 rows=1 width=6) (actual time=1.354..1.358 rows=1 loops=1)
      -> Index Scan using rider_r_email_key on rider (cost=0.56..8.58 rows=1 width=6) (actual time=1.325..1.327 rows=1 loops=1)
        Index Cond: (r_email = 'rider_1@example.com'::text)
  CTE bike_to_delete
    -> Delete on bike (cost=0.03..636613.12 rows=1 width=38) (actual time=2.991..35335.634 rows=3 loops=1)
      -> Hash Join (cost=0.03..636613.12 rows=1 width=38) (actual time=2.985..35335.597 rows=3 loops=1)
        Hash Cond: (bike.b_owner = rtd.r_id)
        -> Seq Scan on bike (cost=0.00..561385.60 rows=20060660 width=14) (actual time=0.014..25716.838 rows=20000008 loops=1)
        -> Hash (cost=0.02..0.02 rows=1 width=40) (actual time=2.953..2.953 rows=1 loops=1)
          Buckets: 1024 Batches: 1 Memory Usage: 9kB
          -> CTE Scan on rider_to_delete rtd (cost=0.00..0.02 rows=1 width=40) (actual time=1.359..1.364 rows=1 loops=1)
  -> Nested Loop (cost=0.44..8.48 rows=1 width=38) (actual time=9315.484..35338.516 rows=1 loops=1)
    -> CTE Scan on bike_to_delete btd (cost=0.00..0.02 rows=1 width=40) (actual time=2.996..35335.647 rows=3 loops=1)
    -> Index Scan using bike_location_pkey on bike_location (cost=0.44..8.46 rows=1 width=14) (actual time=0.952..0.952 rows=0 loops=3)
      Index Cond: (bl_bike_id = btd.b_id)
Planning time: 1.057 ms
Trigger for constraint bike_b_owner_fkey on rider: time=26197.948 calls=1
Trigger for constraint bike_location_bl_bike_id_fkey on bike: time=0.914 calls=3
Execution time: 61537.489 ms
21
EXPLAIN ANALYZE dangerous queries
-> Delete on bike (cost=0.03..636613.12 rows=1 width=38) (actual time=2.991..35335.634 rows=3 loops=1)
  -> Hash Join (cost=0.03..636613.12 rows=1 width=38) (actual time=2.985..35335.597 rows=3 loops=1)
    Hash Cond: (bike.b_owner = rtd.r_id)
    -> Seq Scan on bike (cost=0.00..561385.60 rows=20060660 width=14) (actual time=0.014..25716.838 rows=20000008 loops=1)
    -> Hash (cost=0.02..0.02 rows=1 width=40) (actual time=2.953..2.953 rows=1 loops=1)
      Buckets: 1024 Batches: 1 Memory Usage: 9kB
      -> CTE Scan on rider_to_delete rtd (cost=0.00..0.02 rows=1 width=40) (actual time=1.359..1.364 rows=1 loops=1)
...
Planning time: 1.057 ms
Trigger for constraint bike_b_owner_fkey on rider: time=26197.948 calls=1
Trigger for constraint bike_location_bl_bike_id_fkey on bike: time=0.914 calls=3
Execution time: 61537.489 ms
22
CREATE TABLE bike (
b_id BIGSERIAL PRIMARY KEY,
b_owner BIGINT NOT NULL REFERENCES rider (r_id),
b_description TEXT,
b_photo_path TEXT
);
CREATE TABLE rider (
r_id BIGSERIAL PRIMARY KEY,
r_name TEXT NOT NULL,
r_email TEXT UNIQUE NOT NULL,
r_backup_email TEXT UNIQUE NOT NULL,
r_password TEXT NOT NULL
);
-- new index to speed up the delete query.
-- Note: cannot be part of a transaction
CREATE INDEX CONCURRENTLY bike_b_owner_idx ON bike(b_owner);
23
Delete on bike_location (cost=17.63..25.67 rows=1 width=38) (actual time=4.080..4.080 rows=0 loops=1)
  CTE rider_to_delete
    -> Delete on rider (cost=0.56..8.58 rows=1 width=6) (actual time=1.840..1.843 rows=1 loops=1)
      -> Index Scan using rider_r_email_key on rider (cost=0.56..8.58 rows=1 width=6) (actual time=1.802..1.803 rows=1 loops=1)
        Index Cond: (r_email = 'rider_1@example.com'::text)
  CTE bike_to_delete
    -> Delete on bike (cost=0.56..8.61 rows=1 width=38) (actual time=1.900..1.937 rows=3 loops=1)
      -> Nested Loop (cost=0.56..8.61 rows=1 width=38) (actual time=1.894..1.915 rows=3 loops=1)
        -> CTE Scan on rider_to_delete rtd (cost=0.00..0.02 rows=1 width=40) (actual time=1.857..1.860 rows=1 loops=1)
        -> Index Scan using bike_b_owner_idx on bike (cost=0.56..8.58 rows=1 width=14) (actual time=0.033..0.047 rows=3 loops=1)
          Index Cond: (b_owner = rtd.r_id)
  -> Nested Loop (cost=0.44..8.48 rows=1 width=38) (actual time=4.061..4.075 rows=1 loops=1)
    -> CTE Scan on bike_to_delete btd (cost=0.00..0.02 rows=1 width=40) (actual time=1.907..1.949 rows=3 loops=1)
    -> Index Scan using bike_location_pkey on bike_location (cost=0.44..8.46 rows=1 width=14) (actual time=0.706..0.706 rows=0 loops=3)
      Index Cond: (bl_bike_id = btd.b_id)
Planning time: 1.382 ms
Trigger for constraint bike_b_owner_fkey on rider: time=0.392 calls=1
Trigger for constraint bike_location_bl_bike_id_fkey on bike: time=0.437 calls=3
Execution time: 4.999 ms
(19 rows)
EXPLAIN ANALYZE dangerous queries
24
-> Delete on bike (cost=0.56..8.61 rows=1 width=38) (actual time=1.900..1.937 rows=3 loops=1)
  -> Nested Loop (cost=0.56..8.61 rows=1 width=38) (actual time=1.894..1.915 rows=3 loops=1)
    -> CTE Scan on rider_to_delete rtd (cost=0.00..0.02 rows=1 width=40) (actual time=1.857..1.860 rows=1 loops=1)
    -> Index Scan using bike_b_owner_idx on bike (cost=0.56..8.58 rows=1 width=14) (actual time=0.033..0.047 rows=3 loops=1)
      Index Cond: (b_owner = rtd.r_id)
Planning time: 1.382 ms
Trigger for constraint bike_b_owner_fkey on rider: time=0.392 calls=1
Trigger for constraint bike_location_bl_bike_id_fkey on bike: time=0.437 calls=3
Execution time: 4.999 ms
(19 rows)
EXPLAIN ANALYZE dangerous queries
25
psql -h localhost -U robot_backup -d demo
# BEGIN;
# INSERT INTO bike (b_owner, b_description) VALUES (1, 'test');
INSERT 0 1
# INSERT INTO bike (b_owner, b_description) VALUES (2 'test2');
ERROR: syntax error at or near "'test2'" at character 53
# INSERT INTO bike (b_owner, b_description) VALUES (2, 'test2');
ERROR: current transaction is aborted, commands ignored until end of transaction block
# ROLLBACK;
Correcting errors interactively with ON_ERROR_ROLLBACK
26
psql -h localhost -U robot_backup -d demo -v ON_ERROR_ROLLBACK=interactive
# BEGIN;
# INSERT INTO bike (b_owner, b_description) VALUES (1, 'test');
INSERT 0 1
# INSERT INTO bike (b_owner, b_description) VALUES (2 'test2');
ERROR: syntax error at or near "'test2'" at character 53
# INSERT INTO bike (b_owner, b_description) VALUES (2, 'test2');
INSERT 0 1
# COMMIT;
Correcting errors interactively with ON_ERROR_ROLLBACK
27
psql -h localhost -U robot_backup -d demo
# BEGIN;
# SAVEPOINT statement1;
# INSERT INTO bike (b_owner, b_description) VALUES (1, 'test');
INSERT 0 1
# RELEASE statement1;
# SAVEPOINT statement2;
# INSERT INTO bike (b_owner, b_description) VALUES (2 'test2');
ERROR: syntax error at or near "'test2'" at character 53
# ROLLBACK TO statement2;
# INSERT INTO bike (b_owner, b_description) VALUES (2, 'test2');
INSERT 0 1
# COMMIT;
Correcting errors interactively with subtransactions
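The savepoint bookkeeping from the session above (a savepoint before each statement, ROLLBACK TO on failure) can be sketched in Python. This is an illustration under stated assumptions, not psql's implementation: it uses the built-in sqlite3 module instead of PostgreSQL, the `execute_with_savepoint` helper and the savepoint name `stmt` are made up, and the bike table is reduced to two columns. SQLite does not put a transaction into PostgreSQL's "aborted" state after an error, so only the pattern carries over, not the error behaviour.

```python
import sqlite3


def execute_with_savepoint(conn, sql):
    """Run one statement inside its own savepoint; return the error text or None."""
    conn.execute("SAVEPOINT stmt")
    try:
        conn.execute(sql)
    except sqlite3.Error as exc:
        # Undo whatever the failed statement did, then drop the savepoint.
        conn.execute("ROLLBACK TO stmt")
        conn.execute("RELEASE stmt")
        return str(exc)
    conn.execute("RELEASE stmt")
    return None


# isolation_level=None: transactions are controlled explicitly with BEGIN/COMMIT.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE bike (b_id INTEGER PRIMARY KEY, b_owner INTEGER NOT NULL, b_description TEXT)"
)

conn.execute("BEGIN")
results = [
    execute_with_savepoint(conn, "INSERT INTO bike (b_owner, b_description) VALUES (1, 'test')"),
    # Same typo as in the slides: the comma between the values is missing.
    execute_with_savepoint(conn, "INSERT INTO bike (b_owner, b_description) VALUES (2 'test2')"),
    execute_with_savepoint(conn, "INSERT INTO bike (b_owner, b_description) VALUES (2, 'test2')"),
]
conn.execute("COMMIT")

row_count = conn.execute("SELECT count(*) FROM bike").fetchone()[0]
print(results)
print("rows committed:", row_count)
```

The first insert survives the failed second statement, and the corrected third statement commits together with it.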
28
Performing batch updates
-- session 1
BEGIN;
UPDATE bike SET b_photo_path = '/data/'||b_photo_path;
-- session 2
UPDATE bike SET b_description = 'my awesome bike' WHERE b_id = 1000000;
29
CREATE OR REPLACE FUNCTION findmybike.batch_change_path(p_new_prefix TEXT, p_batch_size INTEGER)
RETURNS VOID AS
$$
BEGIN
WHILE EXISTS(SELECT 1
FROM bike
WHERE b_photo_path NOT LIKE p_new_prefix || '%') LOOP
WITH keys_to_update AS (
SELECT b_id
FROM bike
WHERE b_photo_path NOT LIKE p_new_prefix || '%'
LIMIT p_batch_size
) UPDATE bike b
SET b_photo_path = p_new_prefix || b_photo_path FROM keys_to_update ktu
WHERE b.b_id = ktu.b_id;
END LOOP;
END;
$$
LANGUAGE plpgsql;
Performing batch updates (via a function)
30
CREATE INDEX bike_b_photo_path_idx ON bike(b_photo_path);
-- session 1
SELECT findmybike.batch_change_path('/data1', 100);
-- session 2
UPDATE bike SET b_description = 'my awesome bike' WHERE b_id = 14238019;
A function is always executed in a single transaction
32
CREATE OR REPLACE PROCEDURE findmybike.batch_change_path(p_new_prefix TEXT, p_batch_size INTEGER)
AS
$$
BEGIN
WHILE EXISTS(SELECT 1
FROM bike
WHERE b_photo_path NOT LIKE p_new_prefix || '%') LOOP
WITH keys_to_update AS (
SELECT b_id
FROM bike
WHERE b_photo_path NOT LIKE p_new_prefix || '%'
LIMIT p_batch_size
) UPDATE bike b
SET b_photo_path = p_new_prefix || b_photo_path FROM keys_to_update ktu
WHERE b.b_id = ktu.b_id;
COMMIT;
END LOOP;
END;
$$
LANGUAGE plpgsql;
Performing batch updates (Postgres 11 procedure)
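The same batching idea, update a limited set of rows and commit before the next batch, can also be driven from the client. The sketch below is a client-side analogue and not the procedure from the slides: it runs against an in-memory SQLite database through Python's built-in sqlite3 module, the table is cut down to two columns, and the sample paths and batch size are invented for the demonstration.

```python
import sqlite3


def batch_change_path(conn, new_prefix, batch_size):
    """Prefix b_photo_path in batches, committing after each one.

    Returns the number of batches that changed at least one row.
    """
    batches = 0
    while True:
        cur = conn.execute(
            """
            UPDATE bike
               SET b_photo_path = ? || b_photo_path
             WHERE b_id IN (SELECT b_id
                              FROM bike
                             WHERE b_photo_path NOT LIKE ? || '%'
                             LIMIT ?)
            """,
            (new_prefix, new_prefix, batch_size),
        )
        # Commit per batch so each transaction stays short.
        conn.commit()
        if cur.rowcount == 0:
            return batches
        batches += 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bike (b_id INTEGER PRIMARY KEY, b_photo_path TEXT)")
conn.executemany(
    "INSERT INTO bike (b_photo_path) VALUES (?)",
    [(f"photos/{i}.jpg",) for i in range(250)],  # invented sample data
)
conn.commit()

batches = batch_change_path(conn, "/data/", 100)
remaining = conn.execute(
    "SELECT count(*) FROM bike WHERE b_photo_path NOT LIKE '/data/%'"
).fetchone()[0]
print("batches:", batches, "rows still without prefix:", remaining)
```

Because every batch ends with a commit, a concurrent session only ever waits for the rows of the batch currently in flight rather than for the whole table.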
33
Performing batch updates (Postgres 11 procedure)
select query, backend_xid, xact_start from pg_stat_activity where state = 'active' and pid != (select pg_backend_pid());
-[ RECORD 1 ]---------------------------------------
query | call batch_change_path('/data1', 100);
backend_xid | 117913
xact_start | 2018-07-11 16:01:07.532973+02
select query, backend_xid, xact_start from pg_stat_activity where state = 'active' and pid != (select pg_backend_pid());
-[ RECORD 1 ]---------------------------------------
query | call batch_change_path('/data1', 100);
backend_xid | 118814
xact_start | 2018-07-11 16:01:07.532973+02
35
...we believe it is better to have application
programmers deal with performance problems due to
overuse of transactions as bottlenecks arise, rather than
always coding around the lack of transactions
Spanner: Google’s Globally-Distributed Database paper
36
“HIRE THE BEST PEOPLE YOU CAN, AND GET OUT OF THEIR WAY.”
37
Thank you!
alexk@hintbits.com
@hintbits
38
Snapshots and visibility
Row version 1: xmin: 100, xmax: 104, bl_bike_id: 1, bl_location: (52.526555, 13.408593)
XID: 104: UPDATE bike_location SET bl_location = point(52.527159, 13.396823) WHERE bl_bike_id = 1
Row version 2: xmin: 104, xmax: 106, bl_bike_id: 1, bl_location: (52.527159, 13.396823)
XID: 106: UPDATE bike_location SET bl_location = point(52.569563, 13.403735) WHERE bl_bike_id = 1
Row version 3: xmin: 106, xmax: 0, bl_bike_id: 1, bl_location: (52.569563, 13.403735)
39
Snapshots and visibility
Row version 1: xmin: 100, xmax: 104, bl_bike_id: 1, bl_location: (52.526555, 13.408593)
XID: 104: UPDATE bike_location SET bl_location = point(52.527159, 13.396823) WHERE bl_bike_id = 1
Row version 2: xmin: 104, xmax: 106, bl_bike_id: 1, bl_location: (52.527159, 13.396823)
XID: 106: UPDATE bike_location SET bl_location = point(52.569563, 13.403735) WHERE bl_bike_id = 1
Row version 3: xmin: 106, xmax: 0, bl_bike_id: 1, bl_location: (52.569563, 13.403735)
SELECT bl_location FROM bike_location WHERE bl_bike_id = 1;
{ xid: 105 snapshot xmin: 99, xmax: 106 }, isolation: REPEATABLE READ.
40
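Which row version the SELECT in transaction 105 sees can be worked out with a small model of snapshot visibility. The Python sketch below is a simplification with hypothetical names (`Snapshot`, `xid_visible`, `version_visible`), not PostgreSQL's real visibility code. It assumes that XIDs 100, 104 and 106 all committed and that the snapshot's in-progress list is empty; the slide gives only the snapshot's xmin and xmax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    xmin: int                     # every XID below this had finished
    xmax: int                     # every XID at or above this is invisible
    xip: frozenset = frozenset()  # XIDs in progress when the snapshot was taken


def xid_visible(xid, snap, committed):
    """Effects of xid are visible if it committed before the snapshot was taken."""
    if xid >= snap.xmax or xid in snap.xip:
        return False
    return xid in committed


def version_visible(xmin, xmax, snap, committed):
    """A row version is seen if its creator is visible and its remover is not."""
    if not xid_visible(xmin, snap, committed):
        return False
    return xmax == 0 or not xid_visible(xmax, snap, committed)


# (xmin, xmax, bl_location) for the three row versions from the slides.
ROW_VERSIONS = [
    (100, 104, (52.526555, 13.408593)),
    (104, 106, (52.527159, 13.396823)),
    (106, 0, (52.569563, 13.403735)),
]

snap = Snapshot(xmin=99, xmax=106)  # snapshot of transaction 105 from the slide
committed = {100, 104, 106}         # assumption: all three writers committed

seen = [
    location
    for xmin, xmax, location in ROW_VERSIONS
    if version_visible(xmin, xmax, snap, committed)
]
print(seen)
```

Under these assumptions the query returns the second row version: XID 104 is below the snapshot's xmax, while XID 106 is not, so its update stays invisible to this REPEATABLE READ transaction even after it commits.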

PGDay.Amsterdam 2018 - Oleksii Kliukin - Ace it with ACID: Postgres transactions for fun and profit

  • 1. Ace it with ACID: PostgreSQL transactions for fun and profit. Oleksii Kliukin, @hintbits, 2018-07-12T09:30:00+02:00
  • 3. WE OFFER A SUCCESSFUL AND CURATED ASSORTMENT: > 300,000 articles from ~ 2,000 international brands; 17 private labels; HIGHLY EXPERIENCED category management; > 500 designers & stylists; LOCALIZATION of the assortment; CURATED SHOPPING with Zalon
  • 5. “...we should talk about transactional systems rather than storage when we want to talk about RDBMS and other transaction technologies.” https://masteringpostgresql.com
  • 6. Atomicity, Consistency, Isolation, Durability
  • 7. Every command in PostgreSQL runs as part of an (implicit) transaction. Use SQL transaction blocks, BEGIN … COMMIT, to start transactions explicitly (usually to group multiple statements).
  • 9. CREATE SCHEMA findmybike; SET SEARCH_PATH to findmybike, public; CREATE TABLE bike ( b_id BIGSERIAL PRIMARY KEY, b_owner BIGINT NOT NULL REFERENCES rider (r_id), b_description TEXT, b_photo_path TEXT ); CREATE TABLE bike_location ( bl_bike_id BIGSERIAL REFERENCES bike (b_id), bl_location POINT NOT NULL, bl_last_seen TIMESTAMP WITH TIME ZONE NOT NULL ); CREATE TABLE rider ( r_id BIGSERIAL PRIMARY KEY, r_name TEXT NOT NULL, r_email TEXT UNIQUE NOT NULL, r_backup_email TEXT UNIQUE NOT NULL, r_password TEXT NOT NULL ); 9
  • 10. Non-atomic changes. psql -qh localhost -U owner -d demo -f pgday.amsterdam/findmybike.sql psql:pgday.amsterdam/findmybike.sql:10: ERROR: relation "rider" does not exist psql -q -h localhost -U owner -d demo -c "\dt+ findmybike.*" List of relations Schema | Name | Type | Owner | Size | Description ------------+-------+-------+-------+------------+------------- findmybike | rider | table | owner | 8192 bytes | (1 row)
  • 11. Atomic changes and rollbacks. psql -qh localhost -U owner -d demo -1f pgday.amsterdam/findmybike.sql psql:pgday.amsterdam/findmybike.sql:10: ERROR: relation "rider" does not exist psql:pgday.amsterdam/findmybike.sql:16: ERROR: current transaction is aborted, commands ignored until end of transaction block psql -q -h localhost -U owner -d demo -c "\dt+ findmybike.*" List of relations Schema | Name | Type | Owner | Size | Description --------+------+------+-------+------+------------- (0 rows)
  • 12. MVCC and rollbacks. XID: 104 runs UPDATE bike_location SET bl_location = point(52.527159, 13.396823) WHERE bl_bike_id = 1. Old row version: xmin: 100, xmax: 104, bl_bike_id: 1, bl_location: 52.526555, 13.408593. New row version: xmin: 104, xmax: 0, bl_bike_id: 1, bl_location: 52.527159, 13.396823. COMMIT LOG: XID: 104, rollback
  • 13. MVCC and commits. XID: 104 runs UPDATE bike_location SET bl_location = point(52.527159, 13.396823) WHERE bl_bike_id = 1. Old row version: xmin: 100, xmax: 104, bl_bike_id: 1, bl_location: 52.526555, 13.408593. New row version: xmin: 104, xmax: 0, bl_bike_id: 1, bl_location: 52.527159, 13.396823. COMMIT LOG: XID: 104, commit
  • 14. Database dump: single transaction. pg_dump -h localhost -U robot_backup -d demo -f dump.sql select pid, state, xact_start, query from pg_stat_activity where usename='robot_backup'; -[ RECORD 1 ]- pid | 13719 state | active xact_start | 2018-07-10 13:22:38.07543+02 query | COPY findmybike.bike_location (bl_bike_id, bl_location, bl_last_seen) TO stdout; select pid, state, xact_start, query from pg_stat_activity where usename='robot_backup'; -[ RECORD 1 ]- pid | 13719 state | active xact_start | 2018-07-10 13:22:38.07543+02 query | COPY findmybike.rider (r_id, r_name, r_email, r_backup_email, r_password) TO stdout;
  • 15. Database dump: multi-process. pg_dump -h localhost -d demo -U robot_backup -Fd -j3 -f backup_directory
  • 16. SYNCHRONIZED SNAPSHOTS. EXPORTED SNAPSHOT (xmin: 123, xmax: 135, xip_list: 123, 126, 127). COORDINATOR; WORKER 1, WORKER 2, WORKER 3. TXID 123, TXID 126, TXID 127
  • 17. Atomic data changes. BEGIN; DELETE FROM bike_location USING bike, rider WHERE bl_bike_id = b_id AND b_owner = r_id AND r_email = 'nolongervalid@example.com'; DELETE FROM bike USING rider WHERE b_owner = r_id AND r_email = 'nolongervalid@example.com'; DELETE FROM rider WHERE r_email = 'nolongervalid@example.com'; COMMIT; -- there is a better way of executing those statements at once using a CTE
  • 18. Transactional DDL: testing indexes. BEGIN; CREATE EXTENSION pg_trgm; CREATE INDEX bike_b_description_trgm_idx ON bike USING gin(b_description gin_trgm_ops); EXPLAIN ANALYZE SELECT * FROM bike WHERE b_description LIKE '%orange%'; ROLLBACK;
  • 19. Transactional DDL: tracking changes. BEGIN; ALTER TABLE rider ADD COLUMN r_phone TEXT; CREATE TABLE bike_parking ( bp_id INTEGER PRIMARY KEY, bp_location POINT NOT NULL, bp_name TEXT NOT NULL, bp_description TEXT ); ALTER TABLE bike ADD COLUMN b_parking_id INTEGER REFERENCES bike_parking (bp_id); INSERT INTO versioning.changes (name, description, last_modified, modified_by) VALUES ('FINDMYBIKE-42','Add bike parking and rider phone', now(), current_user); ROLLBACK; -- or COMMIT
  • 20. EXPLAIN ANALYZE dangerous queries. BEGIN; EXPLAIN ANALYZE WITH rider_to_delete AS ( DELETE FROM rider WHERE r_email = 'rider_1@example.com' RETURNING r_id ), bike_to_delete AS ( DELETE FROM bike USING rider_to_delete rtd WHERE b_owner = rtd.r_id RETURNING b_id ) DELETE FROM bike_location USING bike_to_delete btd WHERE bl_bike_id = btd.b_id; ROLLBACK;
  • 21. EXPLAIN ANALYZE dangerous queries Delete on bike_location (cost=636622.13..636630.17 rows=1 width=38) (actual time=35338.529..35338.529 rows=0 loops=1) CTE rider_to_delete -> Delete on rider (cost=0.56..8.58 rows=1 width=6) (actual time=1.354..1.358 rows=1 loops=1) -> Index Scan using rider_r_email_key on rider (cost=0.56..8.58 rows=1 width=6) (actual time=1.325..1.327 rows=1 loops=1) Index Cond: (r_email = 'rider_1@example.com'::text) CTE bike_to_delete -> Delete on bike (cost=0.03..636613.12 rows=1 width=38) (actual time=2.991..35335.634 rows=3 loops=1) -> Hash Join (cost=0.03..636613.12 rows=1 width=38) (actual time=2.985..35335.597 rows=3 loops=1) Hash Cond: (bike.b_owner = rtd.r_id) -> Seq Scan on bike (cost=0.00..561385.60 rows=20060660 width=14) (actual time=0.014..25716.838 rows=20000008 loops=1) -> Hash (cost=0.02..0.02 rows=1 width=40) (actual time=2.953..2.953 rows=1 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 9kB -> CTE Scan on rider_to_delete rtd (cost=0.00..0.02 rows=1 width=40) (actual time=1.359..1.364 rows=1 loops=1) -> Nested Loop (cost=0.44..8.48 rows=1 width=38) (actual time=9315.484..35338.516 rows=1 loops=1) -> CTE Scan on bike_to_delete btd (cost=0.00..0.02 rows=1 width=40) (actual time=2.996..35335.647 rows=3 loops=1) -> Index Scan using bike_location_pkey on bike_location (cost=0.44..8.46 rows=1 width=14) (actual time=0.952..0.952 rows=0 loops=3) Index Cond: (bl_bike_id = btd.b_id) Planning time: 1.057 ms Trigger for constraint bike_b_owner_fkey on rider: time=26197.948 calls=1 Trigger for constraint bike_location_bl_bike_id_fkey on bike: time=0.914 calls=3 Execution time: 61537.489 ms 21
  • 22. EXPLAIN ANALYZE dangerous queries -> Delete on bike (cost=0.03..636613.12 rows=1 width=38) (actual time=2.991..35335.634 rows=3 loops=1) -> Hash Join (cost=0.03..636613.12 rows=1 width=38) (actual time=2.985..35335.597 rows=3 loops=1) Hash Cond: (bike.b_owner = rtd.r_id) -> Seq Scan on bike (cost=0.00..561385.60 rows=20060660 width=14) (actual time=0.014..25716.838 rows=20000008 loops=1) -> Hash (cost=0.02..0.02 rows=1 width=40) (actual time=2.953..2.953 rows=1 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 9kB -> CTE Scan on rider_to_delete rtd (cost=0.00..0.02 rows=1 width=40) (actual time=1.359..1.364 rows=1 loops=1) ... Planning time: 1.057 ms Trigger for constraint bike_b_owner_fkey on rider: time=26197.948 calls=1 Trigger for constraint bike_location_bl_bike_id_fkey on bike: time=0.914 calls=3 Execution time: 61537.489 ms 22
  • 23. CREATE TABLE bike ( b_id BIGSERIAL PRIMARY KEY, b_owner BIGINT NOT NULL REFERENCES rider (r_id), b_description TEXT, b_photo_path TEXT ); CREATE TABLE rider ( r_id BIGSERIAL PRIMARY KEY, r_name TEXT NOT NULL, r_email TEXT UNIQUE NOT NULL, r_backup_email TEXT UNIQUE NOT NULL, r_password TEXT NOT NULL ); -- new index to speed up the delete query. -- Note: cannot be part of a transaction CREATE INDEX CONCURRENTLY bike_b_owner_idx ON bike(b_owner); 23
  • 24. Delete on bike_location (cost=17.63..25.67 rows=1 width=38) (actual time=4.080..4.080 rows=0 loops=1) CTE rider_to_delete -> Delete on rider (cost=0.56..8.58 rows=1 width=6) (actual time=1.840..1.843 rows=1 loops=1) -> Index Scan using rider_r_email_key on rider (cost=0.56..8.58 rows=1 width=6) (actual time=1.802..1.803 rows=1 loops=1) Index Cond: (r_email = 'rider_1@example.com'::text) CTE bike_to_delete -> Delete on bike (cost=0.56..8.61 rows=1 width=38) (actual time=1.900..1.937 rows=3 loops=1) -> Nested Loop (cost=0.56..8.61 rows=1 width=38) (actual time=1.894..1.915 rows=3 loops=1) -> CTE Scan on rider_to_delete rtd (cost=0.00..0.02 rows=1 width=40) (actual time=1.857..1.860 rows=1 loops=1) -> Index Scan using bike_b_owner_idx on bike (cost=0.56..8.58 rows=1 width=14) (actual time=0.033..0.047 rows=3 loops=1) Index Cond: (b_owner = rtd.r_id) -> Nested Loop (cost=0.44..8.48 rows=1 width=38) (actual time=4.061..4.075 rows=1 loops=1) -> CTE Scan on bike_to_delete btd (cost=0.00..0.02 rows=1 width=40) (actual time=1.907..1.949 rows=3 loops=1) -> Index Scan using bike_location_pkey on bike_location (cost=0.44..8.46 rows=1 width=14) (actual time=0.706..0.706 rows=0 loops=3) Index Cond: (bl_bike_id = btd.b_id) Planning time: 1.382 ms Trigger for constraint bike_b_owner_fkey on rider: time=0.392 calls=1 Trigger for constraint bike_location_bl_bike_id_fkey on bike: time=0.437 calls=3 Execution time: 4.999 ms (19 rows) EXPLAIN ANALYZE dangerous queries 24
  • 25. -> Delete on bike (cost=0.56..8.61 rows=1 width=38) (actual time=1.900..1.937 rows=3 loops=1) -> Nested Loop (cost=0.56..8.61 rows=1 width=38) (actual time=1.894..1.915 rows=3 loops=1) -> CTE Scan on rider_to_delete rtd (cost=0.00..0.02 rows=1 width=40) (actual time=1.857..1.860 rows=1 loops=1) -> Index Scan using bike_b_owner_idx on bike (cost=0.56..8.58 rows=1 width=14) (actual time=0.033..0.047 rows=3 loops=1) Index Cond: (b_owner = rtd.r_id) Planning time: 1.382 ms Trigger for constraint bike_b_owner_fkey on rider: time=0.392 calls=1 Trigger for constraint bike_location_bl_bike_id_fkey on bike: time=0.437 calls=3 Execution time: 4.999 ms (19 rows) EXPLAIN ANALYZE dangerous queries 25
  • 26. psql -h localhost -U robot_backup -d demo # BEGIN; # INSERT INTO bike (b_owner, b_description) VALUES (1, 'test'); INSERT 0 1 # INSERT INTO bike (b_owner, b_description) VALUES (2 'test2'); ERROR: syntax error at or near "'test2'" at character 53 insert INTO bike(b_owner, b_description) VALUES (2, 'test2'); ERROR: current transaction is aborted, commands ignored until end of transaction block ROLLBACK; Correcting errors interactively with ON_ERROR_ROLLBACK 26
  • 27. psql -h localhost -U robot_backup -d demo -v ON_ERROR_ROLLBACK=interactive # BEGIN; # INSERT INTO bike (b_owner, b_description) VALUES (1, 'test'); INSERT 0 1 # INSERT INTO bike (b_owner, b_description) VALUES (2 'test2'); ERROR: syntax error at or near "'test2'" at character 53 INSERT INTO bike (b_owner, b_description) VALUES (2, 'test2'); INSERT 0 1 COMMIT; Correcting errors interactively with ON_ERROR_ROLLBACK 27
  • 28. psql -h localhost -U robot_backup -d demo # BEGIN; # SAVEPOINT statement1; # INSERT INTO bike (b_owner, b_description) VALUES (1, 'test'); INSERT 0 1 # RELEASE statement1; # SAVEPOINT statement2; # INSERT INTO bike (b_owner, b_description) VALUES (2 'test2'); ERROR: syntax error at or near "'test2'" at character 53 # ROLLBACK TO statement2; INSERT INTO bike (b_owner, b_description) VALUES (2, 'test2'); INSERT 0 1 COMMIT; Correcting errors interactively with subtransactions 28
  • 29. Performing batch updates -- session 1 BEGIN; UPDATE bike SET b_photo_path = '/data/'||b_photo_path; -- session 2 UPDATE bike SET b_description = 'my awesome bike' WHERE b_id = 1000000; 29
  • 30. CREATE OR REPLACE FUNCTION findmybike.batch_change_path(p_new_prefix TEXT, p_batch_size INTEGER) RETURNS VOID AS $$ BEGIN WHILE EXISTS(SELECT 1 FROM bike WHERE b_photo_path NOT LIKE p_new_prefix || '%') LOOP WITH keys_to_update AS ( SELECT b_id FROM bike WHERE b_photo_path NOT LIKE p_new_prefix || '%' LIMIT p_batch_size ) UPDATE bike b SET b_photo_path = p_new_prefix || b_photo_path FROM keys_to_update ktu WHERE b.b_id = ktu.b_id; END LOOP; END; $$ LANGUAGE plpgsql; Performing batch updates (via a function) 30
  • 31. WHILE EXISTS(SELECT 1 FROM bike WHERE b_photo_path NOT LIKE p_new_prefix || '%') LOOP WITH keys_to_update AS ( SELECT b_id FROM bike WHERE b_photo_path NOT LIKE p_new_prefix || '%' LIMIT p_batch_size ) UPDATE bike b SET b_photo_path = p_new_prefix || b_photo_path FROM keys_to_update ktu WHERE b.b_id = ktu.b_id; END LOOP; Performing batch updates (via a function) 31
  • 32. A function is always executed in a single transaction. CREATE INDEX bike_p_bike_path_idx ON bike(b_photo_path); -- session 1 SELECT findmybike.batch_change_path('/data1', 100); -- session 2 UPDATE bike SET b_description = 'my awesome bike' WHERE b_id = 14238019;
  • 33. CREATE OR REPLACE PROCEDURE findmybike.batch_change_path(p_new_prefix TEXT, p_batch_size INTEGER) AS $$ BEGIN WHILE EXISTS(SELECT 1 FROM bike WHERE b_photo_path NOT LIKE p_new_prefix || '%') LOOP WITH keys_to_update AS ( SELECT b_id FROM bike WHERE b_photo_path NOT LIKE p_new_prefix || '%' LIMIT p_batch_size ) UPDATE bike b SET b_photo_path = p_new_prefix || b_photo_path FROM keys_to_update ktu WHERE b.b_id = ktu.b_id; COMMIT; END LOOP; END; $$ LANGUAGE plpgsql; Performing batch updates (Postgres 11 procedure) 33
  • 34. WHILE EXISTS(SELECT 1 FROM bike WHERE b_photo_path NOT LIKE p_new_prefix || '%') LOOP WITH keys_to_update AS ( SELECT b_id FROM bike WHERE b_photo_path NOT LIKE p_new_prefix || '%' LIMIT p_batch_size ) UPDATE bike b SET b_photo_path = p_new_prefix || b_photo_path FROM keys_to_update ktu WHERE b.b_id = ktu.b_id; COMMIT; END LOOP; Performing batch updates (Postgres 11 procedure) 34
  • 35. Performing batch updates (Postgres 11 procedure) select query, backend_xid, xact_start from pg_stat_activity where state = 'active' and pid != (select pg_backend_pid()); -[ RECORD 1 ]--------------------------------------- query | call batch_change_path('/data1', 100); backend_xid | 117913 xact_start | 2018-07-11 16:01:07.532973+02 select query, backend_xid, xact_start from pg_stat_activity where state = 'active' and pid != (select pg_backend_pid()); -[ RECORD 1 ]--------------------------------------- query | call batch_change_path('/data1', 100); backend_xid | 118814 xact_start | 2018-07-11 16:01:07.532973+02 35
  • 36. “...we believe it is better to have application programmers deal with performance problems due to overuse of transactions as bottlenecks arise, rather than always coding around the lack of transactions.” (Spanner: Google’s Globally-Distributed Database paper)
  • 37. “HIRE THE BEST PEOPLE YOU CAN, AND GET OUT OF THEIR WAY.”
  • 39. Snapshots and visibility. XID: 104 runs UPDATE bike_location SET bl_location = point(52.527159, 13.396823) WHERE bl_bike_id = 1. XID: 106 runs UPDATE bike_location SET bl_location = point(52.569563, 13.403735) WHERE bl_bike_id = 1. Row versions: (xmin: 100, xmax: 104, bl_bike_id: 1, bl_location: 52.526555, 13.408593); (xmin: 104, xmax: 106, bl_bike_id: 1, bl_location: 52.527159, 13.396823); (xmin: 106, xmax: 0, bl_bike_id: 1, bl_location: 52.569563, 13.403735)
  • 40. Snapshots and visibility. XID: 104 runs UPDATE bike_location SET bl_location = point(52.527159, 13.396823) WHERE bl_bike_id = 1. XID: 106 runs UPDATE bike_location SET bl_location = point(52.569563, 13.403735) WHERE bl_bike_id = 1. Row versions: (xmin: 100, xmax: 104, bl_bike_id: 1, bl_location: 52.526555, 13.408593); (xmin: 104, xmax: 106, bl_bike_id: 1, bl_location: 52.527159, 13.396823); (xmin: 106, xmax: 0, bl_bike_id: 1, bl_location: 52.569563, 13.403735). Reader: SELECT bl_location FROM bike_location WHERE bl_bike_id = 1; { xid: 105, snapshot xmin: 99, xmax: 106 }, isolation: REPEATABLE READ.
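
The xmin/xmax bookkeeping on the MVCC and snapshot slides can be modelled in a few lines. The following is a minimal Python sketch of a simplified visibility rule, not PostgreSQL's actual implementation: it ignores a transaction's own changes, command ids, subtransactions and hint bits. The names `Snapshot`, `RowVersion`, `xid_visible` and `version_visible` are hypothetical, and the set of committed XIDs stands in for the commit log. The slides do not list the in-progress XIDs of the reader's snapshot, so the sketch assumes that neither 100 nor 104 is among them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    xmin: int                              # oldest xid that may still be running
    xmax: int                              # first xid the snapshot cannot see
    in_progress: frozenset = frozenset()   # xids running when the snapshot was taken


@dataclass(frozen=True)
class RowVersion:
    xmin: int          # xid that created this version
    xmax: int          # xid that deleted or replaced it, 0 if none
    location: tuple    # bl_location from the slides


def xid_visible(xid, snap, committed):
    """An xid's effects are visible if it finished before the snapshot and committed."""
    if xid >= snap.xmax:
        return False           # started after the snapshot was taken
    if xid in snap.in_progress:
        return False           # still running when the snapshot was taken
    return xid in committed    # simplified commit log lookup


def version_visible(version, snap, committed):
    """A version is visible if its creator is visible and its deleter is not."""
    if not xid_visible(version.xmin, snap, committed):
        return False
    if version.xmax == 0:
        return True
    return not xid_visible(version.xmax, snap, committed)


def visible_versions(versions, snap, committed):
    return [v for v in versions if version_visible(v, snap, committed)]


# Row versions from the "Snapshots and visibility" slides.
ROW_VERSIONS = [
    RowVersion(100, 104, (52.526555, 13.408593)),
    RowVersion(104, 106, (52.527159, 13.396823)),
    RowVersion(106, 0, (52.569563, 13.403735)),
]

if __name__ == "__main__":
    # Reader from slide 40: xid 105, snapshot xmin 99, xmax 106.
    reader = Snapshot(xmin=99, xmax=106)
    for v in visible_versions(ROW_VERSIONS, reader, committed={100, 104, 106}):
        print(v.location)
```

Under these assumptions the REPEATABLE READ reader keeps seeing the version written by XID 104, even after XID 106 commits, because 106 is not below the snapshot's xmax. The same rule covers the rollback slide: if XID 104 never appears in the committed set, the version it wrote stays invisible and the original version (xmin 100) remains the visible one.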