In this talk we'll discuss in detail how modern databases perform backups without downtime, and how those backups can later be used to restore a database to any point in time.
While the talk describes a generally applicable approach, Litestream (a SQLite backup service) is used as the reference implementation.
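As a rough illustration of the idea (not Litestream's actual code — all names here are hypothetical), a live backup scheme can be sketched as a periodic snapshot plus a continuously shipped write-ahead log; point-in-time restore rebuilds the snapshot and replays log records up to the requested timestamp:

```python
from dataclasses import dataclass, field

@dataclass
class WalRecord:
    ts: int      # logical timestamp of the change
    key: str
    value: str

@dataclass
class Backup:
    snapshot: dict          # full copy of the database taken at snapshot_ts
    snapshot_ts: int
    wal: list = field(default_factory=list)  # records shipped after the snapshot

def restore(backup: Backup, target_ts: int) -> dict:
    """Rebuild the database as of target_ts: start from the snapshot,
    then replay WAL records with snapshot_ts < ts <= target_ts."""
    db = dict(backup.snapshot)
    for rec in backup.wal:
        if backup.snapshot_ts < rec.ts <= target_ts:
            db[rec.key] = rec.value
    return db

# Live system: snapshot taken at t=10, writes keep streaming in afterwards.
b = Backup(snapshot={"a": "1"}, snapshot_ts=10)
b.wal += [WalRecord(11, "a", "2"), WalRecord(12, "b", "3"), WalRecord(15, "a", "9")]

# Recover the state just before the unwanted write at t=15.
print(restore(b, 12))  # -> {'a': '2', 'b': '3'}
```

Because the snapshot is taken while writes keep flowing into the log, no downtime is needed; the restore target can be any timestamp covered by the retained WAL.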
A Step-By-Step Disaster Recovery Blueprint & Best Practices for Your NetBacku... - Symantec
In this technical session we will share a few customer-tested blueprints for implementing DR strategies with NetBackup appliances, showing support for onsite and offsite disaster recovery. This spans the architecture design with Symantec best practices down to the execution of the wizards and command lines needed to implement the solution.
Watch the recording of this Google+ Hangout: http://bit.ly/13oTjvp
Apache Kafka is a distributed streaming platform for building event-driven architectures. It provides high throughput and low latency for processing streaming data. Key features include event logging, publish-subscribe messaging, and stream processing. Advantages cited include eventual consistency, scalability, fault tolerance, and being easier to maintain than traditional databases. It requires ZooKeeper, and the Java client API has undergone changes. Performance can be very high: LinkedIn has processed 1.1 trillion messages per day, and 2 million writes per second have been achieved on modest hardware.
Slide 1 - Parallels Plesk Control Panel 8.6.0 - webhostingguy
The document discusses various maintenance items and PTFs for IBM DB2 including:
- PTFs for DB2 Version 8 and z/OS to fix various issues like performance problems, errors, and serviceability enhancements
- New features in recent DB2 releases including support for longer SQL statements in ODBC, improved monitoring of real storage usage, and preliminary support for IBM's Enterprise Workload Manager
- Details on fixes for specific problems like encrypting passwords for distributed data, diagnosing hung threads, and monitoring when dynamic SQL exceeds resource limits.
FOSDEM MySQL & Friends Devroom, February 2018: MySQL Point-in-Time Recovery l... - Frederic Descamps
The document describes how to perform point-in-time recovery (PITR) with MySQL to restore data to a past state. It requires binary logs to be enabled and backups to be kept along with the binlogs. The procedure involves restoring the last backup, finding the binlog position, and replaying binlog events from that position up to the desired point in time. An example demonstrates restoring data after an accidental update, identifying the binlog position to recover to using SHOW BINLOG EVENTS.
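The procedure above can be sketched as a short command sequence (a sketch only — file names, positions, and the stop point are illustrative, not taken from the deck):

```shell
# 1. Restore the most recent full backup taken before the accident.
mysql < full_backup.sql

# 2. Inspect the binlog to locate the position just before the bad statement
#    (SHOW BINLOG EVENTS inside the server works as well).
mysqlbinlog binlog.000042 | less

# 3. Replay events from the backup's recorded position up to that point.
#    --stop-datetime can be used instead of --stop-position.
mysqlbinlog --start-position=4 --stop-position=1234 binlog.000042 | mysql
```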
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDP - Thomas Graf
This talk will start with a deep dive and hands-on examples of BPF, possibly the most promising low-level technology to address challenges in application and network security, tracing, and visibility. We will discuss how BPF evolved from a simple bytecode language for filtering raw sockets for tcpdump to a JIT-compiled virtual machine capable of universally extending and instrumenting both the Linux kernel and user space applications. The introduction is followed by a concrete example of how the Cilium open source project applies BPF to solve networking, security, and load balancing for highly distributed applications. We will discuss and demonstrate how Cilium, with the help of BPF, can be combined with distributed system orchestration such as Docker to simplify security, operations, and troubleshooting of distributed applications.
Buckle Up! With Valerie Burchby and Xinran Waibe | Current 2022 - HostedbyConfluent
Are you considering converting your daily batch ETLs into a new and exhilarating real-time framework? We'll help you look before you leap as we take a deep dive into the unique operational challenges entailed in transitioning data processing paradigms.
Because batch data pipelines consume data from well-defined time intervals and write results to partitioned storage, batch jobs are often idempotent, so failure recovery is simply rerunning the faulty job instances. Batch processes are triggered at a fixed frequency (e.g. daily or hourly), so data latency is determined by both the job scheduler and the job run time. Therefore, many advanced data use cases, such as frequency capping, require event streaming to enable real-time data insights. Event streaming applications process unbounded input data in real time and append output to message queues and/or tables for further processing. However, real-time insights are no free lunch: event streaming comes with many unique engineering challenges, such as handling late-arriving and duplicate events, implementing event-time partitioning, and backfilling historical data after failures. In addition, batch and event streaming are not incompatible with each other and can often be better together, as the Delta and Kappa architectures commonly adopted in modern data systems show.
This document discusses database backup and recovery strategies. It outlines different backup types including logical, physical, hot, and cold backups. It describes how backups can protect a database from failures, increase uptime, and minimize data loss. The document also categorizes different types of failures and whether recovery is needed. It provides details on enabling archive logging mode and performing physical database backups in both open and closed states. Logical backups using Oracle Export and Import utilities are also covered.
1. The document discusses considerations for building a streaming service using Apache Flink, including an overview of Flink's dataflow model, streaming concepts, APIs, operations and monitoring.
2. It provides details on Flink's streaming APIs like ParDo, GroupByKey, windows, process functions and connectors. Monitoring with the Flink dashboard and REST APIs is also covered.
3. Methods for detecting abnormal statuses through metrics and rules are outlined, along with channels for alerts like email, SMS and Slack. The importance of only alerting on meaningful issues is discussed.
CTO Karl Anderson discusses the state of Kazoo. This includes integrations with FreeSWITCH, Erlang, and Kamailio. Reseller milestones include the release of whitelabeling, webhooks, migration, carriers, debugging, account management, and more.
3 Ways to Improve Performance from a Storage Perspective - Perforce
In this session, get three takeaways about Perforce performance benchmarks and their results across varying storage protocols, using NetApp storage as an example. Learn how to use Perforce benchmarks and tools to validate the performance of your Perforce deployment; understand Perforce performance across different storage protocols; and get tips and tricks for deploying Perforce on varying storage technologies.
Intel® Xeon® Processor E5-2600 v4 Big Data Analytics Applications Showcase - Intel IT Center
This document showcases the Intel Xeon processor E5-2600 v4 product family and its performance benefits for big data analytics workloads. It provides examples of several software vendors who saw performance improvements of up to 34% on their applications when testing the new Intel Xeon processors. The document also outlines new processor technologies in areas like performance, orchestration and security. Configuration details are provided for the software vendor testing.
This document discusses using data virtualization to accelerate application projects by 50%. It outlines some common problems with physical data copies, such as bottlenecks, bugs due to old data, difficulty creating subsets, and delays. The document then introduces the concept of using a data virtualization appliance to take snapshots of production data and create thin clones for development and testing environments. This allows for fast, full-sized, self-service clones that can be refreshed quickly. Use cases discussed include improved development and testing workflows, faster production support like recovery and migration, and enabling continuous business intelligence functions.
The document describes Oracle's new parallel upgrade process for multitenant databases using catctl.pl. It allows upgrading the containers within a multitenant container database (CDB) simultaneously, reducing downtime. Key steps include running preupgrd.sql to generate fixup scripts, starting the CDB in upgrade mode, and using catctl.pl to run the upgrade scripts across containers in parallel. This promises to significantly improve upon previous sequential upgrade methods.
This document is the student guide for Oracle9i DBA Fundamentals II. It covers networking concepts, Oracle Net configuration, backup and recovery strategies, and recovery techniques using both user-managed and RMAN-based approaches. The guide contains chapters on topics such as basic Oracle Net architecture, configuring the database archiving mode, and RMAN backups. It is intended to teach students advanced database administration skills.
Chicago Docker Meetup Presentation - Mediafly
This document discusses how Bryan Murphy uses Docker at his company Mediafly. It begins by introducing Bryan and his background. It then describes what Mediafly does, including content management systems, secure content delivery, document and video processing, and customizable user interfaces. The document highlights aspects of Mediafly that make it interesting, such as being multi-device, multi-tenant, service oriented, and distributed. It provides examples of technologies used at Mediafly and some key metrics. The document then discusses why Docker is used at Mediafly, covering benefits like being developer friendly, enabling faster iteration and testing, managing dependencies, sharing environments, standardization, isolation, and infrastructure freedom.
- Fully managed SQL database service hosted on Microsoft Azure that provides predictable performance and pricing with 99.99% availability.
- Offers elastic database pools and data protection services like geo-replication and point-in-time restore.
- Compatible with SQL Server 2014 databases and comes in multiple service tiers based on resources like CPU cores and IOPS.
The document discusses the planned key rollover of the DNSSEC Key Signing Key (KSK) for the root zone from the current KSK-2010 to a new KSK-2017. It provides details on the milestones, approach, and state of the rollover process according to the Automated Updates of DNSSEC Trust Anchors protocol. The rollover was paused in 2017 due to uncertainty in measurement data, but progress has since been made to complete the rollover in 2018.
This document provides a list of media needed to install or upgrade SAP ERP software with Enhancement Package 6. It includes media for Java components, languages, ERP components, enhancement packages, databases like DB2, MaxDB, SQL Server, and operating system kernels for Linux, Windows, AIX, HP-UX, and Solaris. Each media item listed provides the material number, label, and name to use to download it from the SAP software distribution center.
Industry leaders Cisco, NetApp, VMware and Symantec have teamed up to develop a best practice framework and performance benchmark based on the VMware vSphere® Storage APIs - Data Protection (VADP). The test configuration uses the popular NetApp FlexPod environment, and the result proves that you can easily protect over 4 TB of virtual machine data per hour. And improved backup performance creates more reliable backups, shorter backup windows and less impact on the vSphere infrastructure.
In this session, we will show how these performance numbers can be easily obtained with minimal hardware and a small budget. In addition to backup performance, we will also discuss restore performance considerations.
Key topics include:
• How to select the correct hardware for the best ROI
• Strategies for minimizing backup impact and maximizing backup throughput
• Performance characteristics of VADP
• SAN or NBD (network) transports: which is recommended?
• Configurations for the fastest possible restores
Behind-the-scenes stories from a publisher that developers don't know - David Kim
The document discusses various technical aspects of managing a game development project and live game operations. It touches on topics like patching processes, crash reporting, in-game events, and server maintenance. It provides recommendations to simplify processes like patching by only downloading differential files, have servers automatically restart when resource usage exceeds a threshold, and handle in-game events and item giveaways through triggered processes rather than taking servers offline. The overall aim appears to be improving stability, player experience, and developer work-life balance.
Following our EBS R12.1.3 upgrade, we experienced inconsistent runtime and resource utilization with the Accrual Reconciliation Load program. During one month-end close, a plant accountant could run accrual reconciliation for three years of data in under 10 hours. The next month, the same amount of data would take upwards of 30 hours to load. This behavior made it hard to plan for month-end close, a successful accrual load being crucial to finalizing the month. Through multiple rounds of testing, we were able to complete the reconciliation process with nearly 15 years of data in just 1 hour and 45 minutes.
Implementing SharePoint on Azure, Lessons Learnt from a Real World Project - K. Mohamed Faizal
This document discusses lessons learned from implementing SharePoint on Azure. It covers Azure architecture concepts like virtual networks, cloud services, availability sets, and load balancing. It provides an example reference architecture for a hybrid on-premises and Azure environment. It also discusses topics like database planning, disk performance, server topology with multiple tiers, and reserving IP addresses. The presentation aims to share best practices for deploying SharePoint on Azure based on a real-world project.
[NetApp] Managing Big Workspaces with Storage Magic - Perforce
If you work with large volumes of data—multimedia assets, video game art, or firmware designs—you understand the pain of trying to quickly get a copy of source and build assets. But if you have the right storage system, you can be up and running with a new Perforce workspace in minutes instead of hours. See a simple procedure for fast workspace cloning using a few Perforce commands and NetApp FlexClone.
DataEngConf SF16 - Collecting and Moving Data at Scale - Hakka Labs
This document summarizes Sada Furuhashi's presentation on Fluentd, an open source data collector. Fluentd provides a centralized way to collect, filter, and output log data from various sources like applications, servers, and databases. It addresses challenges with typical log collection architectures that have high latency, complex parsing, and a combination explosion of connections. Fluentd uses a plugin-based architecture with input, filter, and output components to flexibly collect, transform, and deliver log data at scale to targets like files, databases and visualization tools. Many large companies like Microsoft, Atlassian and Amazon use Fluentd for log collection and analytics in production environments.
The document discusses monitoring input/output (IO) performance in Oracle Exadata systems. It covers write-back flash cache (WBFC), various methods for monitoring IO using Automatic Workload Repository (AWR) data and cell-level scripts, correlating IO to workload, and scaling monitoring using metric extensions and Business Intelligence Publisher (BIP). The presentation provides examples of visualizing IO performance trends over time using AWR and cell data and measuring the impact of initialization parameters on latency. It also addresses reference bands for disk IO capacity and visualizing storage area workload activity by day per node.
Postgres indexes: how to make them work for your application - Bartosz Sypytkowski
Indexes are among the most crucial structures of any relational database. In this talk we'll explain how to use them efficiently, how to read query plans, and what they mean for us. We'll also cover the variety of indexing structures available in PostgreSQL and build up some intuition about which one to pick depending on the situation.
This presentation covers HyParView and Plumtree, protocols used to build highly scalable clusters capable of gossiping messages among thousands of clients.
Indexes are one of the most crucial structures of any relational database. In this talk we'll explain how to use them efficiently, how to read query plans and what do they mean for us. We'll also cover a variety of different indexing structures available in PostgreSQL database and build up some intuition about which one to pick depending on the situation.
This presentation covers HyParView and Plumtree - protocols used to build highly scalable clusters of data capable of gossiping messages between thousands of clients.
In this talk we'll discuss technical foundations behind Conflict-free Replicated Data Types (CRDT), which let us create collaborative client applications - systems where no reliance on central servers and offline-first capabilities are one of the founding principles. We'll cover some of the challenges bound to this approach and how to address them. Finally we'll present Yrs - Rust library, that allows us to build rich collaborative applications on desktop and browser.
The document provides an overview of PostgreSQL indexes, including the different types: B-Tree, Hash, BRIN, Bloom, GiST, SP-GiST, GIN, and RUM indexes. It explains how each index type stores and organizes data, as well as when each type is best suited in terms of performance, size, and supported query types such as equality scans, range scans, and full-text search. The document also covers index-only scans, bitmap scans, and tuple identifiers to help explain how indexes are used during query execution.
The document discusses modern concurrency primitives like threads, thread pools, coroutines, and schedulers. It covers why asynchronous programming with async/await is preferred over traditional threading. It also discusses challenges like sharing data across threads and blocking on I/O calls. Some solutions covered include using thread pools with dedicated I/O threads, work stealing, and introducing interruption points in long-running tasks.
During this presentation we'll quickly cover the core principles of eventsourced systems and different approaches to scalling event log to distributed workload. We'll focus on peer-to-peer variants of such: what are their advantages and disadvantages and how we can use them.
During this talk we'll cover the theory and practical implementation behind most common patterns in modern multi-threaded programming. How our everyday libraries and frameworks optimize use of operating system resources for maximum efficiency. We'll also try to understand differences between various approaches and what tradeoffs do they infer. Finally we'll take a look at how they are supported by various compilers and runtimes.
Strongly consistent databases are dominating world of software. However, with increasing scale and global availability of our services, many developers often prefer to loose their constraints in favor of an eventual consistency.
During this presentation we'll talk about Conflict-free Replicated Data Types (CRDT) - an eventually-consistent structures, that can be found in many modern day multi-master, geo-distributed databases such as CosmosDB, DynamoDB, Riak, Cassandra or Redis: how do they work and what makes them so interesting choice in highly available systems.
This is presentation from WG.NET (May 2019), where I'm discussing different aspects of virtualization, mainly in the context of programming languages. We'll covering up what stack vs. register based virtual machines are, what is interpreter and compiler and how to build our own bytecode interpreter for a toy programming language.
This document discusses timekeeping in distributed systems. It begins by explaining how different types of clocks work, from pendulum clocks to atomic clocks. It then discusses key concepts like UTC, leap seconds, and how time is represented in Unix. The document also covers challenges of keeping time across distributed systems and algorithms like NTP, vector clocks, and logical clocks that help order events in a distributed system.
This document provides an introduction to Akka.NET Streams and Reactive Streams. It discusses key concepts like observables, async enumerables, and reactive streams. It also demonstrates how to build workflows with Akka.NET streams, including examples of building a TCP server. The document introduces core Akka.NET streams concepts like sources, flows, and sinks, and how they compose together in a runnable graph. It also covers testing streams with probes and materialization.
This is presentiation for Lambda Days 2019, in which I describe details behind building collaborative text editing experience using Replicated Growable Array CRDTs. Later on we come to defining its issues and how to solve them.
1. The document discusses different database storage structures like B+ trees, LSM trees, and their pros and cons for storing structured data on disk.
2. B+ trees are optimized for read performance but require copy-on-write or write-ahead logging for updates. LSM trees prioritize write performance using an append-only structure but require background merging.
3. Bloom filters can help optimize look ups in LSM trees by quickly checking if an element is not present in a collection without accessing all files.
Slides from presentation, I've made on the BuildStuff LT 2018. Here I'm talking about issues, many people have found when using RESTful APIs and how GraphQL addresses them. Also I'm trying to cover the tradeoffs made by the standard, solutions proposed by different implementations and some ideas for the future.
Mobile App Development Company In Noida | Drona InfotechDrona Infotech
Drona Infotech is a premier mobile app development company in Noida, providing cutting-edge solutions for businesses.
Visit Us For : https://www.dronainfotech.com/mobile-application-development/
Consistent toolbox talks are critical for maintaining workplace safety, as they provide regular opportunities to address specific hazards and reinforce safe practices.
These brief, focused sessions ensure that safety is a continual conversation rather than a one-time event, which helps keep safety protocols fresh in employees' minds. Studies have shown that shorter, more frequent training sessions are more effective for retention and behavior change compared to longer, infrequent sessions.
Engaging workers regularly, toolbox talks promote a culture of safety, empower employees to voice concerns, and ultimately reduce the likelihood of accidents and injuries on site.
The traditional method of conducting safety talks with paper documents and lengthy meetings is not only time-consuming but also less effective. Manual tracking of attendance and compliance is prone to errors and inconsistencies, leading to gaps in safety communication and potential non-compliance with OSHA regulations. Switching to a digital solution like Safelyio offers significant advantages.
Safelyio automates the delivery and documentation of safety talks, ensuring consistency and accessibility. The microlearning approach breaks down complex safety protocols into manageable, bite-sized pieces, making it easier for employees to absorb and retain information.
This method minimizes disruptions to work schedules, eliminates the hassle of paperwork, and ensures that all safety communications are tracked and recorded accurately. Ultimately, using a digital platform like Safelyio enhances engagement, compliance, and overall safety performance on site. https://safelyio.com/
Hand Rolled Applicative User ValidationCode KataPhilip Schwarz
Could you use a simple piece of Scala validation code (granted, a very simplistic one too!) that you can rewrite, now and again, to refresh your basic understanding of Applicative operators <*>, <*, *>?
The goal is not to write perfect code showcasing validation, but rather, to provide a small, rough-and ready exercise to reinforce your muscle-memory.
Despite its grandiose-sounding title, this deck consists of just three slides showing the Scala 3 code to be rewritten whenever the details of the operators begin to fade away.
The code is my rough and ready translation of a Haskell user-validation program found in a book called Finding Success (and Failure) in Haskell - Fall in love with applicative functors.
How Can Hiring A Mobile App Development Company Help Your Business Grow?ToXSL Technologies
ToXSL Technologies is an award-winning Mobile App Development Company in Dubai that helps businesses reshape their digital possibilities with custom app services. As a top app development company in Dubai, we offer highly engaging iOS & Android app solutions. https://rb.gy/necdnt
Malibou Pitch Deck For Its €3M Seed Roundsjcobrien
French start-up Malibou raised a €3 million Seed Round to develop its payroll and human resources
management platform for VSEs and SMEs. The financing round was led by investors Breega, Y Combinator, and FCVC.
14 th Edition of International conference on computer visionShulagnaSarkar2
About the event
14th Edition of International conference on computer vision
Computer conferences organized by ScienceFather group. ScienceFather takes the privilege to invite speakers participants students delegates and exhibitors from across the globe to its International Conference on computer conferences to be held in the Various Beautiful cites of the world. computer conferences are a discussion of common Inventions-related issues and additionally trade information share proof thoughts and insight into advanced developments in the science inventions service system. New technology may create many materials and devices with a vast range of applications such as in Science medicine electronics biomaterials energy production and consumer products.
Nomination are Open!! Don't Miss it
Visit: computer.scifat.com
Award Nomination: https://x-i.me/ishnom
Conference Submission: https://x-i.me/anicon
For Enquiry: Computer@scifat.com
What to do when you have a perfect model for your software but you are constrained by an imperfect business model?
This talk explores the challenges of bringing modelling rigour to the business and strategy levels, and talking to your non-technical counterparts in the process.
Preparing Non - Technical Founders for Engaging a Tech AgencyISH Technologies
Preparing non-technical founders before engaging a tech agency is crucial for the success of their projects. It starts with clearly defining their vision and goals, conducting thorough market research, and gaining a basic understanding of relevant technologies. Setting realistic expectations and preparing a detailed project brief are essential steps. Founders should select a tech agency with a proven track record and establish clear communication channels. Additionally, addressing legal and contractual considerations and planning for post-launch support are vital to ensure a smooth and successful collaboration. This preparation empowers non-technical founders to effectively communicate their needs and work seamlessly with their chosen tech agency.Visit our site to get more details about this. Contact us today www.ishtechnologies.com.au
UI5con 2024 - Boost Your Development Experience with UI5 Tooling ExtensionsPeter Muessig
The UI5 tooling is the development and build tooling of UI5. It is built in a modular and extensible way so that it can be easily extended by your needs. This session will showcase various tooling extensions which can boost your development experience by far so that you can really work offline, transpile your code in your project to use even newer versions of EcmaScript (than 2022 which is supported right now by the UI5 tooling), consume any npm package of your choice in your project, using different kind of proxies, and even stitching UI5 projects during development together to mimic your target environment.
Most important New features of Oracle 23c for DBAs and Developers. You can get more idea from my youtube channel video from https://youtu.be/XvL5WtaC20A
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...XfilesPro
Wondering how X-Sign gained popularity in a quick time span? This eSign functionality of XfilesPro DocuPrime has many advancements to offer for Salesforce users. Explore them now!
11. SQLITE ROLLBACK JOURNAL
UPDATE users SET name = 'Joe' WHERE id = 1;
[Diagram: the B+Tree asks the Pager to locate the page holding the record; the Database File contains pages P1, P2, P3, and an empty Rollback Journal sits next to it.]
12. SQLITE ROLLBACK JOURNAL
UPDATE users SET name = 'Joe' WHERE id = 1;
[Diagram: before touching anything, the Pager copies the original versions of the affected pages (P1, P2) from the Database File into the Rollback Journal.]
13. SQLITE ROLLBACK JOURNAL
UPDATE users SET name = 'Joe' WHERE id = 1;
[Diagram: with the originals safely journaled, the Pager modifies the affected pages (P1, P2) in place in the Database File.]
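The rollback-journal flow above can be observed directly with Python's built-in sqlite3 module (the file name test_rj.db is illustrative): in the default DELETE journal mode, a `-journal` side file holding the original page images appears next to the database while a write transaction is open, and disappears on commit.

```python
import os
import sqlite3
import tempfile

# Work in a temp directory so the journal side-file is easy to inspect.
path = os.path.join(tempfile.mkdtemp(), "test_rj.db")

con = sqlite3.connect(path, isolation_level=None)  # autocommit; we issue BEGIN ourselves
con.execute("PRAGMA journal_mode=DELETE")          # classic rollback-journal mode
con.execute("CREATE TABLE users(id INTEGER PRIMARY KEY, name TEXT)")
con.execute("INSERT INTO users(name) VALUES ('Jane')")

con.execute("BEGIN IMMEDIATE")  # open a write transaction
con.execute("UPDATE users SET name = 'Joe' WHERE id = 1")
# While the write transaction is open, the original page images sit in the journal:
journal_during = os.path.exists(path + "-journal")
con.execute("COMMIT")
# On commit the journal is no longer needed and (in DELETE mode) is removed:
journal_after = os.path.exists(path + "-journal")
print(journal_during, journal_after)
```

If the process crashed mid-transaction instead, the next opener would find the hot journal and copy the original pages back, undoing the partial write.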
17. SQLITE WRITE-AHEAD LOG
UPDATE users SET name = 'Joe' WHERE id = 1;
[Diagram: the B+Tree asks the Pager to locate the page holding the record; the Database File contains pages P1, P2, P3, and an empty Write-Ahead Log sits next to it.]
18. SQLITE WRITE-AHEAD LOG
UPDATE users SET name = 'Joe' WHERE id = 1;
[Diagram: instead of modifying the Database File, the Pager appends new versions of the affected pages (P1, P2) to the Write-Ahead Log.]
19. SQLITE WRITE-AHEAD LOG
UPDATE users SET name = 'Joe' WHERE id = 1;
[Diagram: from now on the Pager redirects reads of those pages to their new versions in the Write-Ahead Log.]
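The same experiment works for WAL mode (file name test_wal.db is illustrative): after `PRAGMA journal_mode=WAL`, changed pages land in a `-wal` file next to the database instead of being written to the main file.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test_wal.db")
con = sqlite3.connect(path, isolation_level=None)

# Switch the database to write-ahead-log mode (the setting is persistent).
mode = con.execute("PRAGMA journal_mode=WAL").fetchone()[0]

con.execute("CREATE TABLE users(id INTEGER PRIMARY KEY, name TEXT)")
con.execute("INSERT INTO users(name) VALUES ('Jane')")
con.execute("UPDATE users SET name = 'Joe' WHERE id = 1")

# The changed pages were appended to the -wal file, not the main database file:
wal_size = os.path.getsize(path + "-wal")
print(mode, wal_size)
```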
24. EVERY CHANGED PAGE MUST FIRST BE APPENDED TO THE END OF THE WRITE-AHEAD LOG FILE
25. EVERY CHANGE IN THE DATABASE FILE COMES FROM READING THE WRITE-AHEAD LOG FILE FRONT TO BACK
EVERY CHANGED PAGE MUST FIRST BE APPENDED TO THE END OF THE WRITE-AHEAD LOG FILE
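These two invariants mean that rebuilding the database file is nothing more than a front-to-back replay of the log, where a later frame for a page overwrites an earlier one. A toy model (page numbers and contents are made up; this is not the real WAL byte format):

```python
# Toy model of the two WAL invariants: frames are only ever appended, and
# replaying them front-to-back, letting a later frame for the same page
# overwrite an earlier one, reproduces the database file.
frames = [
    (2, "P2:v1"),  # frame F1: new version of page 2
    (1, "P1:v1"),  # frame F2: new version of page 1
    (2, "P2:v2"),  # frame F3: page 2 changed again
]

database = {1: "P1:v0", 2: "P2:v0", 3: "P3:v0"}  # state before the checkpoint

def checkpoint(db, wal):
    """Apply every frame front-to-back; the last frame per page wins."""
    for page_no, content in wal:
        db[page_no] = content
    return db

checkpoint(database, frames)
print(database)
```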
38. POINT-IN-TIME RECOVERY 101
[Diagram: the Database process writes to its Write-Ahead Log; a Backup service copies completed WAL segments to a Backup drive as files named <first frame>-<last frame>-<timestamp>: F1-F3-2024/02/10/07:54:00, F4-F5-2024/02/10/07:54:10, F6-F7-2024/02/11/11:02:00, F8-F9-2024/02/12/20:30:00. A restore(2024/02/11/10:00:00) request arrives.]
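The restore call can be modeled simply: pick every backed-up WAL segment whose timestamp is at or before the requested point, oldest first, and replay them. A sketch using the segment names from the slide:

```python
from datetime import datetime

# Backed-up WAL segments as on the slide: (first frame, last frame, upload time).
segments = [
    ("F1", "F3", datetime(2024, 2, 10, 7, 54, 0)),
    ("F4", "F5", datetime(2024, 2, 10, 7, 54, 10)),
    ("F6", "F7", datetime(2024, 2, 11, 11, 2, 0)),
    ("F8", "F9", datetime(2024, 2, 12, 20, 30, 0)),
]

def restore(point_in_time):
    """Return the segments to replay, oldest first, for a point-in-time restore."""
    return [s for s in segments if s[2] <= point_in_time]

# restore(2024/02/11/10:00:00) replays only the first two segments:
to_replay = restore(datetime(2024, 2, 11, 10, 0, 0))
print([(first, last) for first, last, _ in to_replay])
```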
39. POINT-IN-TIME RECOVERY 101
[Diagram: the restore starts with the oldest segment, F1-F3-2024/02/10/07:54:00, whose frames are F1 (page P3), F2 (page P4), F3 (page P3).]
40. POINT-IN-TIME RECOVERY 101
[Diagram: the downloaded frames are appended to the local Write-Ahead Log.]
41. POINT-IN-TIME RECOVERY 101
[Diagram: the next segment, F4-F5-2024/02/10/07:54:10, is fetched; its frames are F4 (page P1) and F5 (page P2).]
42. POINT-IN-TIME RECOVERY 101
[Diagram: frames F1-F5 are now replayed into the local Write-Ahead Log; segments newer than the requested point in time (F6-F7, F8-F9) are skipped.]
43. POINT-IN-TIME RECOVERY 101
[Diagram: with all relevant frames replayed, the log is checkpointed into the database file:]
PRAGMA wal_checkpoint(TRUNCATE);
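The checkpoint step can be run from Python too (file name test_ckpt.db is illustrative). In TRUNCATE mode, SQLite moves every frame into the database file and then resets the `-wal` file to zero bytes:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test_ckpt.db")
con = sqlite3.connect(path, isolation_level=None)
con.execute("PRAGMA journal_mode=WAL")
con.execute("CREATE TABLE t(x)")
con.execute("INSERT INTO t VALUES (1)")

before = os.path.getsize(path + "-wal")  # frames accumulated in the WAL

# TRUNCATE: copy all frames back into the database file, then reset the log.
busy, log_frames, ckpt_frames = con.execute(
    "PRAGMA wal_checkpoint(TRUNCATE)").fetchone()
after = os.path.getsize(path + "-wal")

print(before, after, busy)
```

The pragma returns three columns: whether a blocking reader prevented completion, the number of frames in the log, and the number successfully checkpointed.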
44.-48. POINT-IN-TIME RECOVERY 101
[Diagram, repeated across several animation frames: after the checkpoint the replayed pages are merged into the database file, and the restored database reflects the state as of the requested timestamp.]
60. TRANSACTION COMMIT & ROLLBACK
[Diagram: the Write-Ahead Log holds the frames of two transactions. T1: P3 (F1, size_after=0) and P1 (F2, size_after=3); the non-zero size_after marks F2 as a commit frame. The next transaction writes P3 (F3, size_after=0) and P1 (F4, size_after=0), then issues ROLLBACK; its frames are simply abandoned before EOF.]
What if we have already backed up these frames? :/
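The size_after field shown above is real: per the SQLite WAL file format, each 24-byte frame header carries the page number and, for commit frames only, the database size in pages after the commit (zero for all other frames). A small parser of an actual `-wal` file (big-endian fields as documented at sqlite.org/fileformat2; file name is illustrative):

```python
import os
import sqlite3
import struct
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test_frames.db")
con = sqlite3.connect(path, isolation_level=None)
con.execute("PRAGMA journal_mode=WAL")
con.execute("CREATE TABLE t(x)")
con.execute("INSERT INTO t VALUES (1)")

with open(path + "-wal", "rb") as f:
    wal = f.read()

# The WAL file starts with a 32-byte header; bytes 8-11 hold the database
# page size as a big-endian 32-bit integer.
page_size = struct.unpack(">I", wal[8:12])[0]

# Each frame = 24-byte header + one page image. The header begins with:
#   u32 page number, u32 size_after (db size in pages for commit frames, else 0).
frames = []
offset = 32
while offset + 24 + page_size <= len(wal):
    pgno, size_after = struct.unpack(">II", wal[offset:offset + 8])
    frames.append((pgno, size_after))
    offset += 24 + page_size

print(frames)
```

A backup service that streams frames as they appear must therefore watch size_after: frames after the last commit frame may still be rolled back.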
62. POINT-IN-TIME RECOVERY PROBLEM
[Diagram: same setup as before; the Backup drive holds segments F1-F3-2024/02/10/07:54:00, F4-F5-2024/02/10/07:54:10, F6-F7-2024/02/11/11:02:00, F8-F9-2024/02/12/20:30:00, and a restore(2024/02/11/10:00:00) request arrives.]
63. POINT-IN-TIME RECOVERY PROBLEM
[Diagram: the restore replays segments F1-F3 and F4-F5 (the only ones at or before the requested time) into the local Write-Ahead Log.]
64. POINT-IN-TIME RECOVERY PROBLEM
[Diagram: after the restore the database is live again, and a new write arrives:]
INSERT INTO t(name) VALUES ('John Doe');
65. POINT-IN-TIME RECOVERY PROBLEM
[Diagram: the new transaction appends frames to a fresh Write-Ahead Log: F1 (page P5) and F2 (page P1). Frame numbering has restarted from F1.]
66. POINT-IN-TIME RECOVERY PROBLEM
[Diagram: the Backup service uploads the new frames as segment F1-F2-2024/02/13/09:30:00, even though the backup already contains an older segment covering frames F1-F3.]
68. POINT-IN-TIME RECOVERY PROBLEM
[Diagram: the backup now holds two overlapping frame ranges (the old F1-F3 and the new F1-F2) recorded at different times.]
Database history is no longer linear!
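One way out of the fork (conceptually, the approach behind Litestream's "generations") is to never let two histories share an address: tag every backed-up segment with an identifier of the history branch it belongs to, and switch to a fresh identifier after every restore. A toy sketch; the class and method names are mine, not Litestream's actual layout:

```python
import uuid

class BackupIndex:
    """Toy backup index: segments are keyed by (generation, frame range), so a
    post-restore history can reuse frame numbers without clobbering the
    pre-restore one."""

    def __init__(self):
        self.generation = uuid.uuid4().hex[:8]  # current history branch
        self.segments = {}

    def upload(self, first, last, timestamp):
        self.segments[(self.generation, first, last)] = timestamp

    def restored(self):
        # After a point-in-time restore the history forks: new writes belong
        # to a brand-new generation, so old segment keys can never collide.
        self.generation = uuid.uuid4().hex[:8]

b = BackupIndex()
b.upload("F1", "F3", "2024/02/10/07:54:00")
gen_before = b.generation
b.restored()                                  # point-in-time restore happened here
b.upload("F1", "F2", "2024/02/13/09:30:00")   # reused frame numbers, new branch

# Both histories coexist; nothing was overwritten:
print(len(b.segments), gen_before != b.generation)
```

A restore then first picks a generation, and only replays segments within it, making each branch of history linear again.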
87. REFERENCES
How does continuous backup and point-in-time recovery work in databases: https://www.bartoszsypytkowski.com/db-backup-point-in-time-recovery
Litestream: https://litestream.io/
SQLite write-ahead log docs: https://www.sqlite.org/wal.html