[Pgday.Seoul 2017] 3. PostgreSQL WAL Buffers, Clog Buffers Deep Dive - 이근오


  1. PostgreSQL WAL Buffers, Clog Buffers Deep Dive. Version 9.4, 9.6. EXEM | Research & Contents Team. © Copyrights 2001~2017 EXEM CO.,LTD. All Rights Reserved.
  2. INDEX
     • Memory Architecture
     • WAL Buffer, XLOG file
     • CLOG Buffer
  3. Memory Architecture
  4. PostgreSQL Architecture
     • Client Processes: Client Application, Client Interface Library (libpq)
     • Postgres Instance, Server Processes: Postmaster (Daemon/Listener), Postgres Server (backend)
     • System Memory, PG Shared Memory: Shared Buffer, WAL Buffer, CLOG Buffer, Lock Space, Other Buffers
     • System Memory, PerBackend Memory: maintenance_work_mem, temp_buffer, work_mem, catalog_cache, optimizer/executor
     • System Memory: OS Cache
     • Utility Process: WAL Receiver, WAL Sender, Archiver, Stats Collector, Sys Logger, BG Writer, WAL Writer, Autovacuum Launcher
     • Storage Manager: Buffer Manager, Disk Manager, Page Manager, Semaphore & Shared Memory, File Manager, Lock Manager
     • Database Cluster: Sub Directory, Configure File, Lock File
  5. WAL Internal (version 9.6)
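  6. WAL (Write Ahead Log)
     • Definition: a record of every change made to the database's data files (the transaction log).
     • Purpose: if the server stops before a checkpoint has applied the changes to the data files, the changes are read back from this log and replayed as-is, so the server recovers safely.
     • Characteristics: reduces the number of disk writes, which improves performance.
       • Synchronous write (transaction log): processing waits until the data has been written to the physical disk. Waiting on I/O slows processing down.
       • Asynchronous write (data such as tables and indexes): the change is recorded in a buffer and the disk write is only requested, to be handled later; the result is not awaited.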
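The write-ahead rule on this slide can be illustrated with a toy model. This is only a sketch under stated assumptions, not PostgreSQL code: the class `ToyDatabase`, its methods, and the in-memory "disk" are all hypothetical names invented for the illustration.

```python
class ToyDatabase:
    """Toy model of write-ahead logging (hypothetical, not PostgreSQL code)."""

    def __init__(self):
        self.wal = []            # durable log: written synchronously at commit
        self.disk_pages = {}     # durable data files: written only at checkpoints
        self.buffer = {}         # volatile buffer: lost when the server stops
        self.checkpoint_pos = 0  # index of the first WAL record not yet on disk

    def commit(self, key, value):
        self.wal.append((key, value))  # log first (synchronous write)
        self.buffer[key] = value       # data page stays in memory (asynchronous)

    def checkpoint(self):
        self.disk_pages.update(self.buffer)  # flush data pages
        self.checkpoint_pos = len(self.wal)

    def crash_and_recover(self):
        self.buffer = dict(self.disk_pages)  # memory contents are gone
        for key, value in self.wal[self.checkpoint_pos:]:
            self.buffer[key] = value         # replay the log after the checkpoint


db = ToyDatabase()
db.commit("a", 1)
db.checkpoint()
db.commit("b", 2)       # reaches the log, but not the data files
db.crash_and_recover()
print(db.buffer)        # {'a': 1, 'b': 2}
```

The second commit survives the simulated stop only because its log record was written before the commit returned.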
  7. Structure of WAL segment files
     • Segment file size: 16M
     • Number of segments the log is split into: 64 (wal_keep_segments) -> logical log file size 64 * 16 = 1024MB = 1GB
     • max_wal_size (1GB, 64 files)
     • min_wal_size (80MB, 5 files)
     • Each segment file holds 2048 pages in total. Every page starts with a header (XLogLongPageHeaderData or XLogPageHeaderData) followed by Record 1 … Record K (XLogRecord + XLogRecData).
     source: http://blog.163.com/li_hx/blog/static/18399141320117984154925
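The figures on this slide follow from simple arithmetic. A short sketch, assuming the default 8 KB WAL page size (the 8192-byte pages shown on a later slide); the variable names are only illustrative.

```python
SEGMENT_SIZE = 16 * 1024 * 1024  # 16 MB WAL segment file
PAGE_SIZE = 8192                 # 8 KB WAL page (assumed default)
MB = 1024 * 1024

pages_per_segment = SEGMENT_SIZE // PAGE_SIZE
segments_in_1gb = (1024 * MB) // SEGMENT_SIZE   # max_wal_size = 1GB
segments_in_80mb = (80 * MB) // SEGMENT_SIZE    # min_wal_size = 80MB

print(pages_per_segment)  # 2048
print(segments_in_1gb)    # 64
print(segments_in_80mb)   # 5
```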
  8. Structure of a WAL segment
     • File name, e.g. 000000010000000100000000 = timelineId (00000001) + Logical ID (00000001) + WAL File # (00000000)
     • The transaction log (timelineId=1) covers addresses 0x00000000/00000000 to 0xFFFFFFFF/FFFFFFFF and is stored in files of 16 (Mbyte) each:
       00000001 00000000 00000001, 00000001 00000000 00000002, … 00000001 00000000 000000FF,
       00000001 00000001 00000000, … 00000001 00000001 000000FF,
       00000001 00000002 00000000, …… 00000001 FFFFFFFF 000000FF
     • Example: LSN=0x00000001/00002D3E is stored in 000000010000000100000000
     source: http://www.interdb.jp/pg/pgsql09.html#_9.4.2
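The naming scheme above can be sketched as a small calculation. This is an illustration assuming 16 MB segments and the three-part name shown on the slide; `wal_file_name` and `offset_in_segment` are hypothetical helper names, not PostgreSQL functions. The second call uses the LSN 159/D4000098 from the pg_xlogdump slide.

```python
SEGMENT_SIZE = 16 * 1024 * 1024                         # 16 MB per file
SEGMENTS_PER_LOGICAL_ID = 0x100000000 // SEGMENT_SIZE   # 256 files per logical ID

def wal_file_name(timeline_id, lsn_hi, lsn_lo):
    """Return the 24-hex-digit segment file name that contains the LSN."""
    segno = ((lsn_hi << 32) | lsn_lo) // SEGMENT_SIZE
    return "%08X%08X%08X" % (
        timeline_id,
        segno // SEGMENTS_PER_LOGICAL_ID,   # Logical ID
        segno % SEGMENTS_PER_LOGICAL_ID,    # WAL File #
    )

def offset_in_segment(lsn_lo):
    """Byte offset of the LSN inside its 16 MB segment file."""
    return lsn_lo % SEGMENT_SIZE

print(wal_file_name(1, 0x00000001, 0x00002D3E))  # 000000010000000100000000
print(wal_file_name(1, 0x159, 0xD4000098))       # 0000000100000159000000D4
print(hex(offset_in_segment(0xD4000098)))        # 0x98
```

The offset 0x98 matches the position of the INSERT record in the WAL segment hex dump on the later slide.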
  9. WAL segment file 000000010000000100000000, 16 (Mbyte), is divided into pages of 8192 (byte). The first page starts with XLogLongPageHeaderData and the following pages start with XLogPageHeaderData; the rest of each page holds XLOG records. Each XLOG record consists of an XLog Record Header and XLog Record Data.
  10. Page layout (8KB): XLogLongPageHeaderData or XLogPageHeaderData, then XLog records (XLog Record Header + XLog record data)
      XLogLongPageHeaderData
      • XLogPageHeaderData std: standard header fields
      • uint64 xlp_sysid: system identifier from pg_control
      • uint32 xlp_seg_size: just as a cross-check
      • uint32 xlp_xlog_blcksz: just as a cross-check
      XLogPageHeaderData
      • uint16 xlp_magic: magic value for correctness checks
      • uint16 xlp_info: flag bits
      • TimeLineID xlp_tli: TimeLineID of first record on page
      • XLogRecPtr xlp_pageaddr: XLOG address of this page
      • uint32 xlp_rem_len: total len of remaining data for record
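The two header structs can be read with Python's `struct` module. This is a sketch under assumptions: a little-endian 64-bit build, 4 bytes of alignment padding after `xlp_rem_len`, and the `XLP_LONG_HEADER` flag bit (0x0002) marking a long header. The sample bytes are synthetic, including the placeholder magic value 0xABCD, and `parse_page_header` is a hypothetical helper.

```python
import struct

XLP_LONG_HEADER = 0x0002                   # flag bit in xlp_info
SHORT_HEADER = struct.Struct("<HHIQI4x")   # XLogPageHeaderData, padded to 24 bytes
LONG_EXTRA = struct.Struct("<QII")         # extra fields of the long header

def parse_page_header(page):
    """Decode the page header at the start of a WAL page."""
    magic, info, tli, pageaddr, rem_len = SHORT_HEADER.unpack_from(page, 0)
    header = {
        "xlp_magic": magic,
        "xlp_info": info,
        "xlp_tli": tli,
        "xlp_pageaddr": "%X/%08X" % (pageaddr >> 32, pageaddr & 0xFFFFFFFF),
        "xlp_rem_len": rem_len,
    }
    if info & XLP_LONG_HEADER:
        sysid, seg_size, blcksz = LONG_EXTRA.unpack_from(page, SHORT_HEADER.size)
        header.update(xlp_sysid=sysid, xlp_seg_size=seg_size, xlp_xlog_blcksz=blcksz)
    return header

# Synthetic sample: placeholder magic and system identifier, not real values.
sample = (SHORT_HEADER.pack(0xABCD, XLP_LONG_HEADER, 1, (0x159 << 32) | 0xD4000000, 0)
          + LONG_EXTRA.pack(1234567890, 16 * 1024 * 1024, 8192))
header = parse_page_header(sample)
print(header["xlp_pageaddr"], header["xlp_seg_size"], header["xlp_xlog_blcksz"])
```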
  11. • Header part: XLogRecord (general header portion); XLogRecordBlockHeader 1, 2, … N, each optionally followed by XLogRecordBlockImageHeader and XLogRecordBlockCompressHeader; XLogRecordDataHeader (Short/Long)
      • Data part (data portion): block data 1, block data 2, … block data N; main data
      • Reason for the change: up to version 9.4 there was no uniform format for XLOG records. Each resource manager defined its own format, which made the source code increasingly hard to maintain.
  12. XLOG record with a backup block (sizes in bytes)
      • ~ 9.4: XLogRecord (32) | XLog Record data: BkpBlock (24), backup block (page: pd_lsn …, 1, 2, Tuple B, Tuple A)
      • 9.5 ~: XLogRecord (24) | XLogRecordBlockHeader + XLogRecordBlockImageHeader (25) | XLogRecordDataHeaderShort (2) | block data 1: backup block (page: pd_lsn …, 1, 2, Tuple B, Tuple A) | main data: xl_heap_insert (3)
  13. XLOG record without a backup block (sizes in bytes)
      • ~ 9.4: XLogRecord (32) | XLOG record data: xl_heap_insert (24), xl_heap_header (6), Tuple B data
      • 9.5 ~: XLogRecord (24) | XLogRecordBlockHeader (20) | XLogRecordDataHeaderShort (2) | block data 1: xl_heap_header (5), Tuple B data | main data: xl_heap_insert (3)
  14. CheckPoint XLOG record (sizes in bytes)
      • ~ 9.4: XLogRecord (32) | XLOG record data: CheckPoint (72)
      • 9.5 ~: XLogRecord (24) | XLogRecordDataHeaderShort (2) | main data: CheckPoint (80)
  15. pg_xlogdump
      postgres=# insert into t6 values ('a');
      rmgr: Heap len (rec/tot): 3/ 66, tx: 118903881, lsn: 159/D4000098, prev 159/D4000028, desc: INSERT+INIT off 1, blkref #0: rel 1663/13323/22530 blk 0
      Item: Description
      • rmgr: resource manager (0 XLOG, 1 Transaction, 2 Storage, 3 CLOG, 4 Database, 5 Tablespace, 6 MultiXact, 7 RelMap, 8 Standby, 9 Heap2, 10 Heap, 11 Btree, 12 Hash, 13 Gin, 14 Gist, 15 Sequence, 16 SPGist)
      • len (rec): length of the WAL record excluding the WAL record header and backup blocks -> xl_heap_insert/delete/update
      • len (tot): total length of the WAL record
      • tx: transaction ID
      • lsn: logical ID / WAL segment number + block offset
      • prev: location of the immediately preceding WAL record (previous lsn)
      • desc: transaction information (insert, delete, update, truncate …) and relation information (tablespace/database/relfilenode)
  16. postgres=# insert into t6 values ('a');
      rmgr: Heap len (rec/tot): 3/66, tx: 118903881, lsn: 159/D4000098, prev: 159/D4000028, desc: INSERT+INIT off 1, blkref #0: rel 1663/13323/22530 blk
      WAL Segment File
      000090 00 00 00 00 00 00 00 00 42 00 00 00 49 54 16 07 >........B...IT..<
      0000a0 28 00 00 d4 59 01 00 00 80 0a 00 00 68 ed fe 57 >(...Y.......h..W<
      0000b0 00 60 11 00 7f 06 00 00 0b 34 00 00 02 58 00 00 >.`.......4...X..<
      0000c0 00 00 00 00 ff 03 01 00 02 08 18 00 17 61 20 20 >.............a  <
      0000d0 20 20 20 20 20 20 20 01 00 00 00 00 00 00 00 00 >       .........<
      Page (base/13323/22530)
      000000 59 01 00 00 e0 00 00 d4 00 00 00 00 1c 00 d8 1f >Y...............<
      000010 00 20 04 20 00 00 00 00 d8 9f 46 00 00 00 00 00 >. . ......F.....<
      000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >................<
      *
      001fd0 00 00 00 00 00 00 00 00 49 54 16 07 00 00 00 00 >........IT......<
      001fe0 00 00 00 00 00 00 00 00 01 00 01 00 02 08 18 00 >................<
      001ff0 17 61 20 20 20 20 20 20 20 20 20 00 00 00 00 00 >.a         .....<
      002000
  17. Record layout (bytes): XLogRecord Header (24) | XR Block Header (20) | XR Data Header Short (2) | xl_heap_header (5) | Tuple data | xl_heap_insert (3)
      XLog Record Header (24 Bytes)
      Member | Size (byte) | Description | Value
      xl_tot_len | 4 | total len of entire record | 42 00 00 00
      xl_xid | 4 | transaction id | 49 54 16 07
      xl_prev | 8 | ptr to previous record in log | 28 00 00 d4 59 01 00 00
      xl_info | 1 | flag bits | 80
      xl_rmid | 1 | resource manager for this record | 0a
      padding | 2 | . | 00 00
      xl_crc | 4 | CRC for this record | 68 ed fe 57
      XLog Record Block Header (20 Bytes)
      Member | Size (byte) | Description | Value
      id | 1 | block reference ID | 00
      fork_flags | 1 | fork within the relation and flags | 60
      data_length | 2 | number of payload bytes (not including page image) | 11 00
      block ref | 16 (4/4/4/4) | tablespace/database/relfilenode/block number | 7f 06 00 00 0b 34 00 00 02 58 00 00 00 00 00 00
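The byte values in these two tables can be decoded and checked against the pg_xlogdump line. This is a sketch assuming a little-endian machine; the last 16 bytes of the block header are read here as three 4-byte OIDs plus a 4-byte block number, which is what reproduces `rel 1663/13323/22530 blk 0`.

```python
import struct

# XLog Record Header bytes from the slide (24 bytes)
record_header = bytes.fromhex(
    "42 00 00 00 49 54 16 07 28 00 00 d4 59 01 00 00 80 0a 00 00 68 ed fe 57")
xl_tot_len, xl_xid, xl_prev, xl_info, xl_rmid, xl_crc = struct.unpack(
    "<IIQBB2xI", record_header)

# XLog Record Block Header bytes from the slide (20 bytes)
block_header = bytes.fromhex(
    "00 60 11 00 7f 06 00 00 0b 34 00 00 02 58 00 00 00 00 00 00")
block_id, fork_flags, data_length, spc, db, rel, blkno = struct.unpack(
    "<BBHIIII", block_header)

prev_lsn = "%X/%08X" % (xl_prev >> 32, xl_prev & 0xFFFFFFFF)

print(xl_tot_len, xl_xid, prev_lsn, xl_rmid)  # 66 118903881 159/D4000028 10
print(data_length, spc, db, rel, blkno)       # 17 1663 13323 22530 0
```

The results line up with the dump output: total length 66, tx 118903881, prev 159/D4000028, and resource manager 10 (Heap).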
  18. XLog Record Data Header Short (2 Bytes)
      Member | Size (byte) | Description | Value
      id | 1 | XLR_BLOCK_ID_DATA_SHORT | ff
      data_length | 1 | number of payload bytes (length of xl_heap_insert) | 03
      xl_heap_header (5 Bytes)
      Member | Size (byte) | Description | Value
      t_infomask2 | 2 | number of attributes + various flags | 01 00
      t_infomask | 2 | various flag bits | 02 08
      t_hoff | 1 | size of header incl. bitmap, padding | 18
      xl_heap_insert (3 Bytes)
      Member | Size (byte) | Description | Value
      offnum | 2 | inserted tuple's offset | 01 00
      flags | 1 | XLH_INSERT_ALL_VISIBLE_CLEARED (1<<0), XLH_INSERT_LAST_IN_MULTI (1<<1), XLH_INSERT_IS_SPECULATIVE (1<<2), XLH_INSERT_CONTAINS_NEW_TUPLE (1<<3) | 00
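The remaining bytes of the INSERT record (offsets 0xc4 to 0xd9 in the WAL segment dump) can be decoded the same way. A sketch assuming a little-endian build and the 5-byte xl_heap_header used in the record layout (24 / 20 / 2 / 5 / 3). Reading the byte 0x17 as a 1-byte varlena header whose upper 7 bits hold the total length is an assumption about how the column value is stored.

```python
import struct

data_header = bytes.fromhex("ff 03")     # XLogRecordDataHeaderShort
block_data = bytes.fromhex(              # 17 bytes: xl_heap_header + tuple data
    "01 00 02 08 18 00 17 61 20 20 20 20 20 20 20 20 20")
main_data = bytes.fromhex("01 00 00")    # xl_heap_insert

header_id, main_data_len = struct.unpack("<BB", data_header)
t_infomask2, t_infomask, t_hoff = struct.unpack_from("<HHB", block_data, 0)
tuple_data = block_data[5:]              # everything after the 5-byte header
offnum, flags = struct.unpack("<HB", main_data)

# tuple_data[0] is a padding byte; tuple_data[1] is the assumed varlena header.
value_len = (tuple_data[1] >> 1) - 1     # total length minus the header byte
value = tuple_data[2:2 + value_len].decode("ascii")

print(main_data_len, t_hoff, len(tuple_data), offnum, flags)  # 3 24 12 1 0
print(repr(value))                                            # 'a         '
```

Adding up the parts gives 24 + 20 + 2 + 17 + 3 = 66 bytes, the total record length reported by pg_xlogdump.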
  19. postgres=# update t6 set id = 'b' where id = 'a';
      rmgr: Heap len (rec/tot): 14/177, tx: 118903882, lsn: 159/D4000178, prev: 159/D4000108, desc: HOT_UPDATE off 1 xmax 118903882 ; new off 2 xmax 0, blkref #0: rel 1663/13323/22530 blk 0 FPW
      WAL Segment File (the page image embedded in the record is the backup block)
      000170 00 00 00 00 00 00 00 00 b1 00 00 00 4a 54 16 07 >............JT..<
      000180 08 01 00 d4 59 01 00 00 40 0a 00 00 1e eb f4 f3 >....Y...@.......<
      000190 00 10 00 00 70 00 20 00 01 7f 06 00 00 0b 34 00 >....p. .......4.<
      0001a0 00 02 58 00 00 00 00 00 00 ff 0e 59 01 00 00 e0 >..X........Y....<
      0001b0 00 00 d4 00 00 00 00 20 00 b0 1f 00 20 04 20 4a >....... .... . J<
      0001c0 54 16 07 d8 9f 46 00 b0 9f 46 00 4a 54 16 07 00 >T....F...F.JT...<
      0001d0 00 00 00 00 00 00 00 00 00 00 00 02 00 01 80 02 >................<
      0001e0 28 18 00 17 62 20 20 20 20 20 20 20 20 20 00 00 >(...b         ..<
      0001f0 00 00 00 49 54 16 07 4a 54 16 07 00 00 00 00 00 >...IT..JT.......<
      000200 00 00 00 02 00 01 40 02 01 18 00 17 61 20 20 20 >......@.....a   <
      000210 20 20 20 20 20 20 00 00 00 00 00 4a 54 16 07 01 >      .....JT...<
      000220 00 00 40 00 00 00 00 02 00 00 00 00 00 00 00 00 >..@.............<
      Page (base/13323/22530)
      000000 59 01 00 00 30 02 00 d4 00 00 00 00 20 00 b0 1f >Y...0....... ...<
      000010 00 20 04 20 4a 54 16 07 d8 9f 46 00 b0 9f 46 00 >. . JT....F...F.<
      000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >................<
      *
      001fb0 4a 54 16 07 00 00 00 00 00 00 00 00 00 00 00 00 >JT..............<
      001fc0 02 00 01 80 02 28 18 00 17 62 20 20 20 20 20 20 >.....(...b      <
      001fd0 20 20 20 00 00 00 00 00 49 54 16 07 4a 54 16 07 >   .....IT..JT..<
      001fe0 00 00 00 00 00 00 00 00 02 00 01 40 02 01 18 00 >...........@....<
      001ff0 17 61 20 20 20 20 20 20 20 20 20 00 00 00 00 00 >.a         .....<
      002000
  20. Record layout (bytes): XLogRecord Header (24) | XR Block Header + XR Block Image Header (25) | XR Data Header Short (2) | Backup Block | xl_heap_update (14)
      XLog Record Block Header (20 Bytes)
      Member | Size (byte) | Description | Value
      id | 1 | block reference ID | 00
      fork_flags | 1 | fork within the relation and flags | 10
      data_length | 2 | number of payload bytes (not including page image) | 00 00
      block ref | 16 (4/4/4/4) | tablespace/database/relfilenode/block number | 7f 06 00 00 0b 34 00 00 02 58 00 00 00 00 00 00
      XLog Record Block Image Header (5 Bytes), stored between data_length and block ref
      Member | Size (byte) | Description | Value
      length | 2 | number of page image bytes | 70 00
      hole_offset | 2 | number of bytes before "hole" | 20 00
      bimg_info | 1 | flag bits: 0x01 BKPIMAGE_HAS_HOLE, 0x02 BKPIMAGE_IS_COMPRESSED | 01
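In the WAL segment dump of the UPDATE record, the five bytes `70 00 20 00 01` that follow the first four block header bytes (`00 10 00 00`) can be read as the block image header. The sketch below decodes them and shows how a page image with a "hole" expands back to a full page. It assumes an 8192-byte block size and an uncompressed image, so that the hole length equals the block size minus the image length; `restore_page` is a hypothetical helper and the image contents are placeholder bytes.

```python
import struct

BLCKSZ = 8192
BKPIMAGE_HAS_HOLE = 0x01

image_header = bytes.fromhex("70 00 20 00 01")
length, hole_offset, bimg_info = struct.unpack("<HHB", image_header)

def restore_page(image, hole_offset, has_hole, blcksz=BLCKSZ):
    """Re-insert the zero-filled hole that was left out of the page image."""
    if not has_hole:
        return bytes(image)
    hole_length = blcksz - len(image)
    return image[:hole_offset] + bytes(hole_length) + image[hole_offset:]

hole_length = BLCKSZ - length
placeholder_image = bytes([0xAA]) * length   # stand-in for the real 112 bytes
page = restore_page(placeholder_image, hole_offset, bimg_info & BKPIMAGE_HAS_HOLE)

print(length, hole_offset, hole_length, len(page))  # 112 32 8080 8192
print(hex(hole_offset + hole_length))               # 0x1fb0
```

The hole ends at 0x1fb0, which is where the tuple data resumes in the page dump, and 24 + 4 + 5 + 16 + 2 + 112 + 14 = 177 matches the total record length.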
  21. XLog Record Data Header Short (2 Bytes)
      Member | Size (byte) | Description | Value
      id | 1 | XLR_BLOCK_ID_DATA_SHORT | ff
      data_length | 1 | number of payload bytes (length of xl_heap_update) | 0e
      xl_heap_update (14 Bytes)
      Member | Size (byte) | Description | Value
      old_xmax | 4 | xmax of the old tuple | 4a 54 16 07
      old_offnum | 2 | old tuple's offset | 01 00
      old_infobits_set | 1 | infomask bits to set on old tuple | 00
      flags | 1 | see flag list below | 40
      new_xmax | 4 | xmax of the new tuple | 00 00 00 00
      new_offnum | 2 | new tuple's offset | 02 00
      flags values:
      • XLH_UPDATE_OLD_ALL_VISIBLE_CLEARED (1<<0) /* PD_ALL_VISIBLE was cleared */
      • XLH_UPDATE_NEW_ALL_VISIBLE_CLEARED (1<<1) /* PD_ALL_VISIBLE was cleared in the 2nd page */
      • XLH_UPDATE_CONTAINS_OLD_TUPLE (1<<2)
      • XLH_UPDATE_CONTAINS_OLD_KEY (1<<3)
      • XLH_UPDATE_CONTAINS_NEW_TUPLE (1<<4)
      • XLH_UPDATE_PREFIX_FROM_OLD (1<<5)
      • XLH_UPDATE_SUFFIX_FROM_OLD (1<<6)
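The 14 bytes of xl_heap_update listed above can be unpacked and the flags byte expanded into names. A sketch assuming a little-endian build; the flag names and bit positions are the ones listed on the slide.

```python
import struct

main_data = bytes.fromhex("4a 54 16 07 01 00 00 40 00 00 00 00 02 00")
(old_xmax, old_offnum, old_infobits_set,
 flags, new_xmax, new_offnum) = struct.unpack("<IHBBIH", main_data)

FLAG_NAMES = {
    1 << 0: "XLH_UPDATE_OLD_ALL_VISIBLE_CLEARED",
    1 << 1: "XLH_UPDATE_NEW_ALL_VISIBLE_CLEARED",
    1 << 2: "XLH_UPDATE_CONTAINS_OLD_TUPLE",
    1 << 3: "XLH_UPDATE_CONTAINS_OLD_KEY",
    1 << 4: "XLH_UPDATE_CONTAINS_NEW_TUPLE",
    1 << 5: "XLH_UPDATE_PREFIX_FROM_OLD",
    1 << 6: "XLH_UPDATE_SUFFIX_FROM_OLD",
}
set_flags = [name for bit, name in FLAG_NAMES.items() if flags & bit]

print(old_xmax, old_offnum, new_xmax, new_offnum)  # 118903882 1 0 2
print(set_flags)                                   # ['XLH_UPDATE_SUFFIX_FROM_OLD']
```

The decoded values agree with the dump description: `off 1 xmax 118903882 ; new off 2 xmax 0`.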
  22. CLOG Buffer
  23. XID Mapping Table [figure: grid of 2-bit transaction status entries, 1 Byte per row, 256K per file]
      • ./pg_clog dir: files 0000, 0001, 0002, 0003, …
      • 1 transaction = 2 bits, so each byte holds four XIDs (000-003, 004-007, 008-011, 012-015, 016-019, 020-023, …)
      • 4 * 256K = one file covers 1M transactions
      • A tuple's t_xmin and t_xmax (e.g. XID 001, XID 016) are looked up in this table
      • Status values (src/include/access/clog.h):
        #define TRANSACTION_STATUS_IN_PROGRESS 0x00
        #define TRANSACTION_STATUS_COMMITTED 0x01
        #define TRANSACTION_STATUS_ABORTED 0x02
        #define TRANSACTION_STATUS_SUB_COMMITTED 0x03
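The mapping from an XID to its 2-bit slot can be sketched as follows. It assumes 8 KB pages and 32 pages per 256K file, consistent with the sizes on this slide; `clog_position` and `status_from_byte` are hypothetical helper names, and the byte value passed to `status_from_byte` is made up for the example. The XID used is the one from the INSERT example (118903881).

```python
BLCKSZ = 8192
CLOG_BITS_PER_XACT = 2
CLOG_XACTS_PER_BYTE = 4
CLOG_XACTS_PER_PAGE = BLCKSZ * CLOG_XACTS_PER_BYTE   # 32768 XIDs per page
PAGES_PER_FILE = 32                                  # 32 * 8 KB = 256K per file

STATUS_NAMES = {0x00: "IN_PROGRESS", 0x01: "COMMITTED",
                0x02: "ABORTED", 0x03: "SUB_COMMITTED"}

def clog_position(xid):
    """Locate the status bits of a transaction ID inside pg_clog."""
    page = xid // CLOG_XACTS_PER_PAGE
    return {
        "file": "%04X" % (page // PAGES_PER_FILE),
        "page_in_file": page % PAGES_PER_FILE,
        "byte_in_page": (xid % CLOG_XACTS_PER_PAGE) // CLOG_XACTS_PER_BYTE,
        "bit_shift": (xid % CLOG_XACTS_PER_BYTE) * CLOG_BITS_PER_XACT,
    }

def status_from_byte(byte_value, bit_shift):
    """Extract one transaction's 2-bit status from a CLOG byte."""
    return STATUS_NAMES[(byte_value >> bit_shift) & 0x03]

pos = clog_position(118903881)
print(PAGES_PER_FILE * CLOG_XACTS_PER_PAGE)            # 1048576
print(pos)
print(status_from_byte(0b00000100, pos["bit_shift"]))  # COMMITTED
```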
  24. ClogCtlData (SlruCtlData): shared, do_fsync, PagePrecedes, Dir
      shared
      Member | Value
      num_slots | 4
      page_buffer | 1c80 3c80 5c80 7c80
      page_status | 2 2 2 2
      page_dirty | 0 0 0 0
      page_number | 0x44 0x41 0x42 0x43
      … | …
      latest_page_number | 0x44
  25. • Disk: file 0000 (F) with pages 0000(p), 0001(p); file 0001 (F) with pages 0032(p), 0033(p), 0034(p), 0035(p), 0036(p), 0037(p), 0038(p), 0039(p), 0040(p), 0041(p), 0042(p), 0043(p), 0044(p), 0045(p)
      • Memory: pages 0044, 0041, 0042, 0043
  26. ClogCtlData (SlruCtlData)
      • shared (8 Bytes)
      • do_fsync (8 Bytes)
      • PagePrecedes (8 Bytes)
      • Dir (64 Bytes)
      000000e3dcc0 80 8b 4e e1 58 7f 00 00 01 00 00 00 00 00 00 00 >..N.X...........<
      000000e3dcd0 24 a0 50 00 00 00 00 00 70 67 5f 63 6c 6f 67 00 >$.P.....pg_clog.<
      000000e3dce0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >................<
      000000e3dcf0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >................<
      000000e3dd00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >................<
      000000e3dd10 00 00 00 00 00 00 00 00 >........<
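The dump above spans 88 bytes (0xe3dcc0 to 0xe3dd17), which can be unpacked into the four members. A sketch assuming a little-endian 64-bit build where the pointers are 8 bytes, `do_fsync` is a 1-byte boolean padded to 8 bytes, and the directory name occupies the remaining 64 bytes; the Python variable names are illustrative only.

```python
import struct

raw = bytes.fromhex(
    "80 8b 4e e1 58 7f 00 00 01 00 00 00 00 00 00 00 "
    "24 a0 50 00 00 00 00 00 70 67 5f 63 6c 6f 67 00"
) + bytes(56)   # the remaining dump rows are all zero

# pointer, bool + 7 bytes padding, function pointer, 64-byte directory name
shared_ptr, do_fsync, page_precedes_ptr, dir_buf = struct.unpack("<Q?7xQ64s", raw)
directory = dir_buf.rstrip(b"\x00").decode("ascii")

print(hex(shared_ptr))         # 0x7f58e14e8b80
print(do_fsync)                # True
print(hex(page_precedes_ptr))  # 0x50a024
print(directory)               # pg_clog
```

The `shared` pointer value 0x7f58e14e8b80 is the address at which the ClogCtlData.shared dump on the next slide begins.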
  27. ClogCtlData.shared (SlruShared)
      7f58e14e8b80 00 d7 4c e1 58 7f 00 00 04 00 00 00 00 00 00 00 >..L.X...........<
      7f58e14e8b90 10 8c 4e e1 58 7f 00 00 30 8c 4e e1 58 7f 00 00 >..N.X...0.N.X...<
      7f58e14e8ba0 40 8c 4e e1 58 7f 00 00 48 8c 4e e1 58 7f 00 00 >@.N.X...H.N.X...<
      7f58e14e8bb0 58 8c 4e e1 58 7f 00 00 68 8c 4e e1 58 7f 00 00 >X.N.X...h.N.X...<
      7f58e14e8bc0 00 04 00 00 02 00 00 00 00 00 00 00 01 00 00 00 >................<
      …
      000000e3dcc0 80 8b 4e e1 58 7f 00 00 01 00 00 00 00 00 00 00 >..N.X...........<
      Members, in order:
      • ControlLock (8 Bytes)
      • num_slots (8 Bytes)
      • page_buffer (8 Bytes)
      • page_status (8 Bytes)
      • page_dirty (8 Bytes)
      • page_number (8 Bytes)
      • page_lru_count (8 Bytes)
      • group_lsn (8 Bytes)
      • lsn_groups_per_page (4 Bytes)
      • cur_lru_count (4 Bytes)
      • latest_page_number (4 Bytes)
      • lwlock_tranche_id (4 Bytes)
  28. EXEM Research & Contents Team
      • NAVER: http://cafe.naver.com/playexem
      • ITPUB (Chinese): http://blog.itpub.net/31135309/
      • Wordpress: https://playexem.wordpress.com/
      • Slideshare: http://www.slideshare.net/playexem
      • Youtube: https://www.youtube.com/channel/UC5wKR_-A0eL_Pn_EMzoauJg
      • Tudou (Chinese): http://www.tudou.com/home/maxgauge/
      • Training inquiries: edu@ex-em.com
  29. Thank you
