Wait! What’s going on inside my database?
PostgreSQL and the Science of Database Performance
Jeremy Schneider, Amazon RDS
Updated: Sep 19, 2019
About PostgreSQL
1970: Mathematician Edgar F. Codd, working as researcher
for IBM, publishes “A Relational Model of Data for Large
Shared Data Banks”
1973: Michael Stonebraker and Eugene Wong at University
of California Berkeley seek funding and begin development
of a relational database called INGRES
1986: Michael Stonebraker and Lawrence A. Rowe at
University of California Berkeley publish “The Design of
POSTGRES” – a new database that is the successor to INGRES
1994: Andrew Yu and Jolly Chen at University of California
Berkeley add support for the SQL language
1996: Transition to non-university core team of volunteers,
official release under new name POSTGRESQL
About Database Performance
1990s Manager:
“Dear DBA: Expert consultants
have taught us that if the Buffer
Cache Hit Ratio (BCHR) is below
90% then the system
immediately needs an expensive
tuning engagement.
Please report any databases that
have BCHR < 90%.”
Delfador Chibi by Peileppe
CC0
Nørgaard, Mogens et al. Oracle Insights:
Tales of the Oak Table. Berkeley, CA:
Apress/OakTable Press, 2004. p76-77.
About Database Performance
R = S + W
R: response time (“How long the SQL takes to run”), S: service time, W: wait (queueing) time
Millsap, Cary V. Optimizing Oracle Performance. Sebastopol, CA: O’Reilly, 2003. p225, 240, 258-259
See also:
• Shallahamer, Craig. Forecasting Oracle Performance. Berkeley, CA: Apress, 2007.
About Database Performance
Active Session Sampling
(JB’s notebook, 2004)
Images & Quotes
Used With Permission
What about PostgreSQL?
© 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Mariinsky Theatre, St. Petersburg
by Sandra Cohen-Rose and Colin Rose (Montreal, Canada)
CC BY-SA
Wait Events
• 1990s: Database kernel instrumentation:
• Counters and tools to snapshot/compare them
• Events (log a message under certain circumstances)
• 1992: Unable to solve a performance problem, Oracle engineers, as a last
resort, added event code in Oracle version 7.0.12 capable of emitting
log messages when the database waited for something
• First exposed in V$SESSION_WAIT and later in V$SESSION
(Oracle's equivalent of pg_stat_activity)
• PostgreSQL built on concepts that had become standard across
the industry
“But why are these events called wait events?
…
In short, when a session is not using the CPU, it may be
waiting for a resource, an action to complete, or simply
more work. Hence, events that are associated with all
such waits are known as wait events.”
Shee, Richmond, Kirtikumar Deshpande, and K. Gopalakrishnan. Oracle Wait Interface: A Practical
Guide to Performance Diagnostics & Tuning. New York: McGraw-Hill/Osborne, 2004. p16
High-Level Idea:
The database is WAITING any time it is not running on the CPU
Caveats:
• OS scheduling/runqueue
• Measurement overhead, OS kernel CPU time (e.g. I/O)
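The idea above can be checked on a live system with a single snapshot of pg_stat_activity. The query below is a minimal sketch, assuming PostgreSQL 9.6 or later (where the wait_event_type and wait_event columns exist). Labeling a NULL wait event as "CPU" is this sketch's own convention, and it inherits the caveats listed above.

```sql
-- Snapshot of what active sessions are doing right now.
-- An active session with no wait event is labeled 'CPU' here; it may in
-- fact be sitting in the OS run queue or in kernel code.
SELECT coalesce(wait_event_type, 'CPU') AS wait_event_type,
       coalesce(wait_event, 'CPU')      AS wait_event,
       count(*)                         AS sessions
FROM pg_stat_activity
WHERE state = 'active'
  AND pid <> pg_backend_pid()   -- leave out this monitoring session
GROUP BY 1, 2
ORDER BY sessions DESC;
```

One snapshot is only a single sample; repeating it at a fixed interval and adding up the counts gives the time-based picture.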
psql> SELECT…
[Diagram labels: Idle, Waiting, CPU]
Significant Commits:
Version 9.6
• aa65de0 – 11 Sep 2015 – Autogenerate lwlocknames.[c|h]
• 53be0b1 – 10 Mar 2016 – Heavy/Lightweight Locks, Buffer Pins
Version 10
• 6f3bd98 – 4 Oct 2016 – Latches & Sockets, Clients, Main Loops
• 249cf07 – 18 Mar 2017 – I/O
• fc70a4b – 26 Mar 2017 – Background and Auxiliary Processes
Version 11
• 1804284 – 20 Dec 2017 – Parallel-Aware Hash Joins
src/include/pgstat.h
doxygen.postgresql.org
src/backend/postmaster/pgstat.c
Gaps after migrating to Open Source/Community PostgreSQL
• Wait Event Counters and Cumulative Times
• Wait Event Arguments (object, block, etc)
• Comprehensive tracking of CPU time (POSIX rusage)
• Ability to find previous SQL for COMMIT/ROLLBACK
• Needed to identify which transaction is committing. (Other databases do not
update SQL text for COMMIT statement)
• On-CPU State
• SQL Execution Stage (parse/plan/execute/fetch)
• SQL Execution Plan Identifier in pg_stat_statements
• Current plan node
• Progress on long operations (e.g. large seqscan)
• Better runtime visibility into PLs
I can haz Wait Events?
Solving Problems with Wait Events in PostgreSQL
By Antony Griffiths (Flickr), CC BY
Solving Problems With Wait Events
pid | state | wait_event_type | wait_event | xact_runtime | query_short
--------+-------------+-----------------+---------------------+-------------------------+----------------------------------------------------
8135 | active | | | -00:00:00.000941 | autovacuum: VACUUM pghist.pg_stat_statements_20190
8168 | active | | | 00:00:00 | SELECT col1, col2,
| | | | |
108975 | | Activity | WalWriterMain | |
108976 | | Activity | AutoVacuumMain | |
108973 | | Activity | CheckpointerMain | |
108974 | | Activity | BgWriterMain | |
108979 | | Activity | LogicalLauncherMain | |
8185 | active | | | 00:00:00.07941 | autovacuum: VACUUM pghist.pg_stat_sys_indexes_2019
8212 | active | | | 00:00:00.349238 | autovacuum: VACUUM pghist.pg_stat_statements_20190
115699 | active | Lock | relation | 00:30:01.170404 | SELECT proc('param1')
103268 | active | IO | DataFileRead | 00:46:46.277548 | select count(*) from some_ones_table a , (select c
95936 | active | LWLock | buffer_io | 00:56:57.327904 | SELECT col1 FROM some_ones_table a, (SELECT col1 a
95935 | active | IO | DataFileRead | 00:56:57.328169 | SELECT col1 FROM some_ones_table a, (SELECT col1 a
95921 | active | LWLock | buffer_io | 00:56:57.393765 | SELECT col1 FROM some_ones_table a, (SELECT col1 a
56628 | active | IO | DataFileRead | 01:47:55.333596 | select col1 from some_ones_table WHERE err_id in (
53981 | active | IO | BufFileRead | 01:51:40.986659 | SELECT col1 FROM some_ones_table a, (SELECT asin a
49386 | active | LWLock | buffer_io | 01:58:13.166389 | SELECT count(*) FROM some_ones_table a, (SELECT co
29172 | active | IO | BufFileRead | 02:04:09.108342 | SELECT count(*) FROM some_ones_table a, (SELECT co
43208 | active | LWLock | buffer_io | 02:06:39.296499 | SELECT count(*) FROM some_ones_table a, (SELECT co
43207 | active | IO | DataFileRead | 02:06:39.29666 | SELECT count(*) FROM some_ones_table a, (SELECT co
31401 | active | IPC | MessageQueueReceive | 02:06:39.370239 | SELECT count(*) FROM some_ones_table a, (SELECT co
12387 | active | IO | DataFileRead | 02:46:50.262871 | select count(*) from some_ones_table a , (select c
12386 | active | IO | DataFileRead | 02:46:50.263142 | select count(*) from some_ones_table a , (select c
12385 | active | IO | DataFileRead | 02:46:50.266696 | select count(*) from some_ones_table a , (select c
83681 | active | BufferPin | BufferPin | 15:24:45.260184 | autovacuum: VACUUM schema1.some_ones_table (to prev
23340 | active | LWLock | buffer_io | 1 day 16:39:18.732685 | select column_001,column2,column3,column000004,
24074 | active | LWLock | buffer_io | 1 day 16:41:55.91496 | WITH this_subquery_01 as (select column_001,PIPELI
8110 | active | LWLock | buffer_io | 1 day 17:03:52.767838 | WITH this_subquery_01 as (select column_001,PIPEL
51767 | active | LWLock | buffer_io | 1 day 19:03:47.006302 | WITH this_subquery_01 as (select column_001,PIPEL
9217 | active | LWLock | buffer_io | 1 day 20:01:58.572314 | WITH this_subquery_01 as (select column_001,PIPEL
6086 | active | IO | DataFileRead | 1 day 20:06:08.584313 | WITH this_subquery_01 as (select column_001,PIPEL
115385 | active | LWLock | buffer_io | 1 day 20:35:27.617606 | WITH this_subquery_01 as (select column_001,PIPEL
94256 | idle in trx | Client | ClientRead | 27 days 02:33:48.940102 | select subquery00_.column_001 as COLUMN01_2_0_, a
(33 rows)
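A listing shaped like the one above can be produced from pg_stat_activity. The query below is a sketch and not necessarily the one used for the slide. The column aliases xact_runtime and query_short are copied from the output header, and the 50-character truncation is an assumption.

```sql
-- One row per backend, longest-running transactions last.
-- now() is fixed at the start of the monitoring transaction, so a session
-- that began its transaction a moment later can show a slightly negative
-- runtime.
SELECT pid,
       state,
       wait_event_type,
       wait_event,
       now() - xact_start AS xact_runtime,
       left(query, 50)    AS query_short
FROM pg_stat_activity
WHERE pid <> pg_backend_pid()
ORDER BY xact_runtime NULLS FIRST;
```

Sorting by transaction runtime puts the oldest open transactions at the bottom of the output, where sessions such as a long "idle in transaction" connection stand out.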
Solving Problems With Wait Events
Active Session Sampling of Wait Events on PostgreSQL:
• Performance Insights on Amazon RDS
• RDS PostgreSQL 10+
• Aurora PostgreSQL 9.6+ (v10 Wait Events were backported)
• pg_wait_sampling and PoWA
• pgSentinel
• pgCenter
• (what am I missing?)
Solving Problems With Wait Events
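-- pg_stat_activity_samples is not a built-in view; it stands for any table
-- of periodic pg_stat_activity samples. problem_start, problem_end and
-- top_problem_sql_statement are placeholders to fill in.

-- Step 1: which statements were active most often during the problem window?
SELECT sql_statement, count(*)
FROM pg_stat_activity_samples
WHERE date BETWEEN problem_start AND problem_end
GROUP BY sql_statement
ORDER BY count(*) DESC;

-- Step 2: for the top statement from step 1, which wait events dominate?
SELECT wait_event, count(*)
FROM pg_stat_activity_samples
WHERE sql_statement=top_problem_sql_statement
AND date BETWEEN problem_start AND problem_end
GROUP BY wait_event
ORDER BY count(*) DESC;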
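The queries above assume that samples are already being collected somewhere. One way to collect them by hand is sketched below. The table name and the date, sql_statement and wait_event columns follow the queries above, while the remaining columns and the sampling interval are assumptions. The tools listed earlier do this work for you.

```sql
-- Hypothetical sample table; not part of PostgreSQL.
CREATE TABLE IF NOT EXISTS pg_stat_activity_samples (
    date            timestamptz NOT NULL,
    pid             integer,
    wait_event_type text,
    wait_event      text,
    sql_statement   text
);

-- Run on a fixed interval (for example once per second) from an external
-- scheduler; each run records one sample of the currently active sessions.
INSERT INTO pg_stat_activity_samples
SELECT clock_timestamp(), pid, wait_event_type, wait_event, query
FROM pg_stat_activity
WHERE state = 'active'
  AND pid <> pg_backend_pid();
```

In psql, following the INSERT with `\watch 1` re-runs it every second, which is enough for a short experiment. With one-second samples, each counted row corresponds to roughly one second of session time.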
Solving Problems With Wait Events
Active Session Summary (Performance Insights, etc)
Top SQL & Top Wait Events
EXPLAIN ANALYZE with Buffers, IO timing, etc
Investigate STEP & WAIT Taking The Most Time
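The EXPLAIN ANALYZE step of this workflow might look like the sketch below. The table and predicate are placeholders borrowed from the sample output earlier in the deck; substitute the statement identified as top SQL.

```sql
-- track_io_timing adds I/O read and write times to the plan output.
-- Changing it may require elevated privileges and adds some timing overhead.
SET track_io_timing = on;

-- ANALYZE really executes the statement, so wrap data-modifying statements
-- in a transaction that is rolled back.
EXPLAIN (ANALYZE, BUFFERS)
SELECT col1
FROM some_ones_table        -- placeholder table
WHERE err_id IN (1, 2, 3);  -- placeholder predicate
```

The plan node with the largest actual time and the most buffers read is the step to compare against the dominant wait event.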
Thank you!
aws.amazon.com/rds/postgresql
Stuffed Elephant Store
A unique service that produces on-demand stuffed elephants
Multiple sizes with long or short fur
Any color the customer wants, as long as it’s blue
Breibeest (Flickr), CC BY
Performance Insights
See the query text and the wait events by query
Look back 7 days or as much as 2 years to find activity
explain.depesz.com
Execution Plans
Explains how Postgres plans to execute a query
Shows the type of operation, the estimated cost, and the estimated number of rows
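A plain EXPLAIN shows these estimates without running the statement. The sketch below uses a system catalog so that it works on any database; the query itself is only an illustration.

```sql
-- Each line of the output is a plan node: the operation type, then
-- cost=startup..total, the estimated row count and the average row width.
EXPLAIN
SELECT relname
FROM pg_class
WHERE relkind = 'r'
ORDER BY relname;
```

Because nothing is executed, every number in the output is a planner estimate. Output in this form can be pasted into explain.depesz.com for easier reading.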
System Catalogs
Contain the structure of all objects in the database
Statistics views show usage of the objects
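Both halves of this slide can be tried directly. The sketch below reads structure from the pg_attribute catalog and usage from the pg_stat_user_tables statistics view; the choice of pg_stat_activity as the object to describe is only an example.

```sql
-- Structure: the columns of pg_stat_activity, read from the system catalogs.
SELECT a.attname                            AS column_name,
       format_type(a.atttypid, a.atttypmod) AS data_type
FROM pg_attribute a
WHERE a.attrelid = 'pg_stat_activity'::regclass
  AND a.attnum > 0
  AND NOT a.attisdropped
ORDER BY a.attnum;

-- Usage: sequential versus index scans per user table.
SELECT relname, seq_scan, idx_scan, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
ORDER BY seq_scan DESC
LIMIT 10;
```

A large table with many sequential scans and few index scans is a common starting point when DataFileRead dominates the wait events.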
Indexes
PostgreSQL has a rich set of index types
Base functionality can be enhanced by specialized extensions
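A few of these index types are sketched below against a hypothetical elephants table; the table, its columns and the index names are invented for illustration. pg_trgm is a contrib extension, so it must be available on the server, and creating it may need elevated privileges.

```sql
-- Hypothetical table for the stuffed elephant store.
CREATE TABLE IF NOT EXISTS elephants (
    id         bigint PRIMARY KEY,
    name       text,
    color      text,
    created_at timestamptz DEFAULT now()
);

-- B-tree (the default): equality and range predicates.
CREATE INDEX elephants_color_idx ON elephants (color);

-- BRIN: a compact index for large tables whose values correlate with
-- physical row order, such as an append-only timestamp column.
CREATE INDEX elephants_created_brin ON elephants USING brin (created_at);

-- Extension example: pg_trgm adds trigram operator classes, so a GIN index
-- can serve LIKE '%...%' searches.
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX elephants_name_trgm ON elephants USING gin (name gin_trgm_ops);
```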
Performance Insights
Drilling down into time periods shows finer-grained detail
The 3-minute view shows 1-second granularity
Locking
Locks are held for the duration of the transaction
Locks can be held on a table, row or other object such as transaction IDs
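The lifetime of a lock is easy to observe from inside a transaction. The sketch below uses a hypothetical elephants table (name and columns invented for illustration) and inspects pg_locks for the current backend.

```sql
-- Hypothetical table with one row to update.
CREATE TABLE IF NOT EXISTS elephants (id bigint PRIMARY KEY, color text);
INSERT INTO elephants (id, color) VALUES (1, 'grey')
ON CONFLICT (id) DO NOTHING;

BEGIN;

-- Takes a ROW EXCLUSIVE lock on the table and assigns a transaction ID.
-- The row-level lock is recorded on the tuple itself and normally does not
-- appear in pg_locks.
UPDATE elephants SET color = 'blue' WHERE id = 1;

-- Locks held by this session: relations, its virtual transaction ID and
-- its transaction ID.
SELECT locktype, relation::regclass AS relation, mode, granted
FROM pg_locks
WHERE pid = pg_backend_pid();

COMMIT;  -- the locks are released here, not when the UPDATE finished
```

Running the same SELECT again after COMMIT shows that the table and transaction ID locks are gone, which is why long-open transactions are a common source of Lock waits.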
It’s so simple!
Solving Problems With Wait Events
Aurora PostgreSQL:
• AWS Documentation covers Aurora-specific wait events
• Shares code with community PostgreSQL (and merges regularly)
Solving Problems With Wait Events
DataFileRead, buffer_io
• I/O Read Path: Check SQL execution plans, optimize for fewer block reads.
XactSync, WALWrite
• I/O Write Path: Check commit rate, volume of change.
transactionid, relation, etc.
• Application Design: check pg_locks during contention.
buffer_content
• Hot Block in Memory: check foreign keys, reduce contention (e.g. schema
redesign, fillfactor, etc).
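For the lock-related events (transactionid, relation, etc.), checking pg_locks during contention could look like the sketch below. It assumes PostgreSQL 9.6 or later, where pg_blocking_pids() is available.

```sql
-- Sessions currently waiting on a heavyweight lock, and who is blocking them.
SELECT a.pid,
       a.wait_event_type,
       a.wait_event,
       pg_blocking_pids(a.pid) AS blocked_by,
       left(a.query, 50)       AS query_short
FROM pg_stat_activity a
WHERE a.wait_event_type = 'Lock'
ORDER BY a.query_start;

-- The lock requests that have not been granted yet.
SELECT locktype, relation::regclass AS relation, transactionid, mode, pid
FROM pg_locks
WHERE NOT granted;
```

The pids in blocked_by can be looked up in pg_stat_activity to see what the blocking session is running, or whether it is idle in a transaction.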
Individual Named LWLocks
Tranches for SLRU
Tranches for Shared Buffers
Individual Named Tranches
Solving Problems With Wait Events
Uppercase LWLocks: see lwlocknames.txt, search code directly
Lowercase LWLocks: tranches (arrays of locks for groups of objects)
• SLRUs – see SimpleLruInit() callers on doxygen
• Shared Buffers (buffer_content, buffer_io)
• Other Tranches – see RegisterLWLockTranches() in lwlock.c
(Heavyweight/SQL/Transaction) Locks: LockTagType enum in lock.h
• Strings come from matching structure LockTagTypeNames in lockfuncs.c
BufferPin: Vacuuming - see PG_WAIT_BUFFER_PIN refs on doxygen
Extension: FDWs, BG worker startup, etc - see PG_WAIT_EXTENSION refs on doxygen
Activity, Client, IPC, Timeout and IO: enums, see pgstat.h
Multi-AZ [multiple physical locations]
Physical Backups
• Max allowed retention (35 days in RDS)
• Regular restore testing
Logical Backups
• Scheduled Exports/Dumps and Application Re-Drive
• Logical Replication
Huge Pages
Autovacuum Logging (RDS: need “force” setting)
• Logging Level = INFO
• Minimum duration = 10 seconds
PostgreSQL quarterly updates
• Stable minor releases for security and bug fixes (RDS)
• Some Aurora minors have new development work (Aurora)
• Remember to upgrade extensions; it’s not automatic
Connection Pooling
• Centralized and decentralized (app-tier) architectures exist
• Recycle server connections (e.g. server_lifetime)
Performance Insights [monitor active session waits]
• Keep the history
Enhanced Monitoring [OS monitoring]
• 10 second (or lower) granularity
Preload pg_stat_statements
Limit on temp usage by default (esp. Aurora)
• Log temp usage when close to the limit
Alarms
• Maximum used transaction IDs
• DBLoad [Average Active Sessions]
• Free disk space (RDS) / Free local storage (Aurora)
• Memory / swap
• Replica Lag (RDS)
PostgreSQL Happiness Hints version:
jer_s/2019-09-19
tinyurl.com/waitevents
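Several of these hints can be checked from any SQL session. The sketch below reads only standard catalogs and settings. On managed services such as RDS, the values are changed through parameter groups, not with SQL.

```sql
-- Transaction ID age per database, the basis for a "maximum used
-- transaction IDs" alarm. The hard limit is roughly two billion.
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;

-- Current values of settings named in the hints above.
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('shared_preload_libraries',
               'log_autovacuum_min_duration',
               'temp_file_limit',
               'log_temp_files',
               'huge_pages');
```

If pg_stat_statements is preloaded, it appears in the shared_preload_libraries value.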
Solving Problems With Wait Events
244 GB DDR4 Memory
16 Physical Intel Xeon E5-2686 v4 (Broadwell) processors
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 

Wait! What’s going on inside my database?

  • 1. Jeremy Schneider, Amazon RDS Wait! What’s going on inside my database? PostgreSQL and the Science of Database Performance Updated: Sep 19, 2019
  • 2. About PostgreSQL
      1970: Mathematician Edgar F. Codd, working as a researcher for IBM, publishes “A Relational Model of Data for Large Shared Data Banks”
      1973: Michael Stonebraker and Eugene Wong at University of California Berkeley seek funding and begin development of a relational database called INGRES
      1986: Michael Stonebraker and Lawrence A. Rowe at University of California Berkeley publish “The Design of POSTGRES” – a new database that is the successor to INGRES
      1994: Andrew Yu and Jolly Chen at University of California Berkeley add support for the SQL language
      1996: Transition to non-university core team of volunteers, official release under new name POSTGRESQL
      1985
  • 6. About Database Performance 1990’s Manager: “Dear DBA: Expert consultants have taught us that if the Buffer Cache Hit Ratio (BCHR) is below 90% then the system immediately needs an expensive tuning engagement. Please report any databases that have BCHR < 90%.” Delfador Chibi by Peileppe CC0
  • 8. Nørgaard, Mogens et al. Oracle Insights: Tales of the Oak Table. Berkeley, CA: Apress/OakTable Press, 2004. p76-77. About Database Performance
  • 9. About Database Performance R = S + W (response time = service time + wait time) “How long the SQL takes to run” Source: Millsap, Cary V. Optimizing Oracle Performance. Sebastopol, CA: O'Reilly, 2003. p225, 240, 258-259. See also: • Shallahamer, Craig. Forecasting Oracle Performance. Berkeley, CA: Apress, 2007.
  • 10. About Database Performance Active Session Sampling (JB’s notebook, 2004) Images & Quotes Used With Permission
  • 12. © 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Mariinsky Theatre, St. Petersburg by Sandra Cohen-Rose and Colin Rose (Montreal, Canada) CC BY-SA
  • 13. Wait Events
  • 14. Wait Events
      • 1990s: Database kernel instrumentation:
          • Counters and tools to snapshot/compare them
          • Events (log a message under certain circumstances)
      • 1992: Unable to solve a performance problem any other way, engineers added event code in Oracle version 7.0.12, as a last resort, capable of emitting log messages when the database waited for something
      • First exposed in V$SESSION_WAIT and later in V$SESSION (equivalent of pg_stat_activity)
      • PostgreSQL built on concepts that had become standard across the industry
  • 15. Wait Events R = S + W (response time = service time + wait time) “How long the SQL takes to run” Source: Millsap, Cary V. Optimizing Oracle Performance. Sebastopol, CA: O'Reilly, 2003. p225, 240, 258-259. See also: • Shallahamer, Craig. Forecasting Oracle Performance. Berkeley, CA: Apress, 2007.
  • 16. Wait Events Active Session Sampling (JB’s notebook, 2004) Images & Quotes Used With Permission
  • 17. Wait Events “But why are these events called wait events? … In short, when a session is not using the CPU, it may be waiting for a resource, an action to complete, or simply more work. Hence, events that are associated with all such waits are known as wait events.” Shee, Richmond, Kirtikumar Deshpande, and K. Gopalakrishnan. Oracle Wait Interface: A Practical Guide to Performance Diagnostics & Tuning. New York: McGraw-Hill/Osborne, 2004. p16
  • 18. Wait Events High-Level Idea: The database is WAITING any time when it’s not running on the CPU. Caveats: • OS scheduling/runqueue • Measurement overhead, OS kernel CPU time (e.g. I/O)
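The high-level idea on slide 18 is the R = S + W relation from the earlier slides read from the wait side: whatever part of the response time was not spent on the CPU is treated as wait. A minimal sketch of that arithmetic follows; the `Timing` class and its field names are hypothetical, and the caveats on the slide (run-queue time, kernel CPU time) mean a real measurement is only an approximation.

```python
from dataclasses import dataclass


@dataclass
class Timing:
    """Hypothetical container for one statement's timing figures."""
    response_s: float  # R: wall-clock time seen by the client
    cpu_s: float       # S: service time, approximated here by on-CPU time

    @property
    def wait_s(self) -> float:
        # W = R - S. Clamp at zero because measured CPU time can slightly
        # exceed elapsed time when the two clocks disagree.
        return max(self.response_s - self.cpu_s, 0.0)

    @property
    def wait_fraction(self) -> float:
        # Share of the response time that was spent waiting.
        return self.wait_s / self.response_s if self.response_s else 0.0


t = Timing(response_s=2.0, cpu_s=0.5)
print(t.wait_s, t.wait_fraction)
```

A statement that takes 2.0 s with 0.5 s on the CPU spent three quarters of its time waiting, which is why the wait events are the first place to look.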
  • 19. Wait Events psql> SELECT… Idle, Waiting, CPU
  • 20. Wait Events
  • 21. Wait Events
  • 22. Wait Events
  • 24. Wait Events Significant Commits:
      Version 9.6
        • aa65de0 – 11 Sep 2015 – Autogenerate lwlocknames.[c|h]
        • 53be0b1 – 10 Mar 2016 – Heavy/Lightweight Locks, Buffer Pins
      Version 10
        • 6f3bd98 – 4 Oct 2016 – Latches & Sockets, Clients, Main Loops
        • 249cf07 – 18 Mar 2017 – I/O
        • fc70a4b – 26 Mar 2017 – Background and Auxiliary Processes
      Version 11
        • 1804284 – 20 Dec 2017 – Parallel-Aware Hash Joins
  • 25. Wait Events src/include/pgstat.h
  • 26. Wait Events doxygen.postgresql.org
  • 27. Wait Events src/backend/postmaster/pgstat.c
  • 28. Wait Events src/backend/postmaster/pgstat.c
  • 29. Wait Events Gaps after migrating to Open Source/Community PostgreSQL
      • Wait Event Counters and Cumulative Times
      • Wait Event Arguments (object, block, etc)
      • Comprehensive tracking of CPU time (POSIX rusage)
      • Ability to find previous SQL for COMMIT/ROLLBACK
          • Needed to identify which transaction is committing. (Other databases do not update SQL text for COMMIT statement)
      • On-CPU State
      • SQL Execution Stage (parse/plan/execute/fetch)
      • SQL Execution Plan Identifier in pg_stat_statements
      • Current plan node
      • Progress on long operations (e.g. large seqscan)
      • Better runtime visibility into PLs
  • 30. I can haz Wait Events? Solving Problems with Wait Events in PostgreSQL By Antony Griffiths (Flickr), CC BY
  • 31. Solving Problems With Wait Events
          pid | state       | wait_event_type | wait_event          | xact_runtime            | query_short
      --------+-------------+-----------------+---------------------+-------------------------+----------------------------------------------------
         8135 | active      |                 |                     | -00:00:00.000941        | autovacuum: VACUUM pghist.pg_stat_statements_20190
         8168 | active      |                 |                     | 00:00:00                | SELECT col1, col2,
              |             |                 |                     |                         |
       108975 |             | Activity        | WalWriterMain       |                         |
       108976 |             | Activity        | AutoVacuumMain      |                         |
       108973 |             | Activity        | CheckpointerMain    |                         |
       108974 |             | Activity        | BgWriterMain        |                         |
       108979 |             | Activity        | LogicalLauncherMain |                         |
         8185 | active      |                 |                     | 00:00:00.07941          | autovacuum: VACUUM pghist.pg_stat_sys_indexes_2019
         8212 | active      |                 |                     | 00:00:00.349238         | autovacuum: VACUUM pghist.pg_stat_statements_20190
       115699 | active      | Lock            | relation            | 00:30:01.170404         | SELECT proc('param1')
       103268 | active      | IO              | DataFileRead        | 00:46:46.277548         | select count(*) from some_ones_table a , (select c
        95936 | active      | LWLock          | buffer_io           | 00:56:57.327904         | SELECT col1 FROM some_ones_table a, (SELECT col1 a
        95935 | active      | IO              | DataFileRead        | 00:56:57.328169         | SELECT col1 FROM some_ones_table a, (SELECT col1 a
        95921 | active      | LWLock          | buffer_io           | 00:56:57.393765         | SELECT col1 FROM some_ones_table a, (SELECT col1 a
        56628 | active      | IO              | DataFileRead        | 01:47:55.333596         | select col1 from some_ones_table WHERE err_id in (
        53981 | active      | IO              | BufFileRead         | 01:51:40.986659         | SELECT col1 FROM some_ones_table a, (SELECT asin a
        49386 | active      | LWLock          | buffer_io           | 01:58:13.166389         | SELECT count(*) FROM some_ones_table a, (SELECT co
        29172 | active      | IO              | BufFileRead         | 02:04:09.108342         | SELECT count(*) FROM some_ones_table a, (SELECT co
        43208 | active      | LWLock          | buffer_io           | 02:06:39.296499         | SELECT count(*) FROM some_ones_table a, (SELECT co
        43207 | active      | IO              | DataFileRead        | 02:06:39.29666          | SELECT count(*) FROM some_ones_table a, (SELECT co
        31401 | active      | IPC             | MessageQueueReceive | 02:06:39.370239         | SELECT count(*) FROM some_ones_table a, (SELECT co
        12387 | active      | IO              | DataFileRead        | 02:46:50.262871         | select count(*) from some_ones_table a , (select c
        12386 | active      | IO              | DataFileRead        | 02:46:50.263142         | select count(*) from some_ones_table a , (select c
        12385 | active      | IO              | DataFileRead        | 02:46:50.266696         | select count(*) from some_ones_table a , (select c
        83681 | active      | BufferPin       | BufferPin           | 15:24:45.260184         | autovacuum: VACUUM schema1.some_ones_table (to prev
        23340 | active      | LWLock          | buffer_io           | 1 day 16:39:18.732685   | select column_001,column2,column3,column000004,
        24074 | active      | LWLock          | buffer_io           | 1 day 16:41:55.91496    | WITH this_subquery_01 as (select column_001,PIPELI
         8110 | active      | LWLock          | buffer_io           | 1 day 17:03:52.767838   | WITH this_subquery_01 as (select column_001,PIPEL
        51767 | active      | LWLock          | buffer_io           | 1 day 19:03:47.006302   | WITH this_subquery_01 as (select column_001,PIPEL
         9217 | active      | LWLock          | buffer_io           | 1 day 20:01:58.572314   | WITH this_subquery_01 as (select column_001,PIPEL
         6086 | active      | IO              | DataFileRead        | 1 day 20:06:08.584313   | WITH this_subquery_01 as (select column_001,PIPEL
       115385 | active      | LWLock          | buffer_io           | 1 day 20:35:27.617606   | WITH this_subquery_01 as (select column_001,PIPEL
        94256 | idle in trx | Client          | ClientRead          | 27 days 02:33:48.940102 | select subquery00_.column_001 as COLUMN01_2_0_, a
      (33 rows)
  • 32. Solving Problems With Wait Events
  • 33. Solving Problems With Wait Events Active Session Sampling of Wait Events on PostgreSQL:
      • Performance Insights on Amazon RDS
          • RDS PostgreSQL 10+
          • Aurora PostgreSQL 9.6+ (v10 Wait Events were backported)
      • pg_wait_sampling and PoWA
      • pgSentinel
      • pgCenter
      • (what am I missing?)
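The tools on slide 33 all rest on the same mechanism: poll the list of active sessions at a fixed interval and keep every row. A minimal sketch of such a sampler is below. It is not how any of the listed tools is implemented; `collect_samples`, the `fetch` callable and the query text are assumptions made for illustration, and only the column names (`pid`, `state`, `wait_event_type`, `wait_event`, `query`) come from `pg_stat_activity` itself.

```python
import time
from typing import Callable, Iterable, List, Tuple

# Illustrative query; the columns are real pg_stat_activity columns
# (PostgreSQL 9.6+), but the slides do not show the query they used.
SAMPLE_SQL = """
SELECT now() AS sample_time, pid, state, wait_event_type, wait_event, query
FROM pg_stat_activity
WHERE state = 'active' AND pid <> pg_backend_pid()
"""

Row = Tuple  # one sampled session


def collect_samples(fetch: Callable[[str], Iterable[Row]],
                    n_samples: int,
                    interval_s: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> List[Row]:
    """Run SAMPLE_SQL n_samples times and return all rows seen.

    `fetch` is a hypothetical hook that executes a SQL string and returns
    rows, so the loop can be exercised without a database.
    """
    samples: List[Row] = []
    for i in range(n_samples):
        samples.extend(fetch(SAMPLE_SQL))
        if i < n_samples - 1:
            sleep(interval_s)  # wait between samples, not after the last one
    return samples
```

With a DB-API driver, `fetch` could wrap a cursor's `execute` and `fetchall`; the one-second default mirrors the finest granularity mentioned for Performance Insights later in the deck.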
  • 34. Solving Problems With Wait Events
      SELECT sql_statement, count(*)
      FROM pg_stat_activity_samples
      WHERE date BETWEEN problem_start AND problem_end
      GROUP BY sql_statement
      ORDER BY count(*) DESC;

      SELECT wait_event, count(*)
      FROM pg_stat_activity_samples
      WHERE sql_statement=top_problem_sql_statement
        AND date BETWEEN problem_start AND problem_end
      GROUP BY wait_event
      ORDER BY count(*) DESC;
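The two queries on slide 34 are pseudocode against a conceptual `pg_stat_activity_samples` table. The sketch below runs the same two aggregations on a few made-up samples, using Python's built-in `sqlite3` purely as a stand-in for PostgreSQL. The sample rows, the `sample_time` column (in place of the slide's `date`) and the choice to label a NULL wait event as `CPU` are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pg_stat_activity_samples "
    "(sample_time INTEGER, sql_statement TEXT, wait_event TEXT)"
)
# Made-up samples: one row per active session per sampling instant.
conn.executemany(
    "INSERT INTO pg_stat_activity_samples VALUES (?, ?, ?)",
    [
        (1, "SELECT a", "DataFileRead"),
        (1, "SELECT b", None),          # active and not waiting
        (2, "SELECT a", "DataFileRead"),
        (2, "SELECT a", "buffer_io"),
        (3, "SELECT a", "DataFileRead"),
        (3, "SELECT b", "WALWrite"),
    ],
)


def top_sql(start, end):
    """First query: which statements appear in the most samples?"""
    return conn.execute(
        "SELECT sql_statement, count(*) FROM pg_stat_activity_samples "
        "WHERE sample_time BETWEEN ? AND ? "
        "GROUP BY sql_statement ORDER BY count(*) DESC",
        (start, end),
    ).fetchall()


def top_waits(stmt, start, end):
    """Second query: what was the top statement waiting on?"""
    return conn.execute(
        "SELECT coalesce(wait_event, 'CPU'), count(*) "
        "FROM pg_stat_activity_samples "
        "WHERE sql_statement = ? AND sample_time BETWEEN ? AND ? "
        "GROUP BY wait_event ORDER BY count(*) DESC",
        (stmt, start, end),
    ).fetchall()


worst = top_sql(1, 3)[0][0]
print(worst, top_waits(worst, 1, 3))
```

Because each row is one session observed at one instant, the counts are proportional to time spent, so the ordering points at where the database time went during the problem window.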
  • 35. Solving Problems With Wait Events
      1. Active Session Summary (Performance Insights, etc): Top SQL & Top Wait Events
      2. EXPLAIN ANALYZE with Buffers, IO timing, etc
      3. Investigate STEP & WAIT Taking The Most Time
  • 38. Stuffed Elephant Store A unique service that produces on-demand stuffed elephants. Multiple sizes with long or short fur. Any color the customer wants, as long as it’s blue. Breibeest (Flickr), CC BY
  • 46. Performance Insights: See the query text and the wait events by query. Look back 7 days or as much as 2 years to find activity.
  • 49. explain.depesz.com
  • 52. Execution Plans: Explains how Postgres plans to execute a query. Shows the type of operation, the estimated cost, and the estimated number of rows.
  • 54. System Catalogs: Contain the structure of all objects in the database. Statistics views show usage of the objects.
  • 60. Indexes: PostgreSQL has a rich set of index types. Base functionality can be enhanced by specialized extensions.
  • 71. Performance Insights: Drilling down into time periods shows finer-grained detail. The 3 minute view shows 1 second granularity.
  • 82. Locking: Locks are held for the duration of the transaction. Locks can be held on a table, row or other object such as transaction IDs.
  • 83. It’s so simple! Solving Problems With Wait Events
  • 84. Solving Problems With Wait Events Aurora PostgreSQL: • AWS Documentation covers Aurora-Specific Wait Events • Shares Code With Community PostgreSQL (and merges regularly)
  • 85. Solving Problems With Wait Events
      DataFileRead, buffer_io
        • I/O Read Path: Check SQL execution plans, optimize for fewer block reads.
      XactSync, WALWrite
        • I/O Write Path: Check commit rate, volume of change.
      transactionid, relation, etc.
        • Application Design: check pg_locks during contention.
      buffer_content
        • Hot Block in Memory: check foreign keys, optimize contention (e.g. schema redesign, fillfactor, etc).
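Slide 85 is effectively a lookup table from wait event to a first line of investigation. A minimal sketch of it as a Python dictionary follows; the event names and advice are taken from the slide, while `HINTS`, `hint_for` and the fallback text are hypothetical names added for illustration.

```python
# Wait event -> (problem area, first thing to check), as listed on slide 85.
READ_PATH = ("I/O read path",
             "Check SQL execution plans; optimize for fewer block reads.")
WRITE_PATH = ("I/O write path",
              "Check commit rate and volume of change.")
APP_DESIGN = ("Application design",
              "Check pg_locks during contention.")
HOT_BLOCK = ("Hot block in memory",
             "Check foreign keys; address contention "
             "(e.g. schema redesign, fillfactor).")

HINTS = {
    "DataFileRead": READ_PATH,
    "buffer_io": READ_PATH,
    "XactSync": WRITE_PATH,
    "WALWrite": WRITE_PATH,
    "transactionid": APP_DESIGN,
    "relation": APP_DESIGN,
    "buffer_content": HOT_BLOCK,
}


def hint_for(wait_event: str):
    """Return (area, advice) for a wait event, or a neutral fallback."""
    return HINTS.get(wait_event, ("unclassified", "Not covered by the slide."))


for event in ("buffer_io", "WALWrite", "ClientRead"):
    area, advice = hint_for(event)
    print(f"{event}: {area} - {advice}")
```

The slide says "etc." for the lock events, so the table is a starting point and not an exhaustive mapping.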
  • 86. Individual Named LWLocks; Tranches for SLRU; Tranches for Shared Buffers; Individual Named Tranches
  • 87. Solving Problems With Wait Events
      • Uppercase LWLocks: see lwlocknames.txt, search code directly
      • Lowercase LWLocks: tranches (arrays of locks for groups of objects)
          • SLRUs – see SimpleLruInit() callers on doxygen
          • Shared Buffers (buffer_content, buffer_io)
          • Other Tranches – see RegisterLWLockTranches() in lwlock.c
      • (Heavyweight/SQL/Transaction) Locks: LockTagType enum in lock.h
          • Strings come from matching structure LockTagTypeNames in lockfuncs.c
      • BufferPin: Vacuuming - see PG_WAIT_BUFFER_PIN refs on doxygen
      • Extension: FDWs, BG worker startup, etc - see PG_WAIT_EXTENSION refs on doxygen
      • Activity, Client, IPC, Timeout and IO: enums, see pgstat.h
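The rules on slide 87 can be read as a small decision procedure: given a `wait_event_type` and `wait_event` pair from `pg_stat_activity`, decide where in the PostgreSQL source to look. The sketch below encodes those rules as stated; `where_to_look` is a hypothetical helper, and the uppercase/lowercase naming convention it relies on is the one described in the slides, which other PostgreSQL releases may not follow.

```python
def where_to_look(wait_event_type: str, wait_event: str) -> str:
    """Map a wait event to the source location suggested on slide 87."""
    if wait_event_type == "LWLock":
        if wait_event[:1].isupper():
            # Individually named LWLock, e.g. WALWriteLock.
            return "lwlocknames.txt (individually named LWLock)"
        if wait_event in ("buffer_content", "buffer_io"):
            return "shared buffers tranche"
        # Remaining lowercase names are SLRU or other tranches.
        return ("tranche: SimpleLruInit() callers or "
                "RegisterLWLockTranches() in lwlock.c")
    if wait_event_type == "Lock":
        return "LockTagType enum in lock.h, LockTagTypeNames in lockfuncs.c"
    if wait_event_type == "BufferPin":
        return "PG_WAIT_BUFFER_PIN references"
    if wait_event_type == "Extension":
        return "PG_WAIT_EXTENSION references"
    if wait_event_type in {"Activity", "Client", "IPC", "Timeout", "IO"}:
        return "enums in pgstat.h"
    return "unknown wait event type"


print(where_to_look("LWLock", "buffer_io"))
print(where_to_look("IO", "DataFileRead"))
```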
  • 88. PostgreSQL Happiness Hints (version: jer_s/2019-09-19, tinyurl.com/waitevents)
      • Multi-AZ [multiple physical locations]
      • Physical Backups
          • Max allowed retention (35 days in RDS)
          • Regular restore testing
      • Logical Backups
          • Scheduled Exports/Dumps and Application Re-Drive
          • Logical Replication
      • Huge Pages
      • Autovacuum Logging (RDS: need “force” setting)
          • Logging Level = INFO
          • Minimum duration = 10 seconds
      • PostgreSQL quarterly updates
          • Stable minor releases for security and bug fixes (RDS)
          • Some Aurora minors have new development work (Aurora)
          • Remember to upgrade extensions; it’s not automatic
      • Connection Pooling
          • Centralized and decentralized (app-tier) architectures exist
          • Recycle server connections (e.g. server_lifetime)
      • Performance Insights [monitor active session waits]
          • Keep the history
      • Enhanced Monitoring [OS monitoring]
          • 10 second (or lower) granularity
      • Preload pg_stat_statements
      • Limit on temp usage by default (esp. Aurora)
          • Log temp usage when close to the limit
      • Alarms
          • Maximum used transaction IDs
          • DBLoad [Average Active Sessions]
          • Free disk space (RDS) / Free local storage (Aurora)
          • Memory / swap
          • Replica Lag (RDS)
  • 90. Solving Problems With Wait Events 244 GB DDR4 Memory; 16 Physical Intel Xeon E5-2686 v4 (Broadwell) processors
  • 92. Solving Problems With Wait Events
  • 95. Solving Problems With Wait Events
  • 96. Solving Problems With Wait Events
  • 97. Solving Problems With Wait Events
  • 98. Solving Problems With Wait Events
  • 99. Solving Problems With Wait Events
  • 100. Solving Problems With Wait Events
  • 101. Solving Problems With Wait Events
  • 102. Solving Problems With Wait Events