pg_statsinfo and pg_stats_reporter are open source PostgreSQL monitoring and reporting tools developed by NTT. pg_statsinfo collects database statistics and activity from PostgreSQL servers and operating system resources and stores them in a repository database. pg_stats_reporter visualizes the collected statistics in reports. The tools provide in-depth monitoring of database performance and help identify optimization opportunities.
Introduction of pg_statsinfo and pg_stats_reporter ~Statistics Reporting Tool for DBA~
1. Introduction of pg_statsinfo and pg_stats_reporter ~ Statistics Reporting Tool for DBA ~
NTT Open Source Software Center
Mitsumasa KONDO
Copyright(c)2013 NTT Corp. All Rights Reserved.
2. About Me
• Official company name
  • Nippon Telegraph and Telephone Corporation
• My affiliation
  • Service Innovation Laboratory, Software Innovation Center, Researcher
• My work
  • Middleware development for PostgreSQL
    • pg_statsinfo, pg_stats_reporter
    • High-availability PostgreSQL cluster using replication with Pacemaker
  • PostgreSQL community development
    • Improvement of disk I/O bottlenecks
• Past work
  • Data mining, natural language processing, machine learning, recommendation, information retrieval
  • I am still better at these than at databases :)
• Hobby
  • Photography
  • Pure audio
3. Software Introduced Today
• pg_statsinfo
  • Monitors and collects PostgreSQL statistics and activity
• pg_stats_reporter
  • Visualizes the PostgreSQL statistics and activity collected by pg_statsinfo
[Figure: pg_statsinfo on DB servers A, B, and C sends database statistics and activity to the repository database, which stores them; pg_stats_reporter creates reports from the stored statistics. A sample report created by pg_stats_reporter is shown.]
4. Contents
• pg_statsinfo ~ Monitor and Collect DB Statistics and Activities ~
  • What is pg_statsinfo?
  • Feature introduction
  • Demo
• pg_stats_reporter ~ Visualize DB Statistics and Activities ~
  • What is pg_stats_reporter?
  • Feature introduction
  • Demo
• Visualizing the DBT-2 benchmark using pg_statsinfo and pg_stats_reporter
  • Introduction of DBT-2
  • DBT-2 visualized by pg_stats_reporter
  • For more performance
6. What is pg_statsinfo?
• Monitors and collects PostgreSQL statistics and activity
  • Collects statistics and activity from:
    • All tables in the pg_catalog schema
    • pg_log information
    • OS resources
• Other features
  • Report creation from the command line
  • Alert and monitoring function
  • Log management function
  • Automatic repository database management
• Other related information
  • BSD License
  • Latest version is 2.5.0
    • http://pgfoundry.org/frs/?group_id=1000422
  • Works on PostgreSQL 9.3!
  • The online manual is here:
    • http://pgstatsinfo.projects.pgfoundry.org/pg_statsinfo-ja.html
[Figure: collected database statistics]
7. Architecture of pg_statsinfo
• Programming language
  • C
• Startup and configuration
  • pg_statsinfo is started via shared_preload_libraries
  • Add the pg_statsinfo settings to postgresql.conf, and it starts together with PostgreSQL as usual
• System configuration
  • Install pg_statsinfo on the monitored instance
  • It does not need to be installed on the repository database instance
  • The monitored instance and the repository database can be the same instance
[Figure: on the monitored instance, pg_statsinfod collects database statistics from pg_catalog, pg_log, and OS resources and sends them as a snapshot to the repository database.]
8. Features of pg_statsinfo 1/5
• Collects statistics and activity in PostgreSQL
  • All information gathered by PostgreSQL's statistics collector (e.g. pg_catalog)
    • For details of the statistics collector, see the PostgreSQL documentation :)
    • http://www.postgresql.jp/document/9.2/html/monitoring-stats.html
  • Takes statistics as snapshots at regular intervals
    • Default: every 10 minutes
  • Analyzes pg_log and extracts activity from the logs
    • Captures activity that is only written to pg_log
      • Checkpoint activity
      • VACUUM activity
  • Reads OS resource information from /proc
    • Samples every 5 seconds; when a snapshot is taken, the averages of the samples are inserted
    • CPU usage (idle, iowait, system, user, load average)
    • Memory usage (memfree, buffers, cached, swap, dirty)
    • Disk usage (I/O size, I/O time, disk space used)
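The sample-then-average behavior described for OS resources can be sketched as follows. This is an illustrative Python sketch, not pg_statsinfo's actual C implementation: the function names and the hard-coded sample lines are assumptions made for the example, and only the field order of the aggregate `cpu` line in Linux `/proc/stat` is relied on. For simplicity it considers only the user, nice, system, idle, and iowait counters.

```python
from statistics import mean

def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into named jiffy counters."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Field order on Linux: user nice system idle iowait ...
    user, nice, system, idle, iowait = (int(v) for v in fields[1:6])
    return {"user": user + nice, "system": system, "idle": idle, "iowait": iowait}

def usage_percent(prev, curr):
    """Convert two counter readings into per-state percentages for the interval."""
    deltas = {k: curr[k] - prev[k] for k in curr}
    total = sum(deltas.values())
    if total == 0:
        return {k: 0.0 for k in deltas}
    return {k: 100.0 * d / total for k, d in deltas.items()}

def snapshot_average(samples):
    """Average the per-interval samples into the single row stored with a snapshot."""
    return {k: mean(s[k] for s in samples) for k in samples[0]}

# Three readings taken 5 seconds apart (made-up numbers for illustration).
readings = [
    parse_cpu_line("cpu 100 0 50 800 50 0 0 0 0 0"),
    parse_cpu_line("cpu 110 0 60 850 80 0 0 0 0 0"),
    parse_cpu_line("cpu 130 0 70 880 120 0 0 0 0 0"),
]
samples = [usage_percent(a, b) for a, b in zip(readings, readings[1:])]
average = snapshot_average(samples)
print(average)
```

With a 10-minute snapshot interval and 5-second sampling, each stored row would summarize roughly 120 such samples.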
9. Features of pg_statsinfo 2/5
• Create reports on the command line
  • Outputs a text-format report on the command line
    • Example use: a database administrator or SQL engineer who wants to check database statistics
  • Covers almost all report items produced by pg_stats_reporter
Command example: create a report for all monitored instances from 2013-10-01 to now
$ pg_statsinfo -U postgres -B 2013-10-01 -r ALL | less
11. Features of pg_statsinfo 3/5
• Automatic repository database maintenance
  • Automatically deletes old statistics stored in the repository database
    • pg_statsinfo stores its data in tables partitioned by day
    • So old data can be deleted with TRUNCATE
    • Deletion is therefore fast and cheap
  • Note
    • When several monitored instances share a repository, the shortest configured retention period takes priority
[Figure: DB server A (retention period configured as 1 week) and DB server B (retention period configured as 2 weeks) each run pg_statsinfo, which collects database statistics and sends them to the repository database, where they are stored. The default retention period is 1 week.]
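The retention rule and the per-day partitioning can be illustrated with a small Python sketch. It is not pg_statsinfo code: the function names, the instance labels, and the exact boundary (whether a partition exactly as old as the retention period is kept) are assumptions made for the example. It only shows why "shortest period wins" plus daily partitions makes cleanup a matter of picking whole days to TRUNCATE.

```python
from datetime import date, timedelta

def effective_retention_days(configured_days, default_days=7):
    """Return the retention applied to a shared repository: the shortest configured period."""
    return min(configured_days.values()) if configured_days else default_days

def partitions_to_truncate(partition_days, today, retention_days):
    """List the daily partitions whose date is older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    return sorted(d for d in partition_days if d < cutoff)

# Two monitored instances with different settings, as in the figure.
configured = {"DB server A": 7, "DB server B": 14}
retention = effective_retention_days(configured)

today = date(2013, 10, 15)
existing = [today - timedelta(days=n) for n in range(0, 10)]
expired = partitions_to_truncate(existing, today, retention)
print(retention, [d.isoformat() for d in expired])
```

In this example DB server B asks for two weeks, but because it shares the repository with DB server A, data older than one week is still selected for deletion.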
12. Features of pg_statsinfo 4/5
• Log management
  • Makes PostgreSQL's logs easy to manage
  • Log filtering
    • pg_statsinfo has its own log-level setting, so two log levels can be in effect at once
    • Example: set PostgreSQL's log level low to keep detailed information, and set pg_statsinfo's log level higher so the log is easy to read day to day
    • The log file name can be fixed (e.g. postgresql.log), which suits log-monitoring software
  • Multiple log outputs
    • Can write to both syslog and pg_log
  • Log level rewriting
    • The log level of specific messages can be changed
    • Example: change specific messages from INFO to LOG
  • Log compression and management
    • Compresses and manages old logs automatically
[Figure: flow of extracting statistics from pg_log. pg_statsinfod reads pg_log (CSV format), formats the log, and writes the pg_statsinfo log (postgresql.log).]
13. Features of pg_statsinfo 5/5
• Alert and monitoring function (trigger function)
  • Writes an alert log message when a database metric exceeds its alert threshold
    • Usage: watch the alert log with monitoring software
  • The alert function runs at every snapshot
  • The default settings are listed below; set values appropriate for your server
    • Settings are changed with an UPDATE statement on the statsrepo.alert table
Alert configuration table
column name            default  explanation
instid                 -        Target instance ID
rollback_tps           100      Number of rollbacks per second
commit_tps             1000     Number of commits per second
garbage_size           20000    Size of garbage records in the table
garbage_percent        30       Percentage of garbage records in the database (%)
garbage_percent_table  30       Percentage of garbage records in the table (%)
response_avg           10       Average query response time (sec)
response_worst         60       Worst query response time (sec)
enable_alert           true     Enable the alert function
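The per-snapshot threshold check can be pictured with the following Python sketch. It is a simplified illustration, not the actual pg_statsinfo alert function: the threshold names and default values are taken from the table above, while the function name, the snapshot dictionary, and the message format are hypothetical.

```python
# Defaults copied from the alert configuration table on this slide.
DEFAULT_THRESHOLDS = {
    "rollback_tps": 100,
    "commit_tps": 1000,
    "garbage_size": 20000,
    "garbage_percent": 30,
    "garbage_percent_table": 30,
    "response_avg": 10,
    "response_worst": 60,
}

def check_alerts(snapshot, thresholds=DEFAULT_THRESHOLDS, enable_alert=True):
    """Return one message per metric in the snapshot that exceeds its threshold."""
    if not enable_alert:
        return []
    alerts = []
    for name, limit in thresholds.items():
        value = snapshot.get(name)
        if value is not None and value > limit:
            # Hypothetical message format, for illustration only.
            alerts.append(f"ALERT: {name} = {value} exceeds threshold {limit}")
    return alerts

# Metrics measured for one snapshot (made-up values).
snapshot = {"rollback_tps": 12, "commit_tps": 1450, "response_worst": 75}
for message in check_alerts(snapshot):
    print(message)
```

Because the check runs once per snapshot, a threshold breach is reported at most once per snapshot interval, which is what a log-watching monitoring tool would then pick up.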
14. How to install pg_statsinfo?
1. Install the RPM file
$ su
# rpm -ivh pg_statsinfo-2.50-1.pg93.rhel6.x86_64.rpm
2. Add the configuration to postgresql.conf
# minimum configuration
shared_preload_libraries = 'pg_statsinfo' # preload library setting
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name setting (required)
3. Start PostgreSQL as usual
$ pg_ctl -D data start
4. If the following log messages appear, the installation succeeded!
server starting
LOG: loaded library "pg_statsinfo"
LOG: pg_statsinfo launcher started
LOG: start
LOG: installing schema: statsinfo
LOG: installing schema: statsrepo_partition
The installation procedure is also described in the web manual :)
http://pgstatsinfo.projects.pgfoundry.org/pg_statsinfo-ja.html#install
15. Demo of pg_statsinfo
1. Install
2. Confirm the installation
3. Collect database statistics and activity (snapshot)
4. Create a report
16. Tips for pg_statsinfo
• One snapshot is 300 kB to 800 kB
  • Be careful not to fill the disk with snapshots!
• The overhead of installing the software is almost nothing
  • But there is a little: in the DBT-2 benchmark we measured a 2% degradation
• To put the repository on a separate server, set "pg_statsinfo.repository_server" in postgresql.conf
  • The default setting is 'host=localhost port=5432'
• If the repository database requires a password, set it in /var/lib/pgsql/.pgpass
  • pg_statsinfo runs as the postgres user
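To get a feel for the disk-space warning, the snapshot size can be combined with the default 10-minute snapshot interval and the default one-week retention mentioned earlier in the deck. The Python sketch below is only back-of-the-envelope arithmetic under those assumptions; the function names are made up for the example, and real usage depends on the number of tables, indexes, and queries in the monitored database.

```python
def snapshots_retained(interval_minutes=10, retention_days=7):
    """Number of snapshots kept per monitored instance over the retention period."""
    return (24 * 60 // interval_minutes) * retention_days

def repository_size_mb(kb_per_snapshot, instances=1, interval_minutes=10, retention_days=7):
    """Approximate repository size in MB (1 MB = 1000 kB) for the given snapshot size."""
    count = snapshots_retained(interval_minutes, retention_days) * instances
    return count * kb_per_snapshot / 1000

low = repository_size_mb(300)
high = repository_size_mb(800)
print(f"{snapshots_retained()} snapshots, roughly {low:.0f} MB to {high:.0f} MB per instance")
```

Shortening the snapshot interval or lengthening the retention period scales this estimate linearly, as does adding monitored instances to the same repository.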
18. What is pg_stats_reporter?
• Visualizes the PostgreSQL statistics and activity collected by pg_statsinfo
  • Report items
    • Transaction status
    • Database size
    • OS resources
    • Amount of WAL output
    • Replication state
    • Deadlock information
  • Successor to pg_reporter
• Additional information
  • BSD License
  • Latest version is 2.0.0
    • http://pgfoundry.org/frs/?group_id=1000422
  • The detailed online manual is here:
    • http://pgstatsinfo.projects.pgfoundry.org/pg_stats_reporter-ja.html
[Figure: a report from pg_stats_reporter]
19. Architecture of pg_stats_reporter
• Software
  • Apache + PHP + PostgreSQL
    • PHP + PostgreSQL alone is also sufficient
    • Requires PostgreSQL 8.3 or later
• Programming language
  • PHP + JavaScript + SQL
• Libraries used
  • PHP framework
    • Smarty
  • User interface
    • jQuery, jQuery UI, tablesorter, Superfish
  • Graphs
    • dygraphs, jqPlot
20. How to Create a Report? 1/2
• From a web browser
  • Only a few clicks are needed to create a report
    ① Select the database instance to report on
    ② Click the "create new report" button
    ③ Set the report period and time
21. How to Create a Report? 2/2
• From the command line
  • Runs in PHP's standalone mode
  • Use cases
    • Creating reports on the command line
    • Creating reports at regular intervals with crond
    • Command-line-only use does not require Apache
      • For example, when a security policy does not allow Apache to be installed
    • Keeping reports for the long term
      • The repository database keeps data only for a limited period
      • Created reports are not erased
Command usage: create a report for 10/1 to 10/8 in report_dir
$ pg_stats_reporter -B 2013-10-01 -E 2013-10-08 -O report_dir
[LOG] Report file created: sample_localhost_5432_1_20131008-1419_20131008-1945.html
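For the crond use case, a scheduled script needs to compute the report period itself. The Python sketch below builds the same command line as the example above for a sliding one-week window. It uses only the `-B`, `-E`, and `-O` options shown on this slide; the function name, the seven-day window, and the idea of wrapping the command in Python are assumptions for illustration.

```python
import shlex
from datetime import date, timedelta

def weekly_report_command(end_day, output_dir, days=7):
    """Build the pg_stats_reporter argument list for the `days` ending at end_day."""
    begin_day = end_day - timedelta(days=days)
    return [
        "pg_stats_reporter",
        "-B", begin_day.isoformat(),
        "-E", end_day.isoformat(),
        "-O", output_dir,
    ]

command = weekly_report_command(date(2013, 10, 8), "report_dir")
print(shlex.join(command))
# A scheduled script could pass `command` to subprocess.run(command, check=True).
```

Since created reports are never erased, running such a job more often than the repository retention period keeps a continuous archive of HTML reports even after the underlying snapshots are deleted.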
22. How to Create a Report? 2/2
• Report index feature
  • Creates the report and an index of reports in the report directory
  • Makes reports easy to browse and organize
[Figure: the report directory contains the pg_stats_reporter library, index.html (the index of reports), and previously created reports (Report HTML 1, Report HTML 2).]
23. How to install pg_stats_reporter?
1. Install the pg_stats_reporter RPM and its dependency RPMs
$ su
# rpm -ivh httpd-2.2.15-15.el6_2.1.x86_64.rpm \
    php-5.3.3-3.el6_2.8.x86_64.rpm \
    php-common-5.3.3-3.el6_2.8.x86_64.rpm \
    php-pgsql-5.3.3-3.el6_2.8.x86_64.rpm \
    php-intl-5.3.3-3.el6_2.8.x86_64.rpm \
    pg_stats_reporter-1.0.0-1.el6.noarch.rpm
2. Edit pg_stats_reporter.ini (the configuration file); the defaults are as follows
# vim /etc/pg_stats_reporter.ini
----- configuration of repository database -----
host = localhost
port = 5432
dbname = postgres
username = postgres
password =
3. Start the Apache HTTP server
# service httpd start
4. Open the following URL
http://localhost/pg_stats_reporter/pg_stats_reporter.php
Note: SELinux must be disabled!
The installation procedure is also described in the web manual :)
http://pgstatsinfo.projects.pgfoundry.org/pg_stats_reporter-ja.html#install
25. Tips for pg_stats_reporter
• Works on Android and iPad
• It is based on the jQuery UI library, so the interface design (mostly colors) is easy to change
  • The logo image can also be changed by replacing the file
• Report items can be selected
  • To do so, list the report items you need in /etc/pg_stats_reporter.ini
• For security
  • .htaccess can be used
  • Apache's usual security mechanisms apply as well
27. What is DBT-2?
• TPC-C benchmark software developed by Open Source Development Labs (OSDL)
  • Simulates order processing at a parts wholesaler
  • http://www.tpc.org/tpcc/
  • The benchmark score counts only transactions answered within a fixed time limit
    • Response time is very important!
  • I/O-bound benchmark
• Main benchmark parameters
  • warehouse
    • Database size parameter
    • Each additional warehouse adds about one hundred thousand records
    • Mainly used to adjust the database size
  • TPW
    • Terminals per warehouse
    • Number of clients prepared per warehouse; default 10
    • With a lower TPW, the benchmark becomes CPU-bound
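The two parameters translate into scale figures by simple multiplication. The Python sketch below applies the rules of thumb from this slide to the configuration used later in the deck (320 warehouses, TPW = 10, database of about 40 GB). The function and key names are made up for the example, and the per-warehouse size is just the quoted database size divided by the warehouse count, not a DBT-2 constant.

```python
RECORDS_PER_WAREHOUSE = 100_000   # "one hundred thousand records" per warehouse, from the slide
DEFAULT_TPW = 10                  # default TPW, from the slide

def dbt2_scale(warehouses, tpw=DEFAULT_TPW, db_size_gb=None):
    """Derive rough scale figures for a DBT-2 run from its two main parameters."""
    scale = {
        "records": warehouses * RECORDS_PER_WAREHOUSE,
        "clients": warehouses * tpw,
    }
    if db_size_gb is not None:
        # Average on-disk size per warehouse, using 1 GB = 1000 MB.
        scale["mb_per_warehouse"] = db_size_gb * 1000 / warehouses
    return scale

scale = dbt2_scale(320, db_size_gb=40)
print(scale)
```

For the test setup this gives about 32 million records, 3200 simulated clients, and roughly 125 MB per warehouse.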
28. Transaction Tendency in DBT-2
• Main bottleneck
  • Random read/write
    • Almost all SQL plans are index scans
    • Random read/write performance and cache or buffer replacement performance are important
  • Parallel execution performance is also important
    • PostgreSQL is better at this than other RDBMSs :)
• Other characteristics
  • The SQL plans are very simple
    • Most SQL statements are plain index scan accesses
  • An ideal benchmark score exists
    • If the DB answers all transactions within the time limit, the score is ideal
  • The performance ceiling is reached when memory is 2x the database size
  • WAL output is smaller than with pgbench, so WAL is not the bottleneck
29. Test Server and postgresql.conf Settings
Server     HP DL360 G7
CPU        Xeon E5640 2.66GHz (1P/4C)
Memory     DDR3-10600R-9 18GB
RAID card  P410i / 256MB cache
Disk       4 x 146GB (15krpm), RAID 1+0
postgresql.conf (main changed parameters)
max_connections = 300
shared_buffers = 2458MB
work_mem = 1MB
maintenance_work_mem = 64MB
fsync = on
wal_sync_method = fdatasync
full_page_writes = on
wal_buffers = -1
archive_mode = on
checkpoint_segments = 300
checkpoint_timeout = 15min
checkpoint_completion_target = 0.7
random_page_cost = 2.0
effective_cache_size = 9GB
default_statistics_target = 10
log_destination = 'syslog'
autovacuum = on
Warehouse size = 320 (database size is about 40GB) and TPW = 10
30. Visualizing DBT-2 by pg_stats_reporter 1/5
• Transaction status
  • Transaction throughput fluctuates, owing to the benchmark specification and to PostgreSQL implementation details
  • Performance drops while a CHECKPOINT is running
    • CHECKPOINTs were mainly triggered by checkpoint_timeout
    • postgresql.conf sets checkpoint_timeout = 15min and checkpoint_segments = 300
31. Visualizing DBT-2 by pg_stats_reporter 2/5
• Amount of WAL output
  • 4.6GB of WAL was written from the data load to the end of the benchmark
  • During the data load, the maximum WAL rate was 54MB/sec
  • During the benchmark run, the maximum WAL rate was 12MB/sec
  • The WAL rate rises when a CHECKPOINT starts, because of full-page writes
32. Visualizing DBT-2 by pg_stats_reporter 3/5
• CPU usage
  • iowait is the largest share, followed by idle (this indicates an I/O bottleneck)
  • The final part of a CHECKPOINT causes a high load average
    • This is because of a burst of consecutive fsync() calls
    • PostgreSQL's CHECKPOINT logic is not good :(
33. Visualizing DBT-2 by pg_stats_reporter 4/5
• Updated and heavily accessed tables
  • HOT (Heap-Only Tuples) is working well!
  • The order_line and stock tables are accessed heavily
  • Each table's cache hit rate is very high, but... (is it really? :( )
34. Visualizing DBT-2 by pg_stats_reporter 5/5
• Query execution status
  • Queries with complicated filter conditions are slow
  • Unexpectedly, COMMIT takes a long time!
    • This is because the COMMIT of a long transaction has to write a lot of WAL (WAL buffer write)
  • The fsync() phase of the final CHECKPOINT makes queries slower
35. For More Performance
• Use direct_cp as the archive copy command
  • In archive mode, the cp command fills the file cache with data that is never reused, which lowers performance
  • BSD-licensed software
  • http://directcp.projects.pgfoundry.org/index.html
• Use SSDs
  • In general, the database bottleneck is random access; an SSD offers about 10 times faster random access than a magnetic disk
  • If you need a lot of capacity or have a limited budget, put only the hot tables in a tablespace on the SSD; this is very efficient
• Use a RAID card with a large cache
  • PostgreSQL's CHECKPOINT does not schedule fsync() calls at all, which causes very heavy disk writes and can even trigger failover :(
  • A RAID card with a large cache may mitigate this a little
36. Summary
• pg_statsinfo
  • Monitors and collects PostgreSQL statistics and activity as time series
  • BSD License
  • http://pgstatsinfo.projects.pgfoundry.org/pg_statsinfo-ja.html
  • Collects all the statistics and activity a DB administrator needs
  • If you want another kind of report, write reporting SQL against the collected information
• pg_stats_reporter
  • Visualizes the PostgreSQL statistics and activity collected by pg_statsinfo
  • BSD License
  • http://pgstatsinfo.projects.pgfoundry.org/pg_stats_reporter-ja.html
  • Useful jQuery-based interface
  • The report index feature is also useful
  • The software is easy to improve because it is written in PHP + JavaScript
    • It is also easy to submit patches :)