SlideShare una empresa de Scribd logo
1 de 47
Descargar para leer sin conexión
Let’s scale-out PostgreSQL
using Citus
Noriyoshi Shinoda
November 22, 2018
PostgreSQL Conference Japan 2018
Speaker
Noriyoshi Shinoda
Affiliated company
• Hewlett-Packard Enterprise Japan
Current work
• System design, tuning, consulting on PostgreSQL, Oracle Database, Microsoft SQL
Server, Vertica, Sybase ASE, etc. related to RDBMS
• Oracle ACE
• Written 15 books related to Oracle Database
• Investigation and verification on open source products
URL
• Published documents
http://slideshare.net/noriyoshishinoda/
• Oracle ACE
https://apex.oracle.com/pls/apex/f?p=19297:4:::NO:4:P4_ID:2780
2
Agenda
What’s Citus?
Let’s try!
Architecture
Restriction
Behavior when trouble occurs
This slides is based on Citus Community Edition 8.0-8
3
What’s Citus?
4
What’s Citus?
What’s Citus?
5
 Achieves scale-out environment for PostgreSQL
• Parallel query and partitioning feature across multiple nodes
Implemented as PostgreSQL Extension
Developed by Citusdata
• Community Edition is open source
• To use features such as online rebalancing, Enterprise Edition is required
It does not include the following features
• Automatic failover
• Automatic data rebalance
• Operational features such as backup
What’s Citus?
Instance configuration
6
Coordinator Node
• PostgreSQL instance that accepts connections from client
• Manage meta-data
Worker Node
• PostgreSQL instance that manages the actual data
• Do not communicate between Worker Nodes
Install citus EXTENSION on all nodes
Client
Data
Data
Data
Meta data
Worker#1
Worker#2
Worker#3
citus
citus
citus
Coordinator
citus
What’s Citus?
citus EXTENSION
Installation
• Install on Coordinator Node and Worker Node
• Execute the CREATE EXTENSION statement on all databases for creating tables
• Installation binaries are the same on all nodes
• Install extensions required for the application (such as pgcrypto) also to all nodes
7
What’s Citus?
Table configuration
8
Distributed Table
• Table that stores the data distributedly
• Suitable for fact table
• Specify a column as distribution key (determined distribution destination table by the
scope of hash values)
• Specify the number of partitions
Parameter citus.shard_count (default 32)
• Can create replicas on difference Worker Nodes
Parameter citus.shard_replication_factor (default 1 = No replica provided)
"DEF"
"GHI"
Worker#1
Worker#2
Worker#3
citus
citus
citus
Coordinator
citus
"ABC"
"DEF"
"GHI""ABC"
What’s Citus?
Table configuration
9
Reference Table
• Table that stores the same data in all nodes
• Suitable for dimension table
"ABC"
"ABC"
"ABC"
Worker#1
Worker#2
Worker#3
citus
citus
citus
Coordinator
citus
Let’s try!
10
Let’s try!
Install
11
$ ./configure
$ make
# make install
Build from source code
postgres=# SELECT * FROM pg_available_extensions WHERE
name='citus’;
-[ RECORD 1 ]-----+---------------------------
name | citus
default_version | 8.0-8
Installed_version |
comment | Citus distributed database
Confirmation of EXTENSION
Let’s try!
Operation on all instances
12
postgres=# SHOW shared_preload_libraries ;
shared_preload_libraries
--------------------------
citus
(1 row)
Specify 'citus' for parameter shared_preload_libraries
postgres=# CREATE EXTENSION citus ;
CREATE EXTENSION
Loading EXTENSION
Let’s try!
Authentication settings
13
host all all coordhost1/32 trust
Libpq connection from Coordinator Node to Worker Node
There is no password authentication mechanism for communication between
Coordinator Node and Worker Node
Password less connection setup required
Configuration example of pg_hba.conf file (Worker Node)
Also possible to use .pgpass file (Coordinator Node)
Authentication information can be specified in the parameter
citus.node_conninfo (SSL setting etc)
Let’s try!
Operation on Coordinator Node
14
postgres=# SELECT master_add_node('wrkhost1', 5432) ;
master_add_node
------------------------------------------------
(1,1,wrkhost1,5432,default,f,t,primary,default)
(1 row)
Register Worker Node to Coordinator Node (for each database)
Specify host name and port number in master_add_node function
Let’s try!
Operation on Coordinator Node
15
postgres=> SELECT nodeid, nodename, isactive FROM pg_dist_node ;
nodeid | nodename | isactive
--------+-----------+----------
1 | wrkhost1 | t
2 | wrkhost2 | t
3 | wrkhost3 | t
(3 rows)
Confirmation
Let’s try!
Create Distributed Table
16
postgres=> CREATE TABLE dist1(key1 NUMERIC, val1 VARCHAR(10)) ;
CREATE TABLE
postgres=> SELECT create_distributed_table('dist1', 'key1') ;
create_distributed_table
--------------------------
(1 row)
postgres=> SET citus.shard_count = 6 ;
SET
postgres=> SET citus.shard_replication_factor = 2 ;
SET
Specifying the number of distributed tables and the number of replicas
Example for table creation
Let’s try!
Create Distributed Table
17
Same configuration of the table is automatically created in the Worker Node
• The table name is "{Origin table name}_{ShardID}"
• TABLESPACE clause does not propagate
Number of tables created on Worker Node
• In the example of the previous slide, the distributed table# 6 x replica# 2 / Worker
Node# 3 = 4 table is created
Let’s try!
Create Distributed Table
18
Due to replica settings, tables of the same name are created in different Worker
Nodes.
• The same data is stored in the same name table
Coordinator Worker#1 Worker#2 Worker#3
dist1
dist1_102046 dist1_102046
dist1_102047 dist1_102047
dist1_102048 dist1_102048
dist1_102049 dist1_102049
dist1_102050 dist1_102050
dist1_102051 dist1_102051
Let’s try!
Create Reference Table
19
postgres=> CREATE TABLE ref1(key1 NUMERIC, val1 VARCHAR(10)) ;
CREATE TABLE
postgres=> SELECT create_reference_table('ref1') ;
create_reference_table
------------------------
(1 row)
Table creation
postgres=> d
List of relations
Schema | Name | Type | Owner
--------+--------------+-------+-------
public | ref1_102084 | table | demo
(1 row)
Table confirmation on Worker Node
Architecture
20
Architecture
Processes
21
bgworker: task tracker
• Running on Coordinator Node and Worker Node
bgworker: Citus Maintenance Daemon
• Running on Coordinator Node and Worker Nod
• Send usage data to https://reports.citusdata.com every 24 hours
Backend processes
• Start up on Worker Node by connecting from Coordinator Node
• Execute SQL statement to Distributed Table or Reference Table
• Execute dump_local_wait_edges function every 2 seconds
• Search pg_prepared_xacts view every minute
Architecture
Session management
22
Connection between Coordinator Node and Worker Node
• Establishing session when executing the first SQL statement in the transaction
• Disconnect when transaction is completed
• Connection pool is not used (Available on Enterprise Edition)
Client Worker#2
citus
Coordinator
citus
③ CONNECT
④ BEGIN
⑤ UPDATE
⑦ COMMIT
⑧ DISCONNECT
① BEGIN
② UPDATE
⑥ COMMIT
Architecture
Executed SQL
23
Process to be executed on Coordinator Node
• Final sort (ORDER BY)
• Operation of SEQUENCE (Include SERIAL column and GENERATED AS IDENTITY
column)
• Processing objects other than tables and indexes
Process to propagate to Worker Node
• VACUUM statement
• ANALYZE statement
• CREATE INDEX statement
• ALTER TABLE statement (some restrictions)
Other DML
• Select the table on the Worker Node that issues the SQL statement by specifying the
distributed key column
Architecture
Executed SQL
24
PostgreSQL API to submit SQL to Worker Node
• PQsendQuery
• PQsendQueryParams
• Use asynchronous API
Architecture
Executed SQL
25
When the distributed key string can be specified
SQL executed by the application ⇒ SQL executed on Worker Node
SELECT * FROM dist1 WHERE key1 = 20
SELECT key1, val1 FROM public.dist1_102131 dist1 WHERE
(key1 OPERATOR(pg_catalog.=) 20)
UPDATE dist1 SET val1 = 'update' WHERE key1 = 20
UPDATE public.dist1_102131 dist1 SET val1 = 'update'::character
varying WHERE (key1 OPERATOR(pg_catalog.=) 20)
Architecture
Executed SQL
26
Access to specific Worker Node only when Distributed Key can be specified
postgres=> EXPLAIN SELECT * FROM dist1 WHERE key1 = 1000 ;
QUERY PLAN
-------------------------------------------------------------------
Custom Scan (Citus Router) (cost=0.00..0.00 rows=0 width=0)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=wrkhost1 port=5432 dbname=postgres
-> Seq Scan on dist1_102078 dist1 (cost=0.00..2973.04
rows=1 width=12)
Filter: (key1 = '1000'::numeric)
(7 rows)
Architecture
Executed SQL
27
When the distributed key can not be specified
SQL executed by the application ⇒ SQL executed on Worker Node
• Changing the table name and putting it to all Worker Nodes
SELECT * FROM dist1 WHERE key1 != 20
COPY (SELECT key1, val1 FROM dist1_102146 dist1 WHERE
(key1 OPERATOR(pg_catalog.<>) 20)) TO STDOUT
COPY (SELECT key1, val1 FROM dist1_102147 dist1 WHERE
(key1 OPERATOR(pg_catalog.<>) 20)) TO STDOUT
…
Architecture
Executed SQL
28
Execution plan when the distributed key can not be specified
postgres=> SET citus.explain_all_tasks = on ;
SET
postgres=> EXPLAIN SELECT * FROM dist1 ;
QUERY PLAN
-------------------------------------------------------------------
Aggregate (cost=0.00..0.00 rows=0 width=0)
-> Custom Scan (Citus Real-Time) (cost=0.00..0.00 rows=0
width=0)
Task Count: 6
Tasks Shown: All
-> Task
Node: host=wrkhost1 port=5001 dbname=demodb
-> Aggregate (cost=2973.04..2973.05 rows=1 width=8)
…
Architecture
Executed SQL
29
Joining Distributed Table and Reference Table
• Joining in Worker Node
SQL executed by the application ⇒ SQL executed on Worker Node
SELECT * FROM dist1 d1 INNER JOIN ref1 r1 ON d1.key1=r1.key1 WHERE
d1.key1=2
SELECT d1.key1, d1.val1, r1.key1, r1.val1 FROM (public.dist1_102221
d1 JOIN public.ref1_102084 r1 ON ((d1.key1 OPERATOR(pg_catalog.=)
r1.key1))) WHERE (d1.key1 OPERATOR(pg_catalog.=) (2)::numeric)
Architecture
Executed SQL
30
Joining Distributed Tables
• Error may occur by default
• Executable by specifying the parameter citus.enable_repartition_joins to 'on’
Recommends placing the table on the same node with the same column value
CREATE TABLE event(tenant_id INT, event_id BIGINT, … ) ;
SELECT create_distributed_table('event', 'tenant_id') ;
CREATE TABLE page(tenant_id INT, page_id INT, … ) ;
SELECT create_distributed_table('page', 'tenant_id',
colocate_with => 'event') ;
Architecture
Parameter settings
31
Major parameters (38 in total)
• citus.node_connection_timeout
• citus.partition_buffer_size
• citus.recover_2pc_interval
• citus.remote_task_check_interval
• citus.shard_count
• citus.shard_max_size
• citus.shard_placement_policy
• citus.shard_replication_factor
• citus.subquery_pushdown
• citus.task_assignment_policy
• citus.task_executor_type
• citus.task_tracker_delay
• citus.use_secondary_nodes
• …
Architecture
Catalogs
32
Major catalogs (11 in total)
Catalog name Description
pg_dist_authinfo Store connection information (Enterprise Edition)
pg_dist_colocation Co-location groups is stored
pg_dist_node Information about the worker nodes
pg_dist_node_metadata Information about ServerID
pg_dist_partition Information about the distribution column
pg_dist_placement State of shard placements
pg_dist_poolinfo
Information about Connection pooling (Enerprise
Edition)
pg_dist_shard Information about distributed table
Restriction
33
Restriction
SQL can not be executed
34
The following syntax can not be executed for Distributed Table
• Updating distributed key columns (UPDATE / INSERT ON CONFLICT)
• SELECT FOR UPDATE statement (if creating a replica)
• TABLESAMPLE clause
• WITH RECURSIVE clause
• Generate_series function for INSERT VALUES statement
Described in the manual
postgres=> BEGIN ;
BEGIN
postgres=> SELECT * FROM dist1 WHERE key1=100 FOR UPDATE ;
ERROR: could not run distributed query with FOR UPDATE/SHARE
commands
HINT: Consider using an equality filter on the distributed table's
partition column.
Restriction
Restricted SQL
35
The following syntax is restricted for execution
• Correlated subquery
• GROUPING SETS clause
• PARTITION BY clause
• Joining Local Table and Distributed Table
• Trigger created on Coordinator Node
• INSERT SELECT ON CONFLICT statement
postgres=> SELECT COUNT(*) FROM dist1 d INNER JOIN local1 l ON
d.key1 = l.key1 ;
ERROR: relation local1 is not distributed
postgres=> SELECT COUNT(*) FROM dist1 d INNER JOIN
(SELECT * FROM local1) l ON d.key1 = l.key1 ;
Restriction
ISOLATION LEVEL
36
Connection from client to Coordinator Node
• No restriction
Connection from Coordinator Node to Worker Node
• READ COMMITED (hard-coded)
BEGIN
Worker#1
Worker#2
Worker#3
citus
citus
citus
BEGIN Coordinator
citus
SERIALIZABLE
REPEATABLE READ
READ COMMITED
READ UNCOMMITED READ COMMITED
Client
Restriction
ALTER TABLE statement
37
ALTER TABLE statement can only be executed as follows
• Add / Drop columns
• Setting restriction
• Partition administration
• Modify data type of columns
Control automatic propagation of DDL
• Parameter citus.enable_ddl_propagation (default 'on')
postgres=> ALTER TABLE dist1 SET UNLOGGED ;
ERROR: alter table command is currently unsupported
DETAIL: Only ADD|DROP COLUMN, SET|DROP NOT NULL, SET|DROP DEFAULT,
ADD|DROP CONSTRAINT, SET (), RESET (), ATTACH|DETACH PARTITION and
TYPE subcommands are supported.
Restriction
SQL that does not propagate
38
DATABASE and USER information should be identical in all nodes
• CREATE USER / CREATE DATABASE statement does not propagate
• Warning is output
• Function run_command_on_workers is provided to execute SQL statements on
Worker Node
postgres=# CREATE DATABASE demodb ;
NOTICE: Citus partially supports CREATE DATABASE for distributed
databases
DETAIL: Citus does not propagate CREATE DATABASE command to workers
HINT: You can manually create a database and its extensions on
workers.
CREATE DATABASE
Behavior when trouble occurs
39
Behavior when trouble occurs
Coordinator Node down
40
Client can not connect if Coordinator Node down
Automatic failover feature is not provided
Streaming Replication + Clusterware are required
Behavior when trouble occurs
Worker Node down
41
When the Worker Node is down, SQL that updates the entire data including
the stopped node can not be executed
postgres=> DELETE FROM dist1 ;
ERROR: connection error: wrkhost1:5432
DETAIL: could not connect to server: Connection refused
Is the server running on host "workhost1" (192.168.1.101)
and accepting
TCP/IP connections on port 5432?
SQL which operates other than the data of the stopped node is warned but can
be executed
postgres=> SELECT COUNT(*) FROM dist1 WHERE key1 = 100 ;
WARNING: connection error: wrkhost1:5432
DETAIL: could not send data to server: Connection refused
Could not send startup packet: Connection refused
…
Behavior when trouble occurs
Worker Node down
42
Warnings are output for records with replicas, but updatable
• Maintenance of replica of stopped Worker Node is not maintained
postgres=> DELETE FROM dist1 WHERE key1 = 1 ;
WARNING: connection error: wrkhost1:5432
DETAIL: could not send data to server: Connection refused
could not send SSL negotiation packet: Connection refused
DELETE 1
Behavior when trouble occurs
Combined with streaming replication
43
Streaming replication environment for load balancing is available
• Create streaming replication environment for all nodes
• Set citus.use_secondary_nodes = 'always' for Standby instance of Coordinator Node
• SELECT statement is executed on the standby instance of the Worker Node
• Standby instance is registered in master_add_secondary_node function
Worker#1
Worker#2Client Primary
Coordinator
Worker#1
Worker#2Standby
Worker#3
Client Standby
Coordinator
Primary
Worker#3
Streaming Replication
citus.use_secondary_nodes = 'always'
Summary
44
Summary
Some restriction exists, but easy to scale-out
45
It is relatively easy to build a scale-out environment
Possibility of performance improvement by parallel query + partitioning across
nodes
It is necessary to implement of fault tolerance and backup by yourself
Because restrictions of SQL statement exist, prior application verification is
recommended
Please note the difference from Enterprise Edition
Summary
Reference information URL
46
Product manuals
https://docs.citusdata.com/en/v8.0/
Performance comparison video
https://www.youtube.com/watch?v=g3H4nGsJsl0
Getting Started
https://docs.citusdata.com/en/v8.0/portals/getting_started.html
Use Cases
https://docs.citusdata.com/en/v8.0/portals/use_cases.html
API / Reference
https://docs.citusdata.com/en/v8.0/portals/reference.html
GitHub
https://github.com/citusdata/citus
Thank you
noriyoshi.shinoda@hpe.com
47

Más contenido relacionado

La actualidad más candente

patroni-based citrus high availability environment deployment
patroni-based citrus high availability environment deploymentpatroni-based citrus high availability environment deployment
patroni-based citrus high availability environment deployment
hyeongchae lee
 

La actualidad más candente (20)

Materialized Views and Secondary Indexes in Scylla: They Are finally here!
Materialized Views and Secondary Indexes in Scylla: They Are finally here!Materialized Views and Secondary Indexes in Scylla: They Are finally here!
Materialized Views and Secondary Indexes in Scylla: They Are finally here!
 
What is new in PostgreSQL 14?
What is new in PostgreSQL 14?What is new in PostgreSQL 14?
What is new in PostgreSQL 14?
 
[pgday.Seoul 2022] 서비스개편시 PostgreSQL 도입기 - 진소린 & 김태정
[pgday.Seoul 2022] 서비스개편시 PostgreSQL 도입기 - 진소린 & 김태정[pgday.Seoul 2022] 서비스개편시 PostgreSQL 도입기 - 진소린 & 김태정
[pgday.Seoul 2022] 서비스개편시 PostgreSQL 도입기 - 진소린 & 김태정
 
HOT Understanding this important update optimization
HOT Understanding this important update optimizationHOT Understanding this important update optimization
HOT Understanding this important update optimization
 
JSON improvements in MySQL 8.0
JSON improvements in MySQL 8.0JSON improvements in MySQL 8.0
JSON improvements in MySQL 8.0
 
Getting started with postgresql
Getting started with postgresqlGetting started with postgresql
Getting started with postgresql
 
High Availability and Disaster Recovery in PostgreSQL - EQUNIX
High Availability and Disaster Recovery in PostgreSQL - EQUNIXHigh Availability and Disaster Recovery in PostgreSQL - EQUNIX
High Availability and Disaster Recovery in PostgreSQL - EQUNIX
 
Optimizing queries MySQL
Optimizing queries MySQLOptimizing queries MySQL
Optimizing queries MySQL
 
PostgreSQL Internals (1) for PostgreSQL 9.6 (English)
PostgreSQL Internals (1) for PostgreSQL 9.6 (English)PostgreSQL Internals (1) for PostgreSQL 9.6 (English)
PostgreSQL Internals (1) for PostgreSQL 9.6 (English)
 
Beyond EXPLAIN: Query Optimization From Theory To Code
Beyond EXPLAIN: Query Optimization From Theory To CodeBeyond EXPLAIN: Query Optimization From Theory To Code
Beyond EXPLAIN: Query Optimization From Theory To Code
 
Postgre sql best_practices
Postgre sql best_practicesPostgre sql best_practices
Postgre sql best_practices
 
MySQL 8.0 EXPLAIN ANALYZE
MySQL 8.0 EXPLAIN ANALYZEMySQL 8.0 EXPLAIN ANALYZE
MySQL 8.0 EXPLAIN ANALYZE
 
PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).
PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).
PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).
 
Introduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparoundIntroduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparound
 
[Pgday.Seoul 2021] 2. Porting Oracle UDF and Optimization
[Pgday.Seoul 2021] 2. Porting Oracle UDF and Optimization[Pgday.Seoul 2021] 2. Porting Oracle UDF and Optimization
[Pgday.Seoul 2021] 2. Porting Oracle UDF and Optimization
 
Postgresql database administration volume 1
Postgresql database administration volume 1Postgresql database administration volume 1
Postgresql database administration volume 1
 
EDB Failover Manager for Seamless Failover & Switchover
EDB Failover Manager for Seamless Failover & SwitchoverEDB Failover Manager for Seamless Failover & Switchover
EDB Failover Manager for Seamless Failover & Switchover
 
[APJ] Common Table Expressions (CTEs) in SQL
[APJ] Common Table Expressions (CTEs) in SQL[APJ] Common Table Expressions (CTEs) in SQL
[APJ] Common Table Expressions (CTEs) in SQL
 
ClickHouse Unleashed 2020: Our Favorite New Features for Your Analytical Appl...
ClickHouse Unleashed 2020: Our Favorite New Features for Your Analytical Appl...ClickHouse Unleashed 2020: Our Favorite New Features for Your Analytical Appl...
ClickHouse Unleashed 2020: Our Favorite New Features for Your Analytical Appl...
 
patroni-based citrus high availability environment deployment
patroni-based citrus high availability environment deploymentpatroni-based citrus high availability environment deployment
patroni-based citrus high availability environment deployment
 

Similar a Let's scale-out PostgreSQL using Citus (English)

Discard inport exchange table & tablespace
Discard inport exchange table & tablespaceDiscard inport exchange table & tablespace
Discard inport exchange table & tablespace
Marco Tusa
 
John Melesky - Federating Queries Using Postgres FDW @ Postgres Open
John Melesky - Federating Queries Using Postgres FDW @ Postgres OpenJohn Melesky - Federating Queries Using Postgres FDW @ Postgres Open
John Melesky - Federating Queries Using Postgres FDW @ Postgres Open
PostgresOpen
 
Robert Pankowecki - Czy sprzedawcy SQLowych baz nas oszukali?
Robert Pankowecki - Czy sprzedawcy SQLowych baz nas oszukali?Robert Pankowecki - Czy sprzedawcy SQLowych baz nas oszukali?
Robert Pankowecki - Czy sprzedawcy SQLowych baz nas oszukali?
SegFaultConf
 

Similar a Let's scale-out PostgreSQL using Citus (English) (20)

Postgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, HerokuPostgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
 
Perl Programming - 04 Programming Database
Perl Programming - 04 Programming DatabasePerl Programming - 04 Programming Database
Perl Programming - 04 Programming Database
 
Discard inport exchange table & tablespace
Discard inport exchange table & tablespaceDiscard inport exchange table & tablespace
Discard inport exchange table & tablespace
 
PostgreSQL Extensions: A deeper look
PostgreSQL Extensions:  A deeper lookPostgreSQL Extensions:  A deeper look
PostgreSQL Extensions: A deeper look
 
MySQL Utilities -- PyTexas 2015
MySQL Utilities -- PyTexas 2015MySQL Utilities -- PyTexas 2015
MySQL Utilities -- PyTexas 2015
 
MySQL 5.7 innodb_enhance_partii_20160527
MySQL 5.7 innodb_enhance_partii_20160527MySQL 5.7 innodb_enhance_partii_20160527
MySQL 5.7 innodb_enhance_partii_20160527
 
MySQL 8 -- A new beginning : Sunshine PHP/PHP UK (updated)
MySQL 8 -- A new beginning : Sunshine PHP/PHP UK (updated)MySQL 8 -- A new beginning : Sunshine PHP/PHP UK (updated)
MySQL 8 -- A new beginning : Sunshine PHP/PHP UK (updated)
 
Ten Reasons Why You Should Prefer PostgreSQL to MySQL
Ten Reasons Why You Should Prefer PostgreSQL to MySQLTen Reasons Why You Should Prefer PostgreSQL to MySQL
Ten Reasons Why You Should Prefer PostgreSQL to MySQL
 
John Melesky - Federating Queries Using Postgres FDW @ Postgres Open
John Melesky - Federating Queries Using Postgres FDW @ Postgres OpenJohn Melesky - Federating Queries Using Postgres FDW @ Postgres Open
John Melesky - Federating Queries Using Postgres FDW @ Postgres Open
 
MySQL 5.7 in a Nutshell
MySQL 5.7 in a NutshellMySQL 5.7 in a Nutshell
MySQL 5.7 in a Nutshell
 
MySQL 5.7 Tutorial Dutch PHP Conference 2015
MySQL 5.7 Tutorial Dutch PHP Conference 2015MySQL 5.7 Tutorial Dutch PHP Conference 2015
MySQL 5.7 Tutorial Dutch PHP Conference 2015
 
MySQL 5.7. Tutorial - Dutch PHP Conference 2015
MySQL 5.7. Tutorial - Dutch PHP Conference 2015MySQL 5.7. Tutorial - Dutch PHP Conference 2015
MySQL 5.7. Tutorial - Dutch PHP Conference 2015
 
Python SQite3 database Tutorial | SQlite Database
Python SQite3 database Tutorial | SQlite DatabasePython SQite3 database Tutorial | SQlite Database
Python SQite3 database Tutorial | SQlite Database
 
Basic MySQL Troubleshooting for Oracle Database Administrators
Basic MySQL Troubleshooting for Oracle Database AdministratorsBasic MySQL Troubleshooting for Oracle Database Administrators
Basic MySQL Troubleshooting for Oracle Database Administrators
 
10 Reasons to Start Your Analytics Project with PostgreSQL
10 Reasons to Start Your Analytics Project with PostgreSQL10 Reasons to Start Your Analytics Project with PostgreSQL
10 Reasons to Start Your Analytics Project with PostgreSQL
 
Compute 101 - OpenStack Summit Vancouver 2015
Compute 101 - OpenStack Summit Vancouver 2015Compute 101 - OpenStack Summit Vancouver 2015
Compute 101 - OpenStack Summit Vancouver 2015
 
Percona xtra db cluster(pxc) non blocking operations, what you need to know t...
Percona xtra db cluster(pxc) non blocking operations, what you need to know t...Percona xtra db cluster(pxc) non blocking operations, what you need to know t...
Percona xtra db cluster(pxc) non blocking operations, what you need to know t...
 
Confoo 2021 -- MySQL New Features
Confoo 2021 -- MySQL New FeaturesConfoo 2021 -- MySQL New Features
Confoo 2021 -- MySQL New Features
 
Openstack 101
Openstack 101Openstack 101
Openstack 101
 
Robert Pankowecki - Czy sprzedawcy SQLowych baz nas oszukali?
Robert Pankowecki - Czy sprzedawcy SQLowych baz nas oszukali?Robert Pankowecki - Czy sprzedawcy SQLowych baz nas oszukali?
Robert Pankowecki - Czy sprzedawcy SQLowych baz nas oszukali?
 

Más de Noriyoshi Shinoda

Más de Noriyoshi Shinoda (20)

PostgreSQL Unconference #5 ICU Collation
PostgreSQL Unconference #5 ICU CollationPostgreSQL Unconference #5 ICU Collation
PostgreSQL Unconference #5 ICU Collation
 
Babelfish Compatibility
Babelfish CompatibilityBabelfish Compatibility
Babelfish Compatibility
 
PostgreSQL Unconference #29 Unicode IVS
PostgreSQL Unconference #29 Unicode IVSPostgreSQL Unconference #29 Unicode IVS
PostgreSQL Unconference #29 Unicode IVS
 
PostgreSQL Conference Japan 2021 B2 Citus 10
PostgreSQL Conference Japan 2021 B2 Citus 10PostgreSQL Conference Japan 2021 B2 Citus 10
PostgreSQL Conference Japan 2021 B2 Citus 10
 
Citus 10 verification result (Japanese)
Citus 10 verification result (Japanese)Citus 10 verification result (Japanese)
Citus 10 verification result (Japanese)
 
PostgreSQL Unconference #26 No Error on PostgreSQL
PostgreSQL Unconference #26 No Error on PostgreSQLPostgreSQL Unconference #26 No Error on PostgreSQL
PostgreSQL Unconference #26 No Error on PostgreSQL
 
PostgreSQL 14 Beta1 New Features with Examples (Japanese)
PostgreSQL 14 Beta1 New Features with Examples (Japanese)PostgreSQL 14 Beta1 New Features with Examples (Japanese)
PostgreSQL 14 Beta1 New Features with Examples (Japanese)
 
PostgreSQL 14 Beta 1 New Features with Examples (English)
PostgreSQL 14 Beta 1 New Features with Examples (English)PostgreSQL 14 Beta 1 New Features with Examples (English)
PostgreSQL 14 Beta 1 New Features with Examples (English)
 
Add PLEASE clause to Oracle Database
Add PLEASE clause to Oracle DatabaseAdd PLEASE clause to Oracle Database
Add PLEASE clause to Oracle Database
 
PostgreSQL 12 New Features with Examples (English) GA
PostgreSQL 12 New Features with Examples (English) GAPostgreSQL 12 New Features with Examples (English) GA
PostgreSQL 12 New Features with Examples (English) GA
 
db tech showcase 2019 D10 Oracle Database New Features
db tech showcase 2019 D10 Oracle Database New Featuresdb tech showcase 2019 D10 Oracle Database New Features
db tech showcase 2019 D10 Oracle Database New Features
 
EDB Postgres Vision 2019
EDB Postgres Vision 2019 EDB Postgres Vision 2019
EDB Postgres Vision 2019
 
PostgreSQL 12 Beta 1 New Features with Examples (Japanese)
PostgreSQL 12 Beta 1 New Features  with Examples (Japanese)PostgreSQL 12 Beta 1 New Features  with Examples (Japanese)
PostgreSQL 12 Beta 1 New Features with Examples (Japanese)
 
PostgreSQL 12 Beta 1 New Features with Examples (English)
PostgreSQL 12 Beta 1 New Features with Examples (English)PostgreSQL 12 Beta 1 New Features with Examples (English)
PostgreSQL 12 Beta 1 New Features with Examples (English)
 
Let's scale-out PostgreSQL using Citus (Japanese)
Let's scale-out PostgreSQL using Citus (Japanese)Let's scale-out PostgreSQL using Citus (Japanese)
Let's scale-out PostgreSQL using Citus (Japanese)
 
PostgreSQL 11 New Features With Examples (Japanese)
PostgreSQL 11 New Features With Examples (Japanese)PostgreSQL 11 New Features With Examples (Japanese)
PostgreSQL 11 New Features With Examples (Japanese)
 
PostgreSQL 11 New Features With Examples (English)
PostgreSQL 11 New Features With Examples (English)PostgreSQL 11 New Features With Examples (English)
PostgreSQL 11 New Features With Examples (English)
 
Citus 7.5 Beta 検証結果
Citus 7.5 Beta 検証結果Citus 7.5 Beta 検証結果
Citus 7.5 Beta 検証結果
 
PostgreSQL 11 New Features Japanese version (Beta 1)
PostgreSQL 11 New Features Japanese version (Beta 1)PostgreSQL 11 New Features Japanese version (Beta 1)
PostgreSQL 11 New Features Japanese version (Beta 1)
 
PostgreSQL 11 New Features English version (Beta 1)
PostgreSQL 11 New Features English version (Beta 1)PostgreSQL 11 New Features English version (Beta 1)
PostgreSQL 11 New Features English version (Beta 1)
 

Último

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 

Último (20)

Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 

Let's scale-out PostgreSQL using Citus (English)

  • 1. Let’s scale-out PostgreSQL using Citus Noriyoshi Shinoda November 22, 2018 PostgreSQL Conference Japan 2018
  • 2. Speaker Noriyoshi Shinoda Affiliated company • Hewlett-Packard Enterprise Japan Current work • System design, tuning, consulting on PostgreSQL, Oracle Database, Microsoft SQL Server, Vertica, Sybase ASE, etc. related to RDBMS • Oracle ACE • Written 15 books related to Oracle Database • Investigation and verification on open source products URL • Published documents http://slideshare.net/noriyoshishinoda/ • Oracle ACE https://apex.oracle.com/pls/apex/f?p=19297:4:::NO:4:P4_ID:2780 2
  • 3. Agenda What’s Citus? Let’s try! Architecture Restriction Behavior when trouble occurs This slides is based on Citus Community Edition 8.0-8 3
  • 5. What’s Citus? What’s Citus? 5  Achieves scale-out environment for PostgreSQL • Parallel query and partitioning feature across multiple nodes Implemented as PostgreSQL Extension Developed by Citusdata • Community Edition is open source • To use features such as online rebalancing, Enterprise Edition is required It does not include the following features • Automatic failover • Automatic data rebalance • Operational features such as backup
  • 6. What’s Citus? Instance configuration 6 Coordinator Node • PostgreSQL instance that accepts connections from client • Manage meta-data Worker Node • PostgreSQL instance that manages the actual data • Do not communicate between Worker Nodes Install citus EXTENSION on all nodes Client Data Data Data Meta data Worker#1 Worker#2 Worker#3 citus citus citus Coordinator citus
  • 7. What’s Citus? citus EXTENSION Installation • Install on Coordinator Node and Worker Node • Execute the CREATE EXTENSION statement on all databases for creating tables • Installation binaries are the same on all nodes • Install extensions required for the application (such as pgcrypto) also to all nodes 7
  • 8. What’s Citus? Table configuration 8 Distributed Table • Table that stores the data distributedly • Suitable for fact table • Specify a column as distribution key (determined distribution destination table by the scope of hash values) • Specify the number of partitions Parameter citus.shard_count (default 32) • Can create replicas on difference Worker Nodes Parameter citus.shard_replication_factor (default 1 = No replica provided) "DEF" "GHI" Worker#1 Worker#2 Worker#3 citus citus citus Coordinator citus "ABC" "DEF" "GHI""ABC"
  • 9. What’s Citus? Table configuration 9 Reference Table • Table that stores the same data in all nodes • Suitable for dimension table "ABC" "ABC" "ABC" Worker#1 Worker#2 Worker#3 citus citus citus Coordinator citus
  • 11. Let’s try! Install 11 $ ./configure $ make # make install Build from source code postgres=# SELECT * FROM pg_available_extensions WHERE name='citus’; -[ RECORD 1 ]-----+--------------------------- name | citus default_version | 8.0-8 Installed_version | comment | Citus distributed database Confirmation of EXTENSION
  • 12. Let’s try! Operation on all instances 12 postgres=# SHOW shared_preload_libraries ; shared_preload_libraries -------------------------- citus (1 row) Specify 'citus' for parameter shared_preload_libraries postgres=# CREATE EXTENSION citus ; CREATE EXTENSION Loading EXTENSION
  • 13. Let’s try! Authentication settings 13 host all all coordhost1/32 trust Libpq connection from Coordinator Node to Worker Node There is no password authentication mechanism for communication between Coordinator Node and Worker Node Password less connection setup required Configuration example of pg_hba.conf file (Worker Node) Also possible to use .pgpass file (Coordinator Node) Authentication information can be specified in the parameter citus.node_conninfo (SSL setting etc)
  • 14. Let’s try! Operation on Coordinator Node 14 postgres=# SELECT master_add_node('wrkhost1', 5432) ; master_add_node ------------------------------------------------ (1,1,wrkhost1,5432,default,f,t,primary,default) (1 row) Register Worker Node to Coordinator Node (for each database) Specify host name and port number in master_add_node function
  • 15. Let’s try! Operation on Coordinator Node 15 postgres=> SELECT nodeid, nodename, isactive FROM pg_dist_node ; nodeid | nodename | isactive --------+-----------+---------- 1 | wrkhost1 | t 2 | wrkhost2 | t 3 | wrkhost3 | t (3 rows) Confirmation
  • 16. Let’s try! Create Distributed Table 16 postgres=> CREATE TABLE dist1(key1 NUMERIC, val1 VARCHAR(10)) ; CREATE TABLE postgres=> SELECT create_distributed_table('dist1', 'key1') ; create_distributed_table -------------------------- (1 row) postgres=> SET citus.shard_count = 6 ; SET postgres=> SET citus.shard_replication_factor = 2 ; SET Specifying the number of distributed tables and the number of replicas Example for table creation
  • 17. Let’s try! Create Distributed Table 17 Same configuration of the table is automatically created in the Worker Node • The table name is "{Origin table name}_{ShardID}" • TABLESPACE clause does not propagate Number of tables created on Worker Node • In the example of the previous slide, the distributed table# 6 x replica# 2 / Worker Node# 3 = 4 table is created
  • 18. Let’s try! Create Distributed Table 18 Due to replica settings, tables of the same name are created in different Worker Nodes. • The same data is stored in the same name table Coordinator Worker#1 Worker#2 Worker#3 dist1 dist1_102046 dist1_102046 dist1_102047 dist1_102047 dist1_102048 dist1_102048 dist1_102049 dist1_102049 dist1_102050 dist1_102050 dist1_102051 dist1_102051
  • 19. Let’s try! Create Reference Table 19 postgres=> CREATE TABLE ref1(key1 NUMERIC, val1 VARCHAR(10)) ; CREATE TABLE postgres=> SELECT create_reference_table('ref1') ; create_reference_table ------------------------ (1 row) Table creation postgres=> d List of relations Schema | Name | Type | Owner --------+--------------+-------+------- public | ref1_102084 | table | demo (1 row) Table confirmation on Worker Node
  • 21. Architecture Processes 21 bgworker: task tracker • Running on Coordinator Node and Worker Node bgworker: Citus Maintenance Daemon • Running on Coordinator Node and Worker Nod • Send usage data to https://reports.citusdata.com every 24 hours Backend processes • Start up on Worker Node by connecting from Coordinator Node • Execute SQL statement to Distributed Table or Reference Table • Execute dump_local_wait_edges function every 2 seconds • Search pg_prepared_xacts view every minute
  • 22. Architecture Session management 22 Connection between Coordinator Node and Worker Node • Establishing session when executing the first SQL statement in the transaction • Disconnect when transaction is completed • Connection pool is not used (Available on Enterprise Edition) Client Worker#2 citus Coordinator citus ③ CONNECT ④ BEGIN ⑤ UPDATE ⑦ COMMIT ⑧ DISCONNECT ① BEGIN ② UPDATE ⑥ COMMIT
  • 23. Architecture Executed SQL 23 Process to be executed on Coordinator Node • Final sort (ORDER BY) • Operation of SEQUENCE (Include SERIAL column and GENERATED AS IDENTITY column) • Processing objects other than tables and indexes Process to propagate to Worker Node • VACUUM statement • ANALYZE statement • CREATE INDEX statement • ALTER TABLE statement (some restrictions) Other DML • Select the table on the Worker Node that issues the SQL statement by specifying the distributed key column
  • 24. Architecture Executed SQL 24 PostgreSQL API to submit SQL to Worker Node • PQsendQuery • PQsendQueryParams • Use asynchronous API
  • 25. Architecture Executed SQL 25 When the distributed key string can be specified SQL executed by the application ⇒ SQL executed on Worker Node SELECT * FROM dist1 WHERE key1 = 20 SELECT key1, val1 FROM public.dist1_102131 dist1 WHERE (key1 OPERATOR(pg_catalog.=) 20) UPDATE dist1 SET val1 = 'update' WHERE key1 = 20 UPDATE public.dist1_102131 dist1 SET val1 = 'update'::character varying WHERE (key1 OPERATOR(pg_catalog.=) 20)
  • 26. Architecture Executed SQL 26 Access to specific Worker Node only when Distributed Key can be specified postgres=> EXPLAIN SELECT * FROM dist1 WHERE key1 = 1000 ; QUERY PLAN ------------------------------------------------------------------- Custom Scan (Citus Router) (cost=0.00..0.00 rows=0 width=0) Task Count: 1 Tasks Shown: All -> Task Node: host=wrkhost1 port=5432 dbname=postgres -> Seq Scan on dist1_102078 dist1 (cost=0.00..2973.04 rows=1 width=12) Filter: (key1 = '1000'::numeric) (7 rows)
  • 27. Architecture Executed SQL 27 When the distributed key can not be specified SQL executed by the application ⇒ SQL executed on Worker Node • Changing the table name and putting it to all Worker Nodes SELECT * FROM dist1 WHERE key1 != 20 COPY (SELECT key1, val1 FROM dist1_102146 dist1 WHERE (key1 OPERATOR(pg_catalog.<>) 20)) TO STDOUT COPY (SELECT key1, val1 FROM dist1_102147 dist1 WHERE (key1 OPERATOR(pg_catalog.<>) 20)) TO STDOUT …
  • 28. Architecture Executed SQL 28 Execution plan when the distributed key can not be specified postgres=> SET citus.explain_all_tasks = on ; SET postgres=> EXPLAIN SELECT * FROM dist1 ; QUERY PLAN ------------------------------------------------------------------- Aggregate (cost=0.00..0.00 rows=0 width=0) -> Custom Scan (Citus Real-Time) (cost=0.00..0.00 rows=0 width=0) Task Count: 6 Tasks Shown: All -> Task Node: host=wrkhost1 port=5001 dbname=demodb -> Aggregate (cost=2973.04..2973.05 rows=1 width=8) …
  • 29. Architecture Executed SQL 29 Joining Distributed Table and Reference Table • Joining in Worker Node SQL executed by the application ⇒ SQL executed on Worker Node SELECT * FROM dist1 d1 INNER JOIN ref1 r1 ON d1.key1=r1.key1 WHERE d1.key1=2 SELECT d1.key1, d1.val1, r1.key1, r1.val1 FROM (public.dist1_102221 d1 JOIN public.ref1_102084 r1 ON ((d1.key1 OPERATOR(pg_catalog.=) r1.key1))) WHERE (d1.key1 OPERATOR(pg_catalog.=) (2)::numeric)
  • 30. Architecture Executed SQL 30 Joining Distributed Tables • Error may occur by default • Executable by specifying the parameter citus.enable_repartition_joins to 'on’ Recommends placing the table on the same node with the same column value CREATE TABLE event(tenant_id INT, event_id BIGINT, … ) ; SELECT create_distributed_table('event', 'tenant_id') ; CREATE TABLE page(tenant_id INT, page_id INT, … ) ; SELECT create_distributed_table('page', 'tenant_id', colocate_with => 'event') ;
  • 31. Architecture Parameter settings 31 Major parameters (38 in total) • citus.node_connection_timeout • citus.partition_buffer_size • citus.recover_2pc_interval • citus.remote_task_check_interval • citus.shard_count • citus.shard_max_size • citus.shard_placement_policy • citus.shard_replication_factor • citus.subquery_pushdown • citus.task_assignment_policy • citus.task_executor_type • citus.task_tracker_delay • citus.use_secondary_nodes • …
  • 32. Architecture Catalogs 32 Major catalogs (11 in total) Catalog name Description pg_dist_authinfo Store connection information (Enterprise Edition) pg_dist_colocation Co-location groups is stored pg_dist_node Information about the worker nodes pg_dist_node_metadata Information about ServerID pg_dist_partition Information about the distribution column pg_dist_placement State of shard placements pg_dist_poolinfo Information about Connection pooling (Enerprise Edition) pg_dist_shard Information about distributed table
  • 34. Restriction SQL can not be executed 34 The following syntax can not be executed for Distributed Table • Updating distributed key columns (UPDATE / INSERT ON CONFLICT) • SELECT FOR UPDATE statement (if creating a replica) • TABLESAMPLE clause • WITH RECURSIVE clause • Generate_series function for INSERT VALUES statement Described in the manual postgres=> BEGIN ; BEGIN postgres=> SELECT * FROM dist1 WHERE key1=100 FOR UPDATE ; ERROR: could not run distributed query with FOR UPDATE/SHARE commands HINT: Consider using an equality filter on the distributed table's partition column.
  • 35. Restriction Restricted SQL 35 The following syntax is restricted for execution • Correlated subquery • GROUPING SETS clause • PARTITION BY clause • Joining Local Table and Distributed Table • Trigger created on Coordinator Node • INSERT SELECT ON CONFLICT statement postgres=> SELECT COUNT(*) FROM dist1 d INNER JOIN local1 l ON d.key1 = l.key1 ; ERROR: relation local1 is not distributed postgres=> SELECT COUNT(*) FROM dist1 d INNER JOIN (SELECT * FROM local1) l ON d.key1 = l.key1 ;
  • 36. Restriction ISOLATION LEVEL 36 Connection from client to Coordinator Node • No restriction Connection from Coordinator Node to Worker Node • READ COMMITED (hard-coded) BEGIN Worker#1 Worker#2 Worker#3 citus citus citus BEGIN Coordinator citus SERIALIZABLE REPEATABLE READ READ COMMITED READ UNCOMMITED READ COMMITED Client
  • 37. Restriction ALTER TABLE statement 37 ALTER TABLE statement can only be executed as follows • Add / Drop columns • Setting restriction • Partition administration • Modify data type of columns Control automatic propagation of DDL • Parameter citus.enable_ddl_propagation (default 'on') postgres=> ALTER TABLE dist1 SET UNLOGGED ; ERROR: alter table command is currently unsupported DETAIL: Only ADD|DROP COLUMN, SET|DROP NOT NULL, SET|DROP DEFAULT, ADD|DROP CONSTRAINT, SET (), RESET (), ATTACH|DETACH PARTITION and TYPE subcommands are supported.
  • 38. Restriction SQL that does not propagate 38 DATABASE and USER information should be identical in all nodes • CREATE USER / CREATE DATABASE statement does not propagate • Warning is output • Function run_command_on_workers is provided to execute SQL statements on Worker Node postgres=# CREATE DATABASE demodb ; NOTICE: Citus partially supports CREATE DATABASE for distributed databases DETAIL: Citus does not propagate CREATE DATABASE command to workers HINT: You can manually create a database and its extensions on workers. CREATE DATABASE
  • 40. Behavior when trouble occurs Coordinator Node down 40 Client can not connect if Coordinator Node down Automatic failover feature is not provided Streaming Replication + Clusterware are required
  • 41. Behavior when trouble occurs Worker Node down 41 When the Worker Node is down, SQL that updates the entire data including the stopped node can not be executed postgres=> DELETE FROM dist1 ; ERROR: connection error: wrkhost1:5432 DETAIL: could not connect to server: Connection refused Is the server running on host "workhost1" (192.168.1.101) and accepting TCP/IP connections on port 5432? SQL which operates other than the data of the stopped node is warned but can be executed postgres=> SELECT COUNT(*) FROM dist1 WHERE key1 = 100 ; WARNING: connection error: wrkhost1:5432 DETAIL: could not send data to server: Connection refused Could not send startup packet: Connection refused …
  • 42. Behavior when trouble occurs Worker Node down 42 Warnings are output for records with replicas, but updatable • Maintenance of replica of stopped Worker Node is not maintained postgres=> DELETE FROM dist1 WHERE key1 = 1 ; WARNING: connection error: wrkhost1:5432 DETAIL: could not send data to server: Connection refused could not send SSL negotiation packet: Connection refused DELETE 1
  • 43. Behavior when trouble occurs Combined with streaming replication 43 Streaming replication environment for load balancing is available • Create streaming replication environment for all nodes • Set citus.use_secondary_nodes = 'always' for Standby instance of Coordinator Node • SELECT statement is executed on the standby instance of the Worker Node • Standby instance is registered in master_add_secondary_node function Worker#1 Worker#2Client Primary Coordinator Worker#1 Worker#2Standby Worker#3 Client Standby Coordinator Primary Worker#3 Streaming Replication citus.use_secondary_nodes = 'always'
  • 45. Summary Some restriction exists, but easy to scale-out 45 It is relatively easy to build a scale-out environment Possibility of performance improvement by parallel query + partitioning across nodes It is necessary to implement of fault tolerance and backup by yourself Because restrictions of SQL statement exist, prior application verification is recommended Please note the difference from Enterprise Edition
  • 46. Summary Reference information URL 46 Product manuals https://docs.citusdata.com/en/v8.0/ Performance comparison video https://www.youtube.com/watch?v=g3H4nGsJsl0 Getting Started https://docs.citusdata.com/en/v8.0/portals/getting_started.html Use Cases https://docs.citusdata.com/en/v8.0/portals/use_cases.html API / Reference https://docs.citusdata.com/en/v8.0/portals/reference.html GitHub https://github.com/citusdata/citus