SlideShare a Scribd company logo
1 of 38
Download to read offline
PostgreSQL, The Big, The Fast
and The (NOSQL on ) Acid
Federico Campoli
21 May 2015
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 1 / 34
Table of contents
1 The Big
2 The Fast
3 The (NOSQL on) Acid
4 Wrap up
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 2 / 34
Table of contents
1 The Big
2 The Fast
3 The (NOSQL on) Acid
4 Wrap up
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 3 / 34
The Big
Image by Caitlin - https://www.flickr.com/photos/lizard queen
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 4 / 34
PostgreSQL, an history of excellence
Created at Berkeley in 1982 by database’s legend Prof. Stonebraker
In the 1994 Andrew Yu and Jolly Chen added the SQL interpreter
In the 1996 becomes an Open Source project.
The project’s name changes in PostgreSQL
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 5 / 34
PostgreSQL, an history of excellence
Created at Berkeley in 1982 by database’s legend Prof. Stonebraker
In the 1994 Andrew Yu and Jolly Chen added the SQL interpreter
In the 1996 becomes an Open Source project.
The project’s name changes in PostgreSQL
Fully ACID compliant
High performance in read/write with the MVCC
Tablespaces
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 5 / 34
PostgreSQL, an history of excellence
Created at Berkeley in 1982 by database’s legend Prof. Stonebraker
In the 1994 Andrew Yu and Jolly Chen added the SQL interpreter
In the 1996 becomes an Open Source project.
The project’s name changes in PostgreSQL
Fully ACID compliant
High performance in read/write with the MVCC
Tablespaces
Runs on almost any unix flavour
From the version 8.0 is native on *cough* MS Windows *cough*
HA with hot standby and streaming replication
Heterogeneous federation
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 5 / 34
PostgreSQL, an history of excellence
Created at Berkeley in 1982 by database’s legend Prof. Stonebraker
In the 1994 Andrew Yu and Jolly Chen added the SQL interpreter
In the 1996 becomes an Open Source project.
The project’s name changes in PostgreSQL
Fully ACID compliant
High performance in read/write with the MVCC
Tablespaces
Runs on almost any unix flavour
From the version 8.0 is native on *cough* MS Windows *cough*
HA with hot standby and streaming replication
Heterogeneous federation
Procedural languages (pl/pgsql, pl/python, pl/perl...)
Support for NOSQL features like HSTORE and JSON
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 5 / 34
Development
Old ugly C language
New development cycle starts usually in June
New version released usually by the end of the year
At least 4 LTS versions
Can be extended using shared libraries
Extensions (from the version9.1)
BSD like license
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 6 / 34
Limits
Database size. No limits.
Table size, 32 TB
Row size 1.6 TB
Rows in table. No limits.
Fields in table 250 - 1600 depending on data type.
Tables in a database. No limits.
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 7 / 34
Data types
Alongside the general purpose data types PostgreSQL have some exotic types.
Range (integers, date)
Geometric (points, lines etc.)
Network addresses
XML
JSON
HSTORE (extension)
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 8 / 34
Table of contents
1 The Big
2 The Fast
3 The (NOSQL on) Acid
4 Wrap up
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 9 / 34
The Fast
Image by Hein Waschefort -
http://commons.wikimedia.org/wiki/User:Hein waschefort
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 10 / 34
Page layout
A PostgreSQL’s data file is an array of fixed length blocks called pages. The
default size is 8kb.
Figure : Index page
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 11 / 34
Page layout
Each page have an header used to enforce the durability, and the optional page’s
checksum. There are some pointers used to track the free space inside the page.
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 12 / 34
Tuple layout
Just after the header there is a list of pointers to the physical tuples stored in the
page’s end. Each tuple is and array of raw data, called datum. The nature of this
datum is unknown to the postgres process. The datum becomes the data type
when PostgreSQL loads the page in memory. This requires a system catalogue
look up.
Figure : Tuple structure
The tuple’s header is used in the MVCC.
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 13 / 34
The magic of the MVCC
Any operation in PostgreSQL happens through transactions.
By default when a single statement is successfully completed the database
commits automatically the transaction.
It’s possible to wrap multiple statements in a single transaction using the
keywords [BEGIN;]....... [COMMIT; ROLLBACK]
The minimal possible level the transaction isolation is READ COMMITTED.
PostgreSQL from 9.2 supports the snapshot export to other sessions.
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 14 / 34
Consistency
The PostgreSQL’s consistency is achieved using the Multi Version Concurrency
Control (MVCC).
Its logic is theoretically simple.
A 4 byte unsigned integer called xid is incremented by 1 and assigned to the
current transaction.
The committed xid which value is lesser than the current xid are in the past
and then visible to the current transaction.
The xid greater than the current xid are in the future and then invisible to
the current session.
The commit status is managed in the $PGDATA using the directory pg clog
and pg serial where small 8k files are used to tracks the transaction statuses.
The the xid match is performed using the fields t xmin,t xmax in the tuple’s
header.
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 15 / 34
Tuple’s header
t xmin is set to the xid generated at tuple insert
t xmax is set to the xid generated at tuple delete
t cid is used to track the command’s sequence inside the same transaction
t cid basically solves the Halloween problem, where an update operation causes a
change in the physical location of a row, potentially allowing the row to be visited
more than once during the operation.
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 16 / 34
Tuple’s header
t xmin is set to the xid generated at tuple insert
t xmax is set to the xid generated at tuple delete
t cid is used to track the command’s sequence inside the same transaction
t cid basically solves the Halloween problem, where an update operation causes a
change in the physical location of a row, potentially allowing the row to be visited
more than once during the operation.
But there’s something missing, isn’t it?
Where is the field to store the UPDATE xid?
Figure : Tuple structure
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 16 / 34
There’s no such thing like an update
Well, PostgreSQL actually NEVER performs an update.
When an UPDATE statement is issued the updated rows are inserted with t xmin
set to the current XID value.
The old rows versions are marked as dead writing the t xmax field with the
current transaction id.
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 17 / 34
Dead tuples and VACUUM
A dead tuple is left in place for any transaction that should see it. This adds
overhead to any I/O operation.
VACUUM clears the dead tuples
VACUUM is designed to have the minimal impact on the database normal
activity
VACUUM removes only dead tuples no longer visible to the open transactions
Running VACUUM on the entire cluster at least every 2 billions transactions
is compulsory
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 18 / 34
Failure is not an option
XID is a 4 byte unsigned integer.
Every 4 billions transactions the value wraps
PostgreSQL uses the modulo − 231
comparison method
For each value 2 billions XID are in the future and 2 billions are in the past
When a xid’s age becomes too close to 2 billions VACUUM freezes the xmin
value to an hardcoded xid forever in the past
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 19 / 34
Failure is not an option
If for any reason an xid reaches 10 millions transactions from the wraparound
failure the database starts emitting scary messages
WARNING: database "mydb" must be vacuumed within 177009986 transactions
HINT: To avoid a database shutdown, execute a database-wide VACUUM in "mydb".
If a xid’s age reaches 1 million transactions from the wraparound failure
PostgreSQL shuts down and doesn’t restarts. When this happens the only option
to get back the cluster running is to use the single user mode and perform a
cluster’s VACUUM.
Anyway, the autovacuum daemon, even if is turned off, take care of the
problematic relations long before this catastrophic scenario happens.
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 20 / 34
Table of contents
1 The Big
2 The Fast
3 The (NOSQL on) Acid
4 Wrap up
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 21 / 34
The (NOSQL on) Acid
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 22 / 34
JSON
JSON - JavaScript Object Notation
The version 9.2 adds JSON as native data type
The version 9.3 adds the support functions for JSON
JSON is stored as text
JSON is parsed and validated on the fly
The 9.4 adds JSONB (binary) data type
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 23 / 34
JSON
JSON - Examples
From record to JSON
postgres=# SELECT row_to_json(ROW(1,’foo’));
row_to_json
---------------------
{"f1":1,"f2":"foo"}
(1 row)
Expanding JSON into key to value elements
postgres=# SELECT * from json_each(’{"a":"foo", "b":"bar"}’);
key | value
-----+-------
a | "foo"
b | "bar"
(2 rows)
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 24 / 34
HSTORE
HSTORE is a custom data type used to store key to value items
Is an extension
Data stored as text
A shared library does the magic transforming the datum in HSTORE
Is similar to JSON without nested elements
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 25 / 34
HSTORE
HSTORE - Examples
From record to HSTORE
postgres=# SELECT hstore(ROW(1,2));
hstore
----------------------
"f1"=>"1", "f2"=>"2"
(1 row)
HSTORE expansion to key to value elements
postgres=# SELECT * FROM each(’a=>1,b=>2’);
key | value
-----+-------
a | 1
b | 2
(2 rows)
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 26 / 34
JSON and HSTORE
There is a subtile difference between HSTORE and JSON. HSTORE is not a
native data type.
The JSON is a native data type and the conversion happens inside the postgres
process.
The HSTORE requires the access to the shared library.
Because the conversion from the raw datum happens for each tuple loaded in the
shared buffer can affect the performance’s overall.
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 27 / 34
JSONB
Because JSON is parsed and validated on the fly and and this can be a bottleneck.
The new JSONB introduced with PostgreSQL 9.4 is parsed, validated and
transformed at insert/update’s time. The access is then faster than the plain
JSON but the storage cost can be higher.
The functions available for JSON are working with JSONB
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 28 / 34
Table of contents
1 The Big
2 The Fast
3 The (NOSQL on) Acid
4 Wrap up
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 29 / 34
Wrap up
Schema less data are useful. They are flexible and powerful.
Ignoring the MVCC implementation lead to disasters.
The lack of horizontal scalability in PostgreSQL can be a serious problem.
An interesting project for a distributed cluster is PostgreSQL XL -
http://www.postgres-xl.org/
Unfortunately at the date of writing is still in RC
Never forget PostgreSQL is a RDBMS
Get a DBA on board
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 30 / 34
Questions
Questions?
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 31 / 34
Contacts
Twitter: 4thdoctor scarf
PostgreSQL’s blog: http://www.pgdba.co.uk
PostgreSQL Book:
http://www.slideshare.net/FedericoCampoli/postgresql-dba-01
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 32 / 34
License and copyright
This presentation is licensed under the terms of the Creative Commons
Attribution NonCommercial ShareAlike 4.0
http://creativecommons.org/licenses/by-nc-sa/4.0/
The elephant photo is copyright by Caitlin -
https://www.flickr.com/photos/lizard queen
The cheetah photo is copyright by Hein Waschefort -
http://commons.wikimedia.org/wiki/User:Hein waschefort
The elephant logos are copyright of the PostgreSQL Global Development
Group - http://www.postgresql.org/
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 33 / 34
PostgreSQL, The Big, The Fast
and The (NOSQL on ) Acid
Federico Campoli
21 May 2015
Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 34 / 34

More Related Content

What's hot

Introduction to PostgreSQL
Introduction to PostgreSQLIntroduction to PostgreSQL
Introduction to PostgreSQL
Mark Wong
 

What's hot (20)

Pg big fast ugly acid
Pg big fast ugly acidPg big fast ugly acid
Pg big fast ugly acid
 
The ninja elephant, scaling the analytics database in Transwerwise
The ninja elephant, scaling the analytics database in TranswerwiseThe ninja elephant, scaling the analytics database in Transwerwise
The ninja elephant, scaling the analytics database in Transwerwise
 
Hitchikers guide handout
Hitchikers guide handoutHitchikers guide handout
Hitchikers guide handout
 
PostgreSql query planning and tuning
PostgreSql query planning and tuningPostgreSql query planning and tuning
PostgreSql query planning and tuning
 
Don't panic! - Postgres introduction
Don't panic! - Postgres introductionDon't panic! - Postgres introduction
Don't panic! - Postgres introduction
 
Streaming replication
Streaming replicationStreaming replication
Streaming replication
 
pg_chameleon a MySQL to PostgreSQL replica
pg_chameleon a MySQL to PostgreSQL replicapg_chameleon a MySQL to PostgreSQL replica
pg_chameleon a MySQL to PostgreSQL replica
 
pg_chameleon MySQL to PostgreSQL replica made easy
pg_chameleon  MySQL to PostgreSQL replica made easypg_chameleon  MySQL to PostgreSQL replica made easy
pg_chameleon MySQL to PostgreSQL replica made easy
 
Introduction to PostgreSQL
Introduction to PostgreSQLIntroduction to PostgreSQL
Introduction to PostgreSQL
 
Solving PostgreSQL wicked problems
Solving PostgreSQL wicked problemsSolving PostgreSQL wicked problems
Solving PostgreSQL wicked problems
 
Our answer to Uber
Our answer to UberOur answer to Uber
Our answer to Uber
 
Open Source SQL databases enter millions queries per second era
Open Source SQL databases enter millions queries per second eraOpen Source SQL databases enter millions queries per second era
Open Source SQL databases enter millions queries per second era
 
SFScon 2020 - Matteo Ghetta - DataPlotly - D3-like plots in QGIS
SFScon 2020 - Matteo Ghetta - DataPlotly - D3-like plots in QGISSFScon 2020 - Matteo Ghetta - DataPlotly - D3-like plots in QGIS
SFScon 2020 - Matteo Ghetta - DataPlotly - D3-like plots in QGIS
 
Rank Your Results with PostgreSQL Full Text Search (from PGConf2015)
Rank Your Results with PostgreSQL Full Text Search (from PGConf2015)Rank Your Results with PostgreSQL Full Text Search (from PGConf2015)
Rank Your Results with PostgreSQL Full Text Search (from PGConf2015)
 
NoSQL and Triple Stores
NoSQL and Triple StoresNoSQL and Triple Stores
NoSQL and Triple Stores
 
CPANci: Continuous Integration for CPAN
CPANci: Continuous Integration for CPANCPANci: Continuous Integration for CPAN
CPANci: Continuous Integration for CPAN
 
Postgresql on NFS - J.Battiato, pgday2016
Postgresql on NFS - J.Battiato, pgday2016Postgresql on NFS - J.Battiato, pgday2016
Postgresql on NFS - J.Battiato, pgday2016
 
Cassandra introduction mars jug
Cassandra introduction mars jugCassandra introduction mars jug
Cassandra introduction mars jug
 
Week7 bean life cycle
Week7 bean life cycleWeek7 bean life cycle
Week7 bean life cycle
 
ADAM
ADAMADAM
ADAM
 

Viewers also liked

Menopause nine by sanjana
Menopause nine by sanjana Menopause nine by sanjana
Menopause nine by sanjana
sanjukpt92
 
Catálogo de bolsas Chenson - Cristal cosmetic
Catálogo de bolsas Chenson - Cristal cosmeticCatálogo de bolsas Chenson - Cristal cosmetic
Catálogo de bolsas Chenson - Cristal cosmetic
catalagocristalcosmetic
 
SISTEMA DE IDENTIDADE VISUAL - SIV - CANNIBAL Sex Shop
SISTEMA DE IDENTIDADE VISUAL - SIV - CANNIBAL Sex ShopSISTEMA DE IDENTIDADE VISUAL - SIV - CANNIBAL Sex Shop
SISTEMA DE IDENTIDADE VISUAL - SIV - CANNIBAL Sex Shop
Marcio Lampert
 

Viewers also liked (15)

Postgresql database administration volume 1
Postgresql database administration volume 1Postgresql database administration volume 1
Postgresql database administration volume 1
 
Menopause nine by sanjana
Menopause nine by sanjana Menopause nine by sanjana
Menopause nine by sanjana
 
Su strade final
Su strade finalSu strade final
Su strade final
 
Media planning
Media planningMedia planning
Media planning
 
καστορια
καστοριακαστορια
καστορια
 
Catálogo de bolsas Chenson - Cristal cosmetic
Catálogo de bolsas Chenson - Cristal cosmeticCatálogo de bolsas Chenson - Cristal cosmetic
Catálogo de bolsas Chenson - Cristal cosmetic
 
Vacuum precision positioning systems brochure
Vacuum precision positioning systems brochureVacuum precision positioning systems brochure
Vacuum precision positioning systems brochure
 
MuCEM
MuCEMMuCEM
MuCEM
 
SISTEMA DE IDENTIDADE VISUAL - SIV - CANNIBAL Sex Shop
SISTEMA DE IDENTIDADE VISUAL - SIV - CANNIBAL Sex ShopSISTEMA DE IDENTIDADE VISUAL - SIV - CANNIBAL Sex Shop
SISTEMA DE IDENTIDADE VISUAL - SIV - CANNIBAL Sex Shop
 
Nclex 5
Nclex 5Nclex 5
Nclex 5
 
Media planning f
Media planning fMedia planning f
Media planning f
 
My favorite movember mustaches
My favorite movember mustachesMy favorite movember mustaches
My favorite movember mustaches
 
Cl teleflostrainers
Cl teleflostrainersCl teleflostrainers
Cl teleflostrainers
 
PostgreSQL, The Big, The Fast and The Ugly
PostgreSQL, The Big, The Fast and The UglyPostgreSQL, The Big, The Fast and The Ugly
PostgreSQL, The Big, The Fast and The Ugly
 
Is Display Advertising Still a Healthy Form of Marketing?
Is Display Advertising Still a Healthy Form of Marketing?Is Display Advertising Still a Healthy Form of Marketing?
Is Display Advertising Still a Healthy Form of Marketing?
 

Similar to PostgreSQL, the big the fast and the (NOSQL on) Acid

Similar to PostgreSQL, the big the fast and the (NOSQL on) Acid (20)

Softshake 2013: Introduction to NoSQL with Couchbase
Softshake 2013: Introduction to NoSQL with CouchbaseSoftshake 2013: Introduction to NoSQL with Couchbase
Softshake 2013: Introduction to NoSQL with Couchbase
 
コミュニティ開発に参加しよう!
コミュニティ開発に参加しよう!コミュニティ開発に参加しよう!
コミュニティ開発に参加しよう!
 
Наш ответ Uber’у
Наш ответ Uber’уНаш ответ Uber’у
Наш ответ Uber’у
 
Spark Streaming Intro @KTech
Spark Streaming Intro @KTechSpark Streaming Intro @KTech
Spark Streaming Intro @KTech
 
Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7
 
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­ticaA noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
 
Comparison between rdbms and nosql
Comparison between rdbms and nosqlComparison between rdbms and nosql
Comparison between rdbms and nosql
 
Gossip & Key Value Store
Gossip & Key Value StoreGossip & Key Value Store
Gossip & Key Value Store
 
MySQL Parallel Replication: All the 5.7 and 8.0 Details (LOGICAL_CLOCK)
MySQL Parallel Replication: All the 5.7 and 8.0 Details (LOGICAL_CLOCK)MySQL Parallel Replication: All the 5.7 and 8.0 Details (LOGICAL_CLOCK)
MySQL Parallel Replication: All the 5.7 and 8.0 Details (LOGICAL_CLOCK)
 
Change Data Capture (CDC) With Kafka Connect® and the Debezium PostgreSQL Sou...
Change Data Capture (CDC) With Kafka Connect® and the Debezium PostgreSQL Sou...Change Data Capture (CDC) With Kafka Connect® and the Debezium PostgreSQL Sou...
Change Data Capture (CDC) With Kafka Connect® and the Debezium PostgreSQL Sou...
 
Postgres-XC: Symmetric PostgreSQL Cluster
Postgres-XC: Symmetric PostgreSQL ClusterPostgres-XC: Symmetric PostgreSQL Cluster
Postgres-XC: Symmetric PostgreSQL Cluster
 
OSDC 2018 | The Computer science behind a modern distributed data store by Ma...
OSDC 2018 | The Computer science behind a modern distributed data store by Ma...OSDC 2018 | The Computer science behind a modern distributed data store by Ma...
OSDC 2018 | The Computer science behind a modern distributed data store by Ma...
 
Operating and Supporting Delta Lake in Production
Operating and Supporting Delta Lake in ProductionOperating and Supporting Delta Lake in Production
Operating and Supporting Delta Lake in Production
 
Postgres clusters
Postgres clustersPostgres clusters
Postgres clusters
 
PostgreSQL Replication High Availability Methods
PostgreSQL Replication High Availability MethodsPostgreSQL Replication High Availability Methods
PostgreSQL Replication High Availability Methods
 
Go 1.8 Release Party
Go 1.8 Release PartyGo 1.8 Release Party
Go 1.8 Release Party
 
Using Elasticsearch as the Primary Data Store
Using Elasticsearch as the Primary Data StoreUsing Elasticsearch as the Primary Data Store
Using Elasticsearch as the Primary Data Store
 
PGConf APAC 2018 Keynote: PostgreSQL goes eleven
PGConf APAC 2018 Keynote: PostgreSQL goes elevenPGConf APAC 2018 Keynote: PostgreSQL goes eleven
PGConf APAC 2018 Keynote: PostgreSQL goes eleven
 
The computer science behind a modern disributed data store
The computer science behind a modern disributed data storeThe computer science behind a modern disributed data store
The computer science behind a modern disributed data store
 
PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs
 

Recently uploaded

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Recently uploaded (20)

CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptx
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 

PostgreSQL, the big the fast and the (NOSQL on) Acid

  • 1. PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid Federico Campoli 21 May 2015 Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 1 / 34
  • 2. Table of contents 1 The Big 2 The Fast 3 The (NOSQL on) Acid 4 Wrap up Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 2 / 34
  • 3. Table of contents 1 The Big 2 The Fast 3 The (NOSQL on) Acid 4 Wrap up Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 3 / 34
  • 4. The Big Image by Caitlin - https://www.flickr.com/photos/lizard queen Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 4 / 34
  • 5. PostgreSQL, an history of excellence Created at Berkeley in 1982 by database’s legend Prof. Stonebraker In the 1994 Andrew Yu and Jolly Chen added the SQL interpreter In the 1996 becomes an Open Source project. The project’s name changes in PostgreSQL Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 5 / 34
  • 6. PostgreSQL, an history of excellence Created at Berkeley in 1982 by database’s legend Prof. Stonebraker In the 1994 Andrew Yu and Jolly Chen added the SQL interpreter In the 1996 becomes an Open Source project. The project’s name changes in PostgreSQL Fully ACID compliant High performance in read/write with the MVCC Tablespaces Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 5 / 34
  • 7. PostgreSQL, an history of excellence Created at Berkeley in 1982 by database’s legend Prof. Stonebraker In the 1994 Andrew Yu and Jolly Chen added the SQL interpreter In the 1996 becomes an Open Source project. The project’s name changes in PostgreSQL Fully ACID compliant High performance in read/write with the MVCC Tablespaces Runs on almost any unix flavour From the version 8.0 is native on *cough* MS Windows *cough* HA with hot standby and streaming replication Heterogeneous federation Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 5 / 34
  • 8. PostgreSQL, an history of excellence Created at Berkeley in 1982 by database’s legend Prof. Stonebraker In the 1994 Andrew Yu and Jolly Chen added the SQL interpreter In the 1996 becomes an Open Source project. The project’s name changes in PostgreSQL Fully ACID compliant High performance in read/write with the MVCC Tablespaces Runs on almost any unix flavour From the version 8.0 is native on *cough* MS Windows *cough* HA with hot standby and streaming replication Heterogeneous federation Procedural languages (pl/pgsql, pl/python, pl/perl...) Support for NOSQL features like HSTORE and JSON Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 5 / 34
  • 9. Development Old ugly C language New development cycle starts usually in June New version released usually by the end of the year At least 4 LTS versions Can be extended using shared libraries Extensions (from the version9.1) BSD like license Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 6 / 34
  • 10. Limits Database size. No limits. Table size, 32 TB Row size 1.6 TB Rows in table. No limits. Fields in table 250 - 1600 depending on data type. Tables in a database. No limits. Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 7 / 34
  • 11. Data types Alongside the general purpose data types PostgreSQL have some exotic types. Range (integers, date) Geometric (points, lines etc.) Network addresses XML JSON HSTORE (extension) Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 8 / 34
  • 12. Table of contents 1 The Big 2 The Fast 3 The (NOSQL on) Acid 4 Wrap up Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 9 / 34
  • 13. The Fast Image by Hein Waschefort - http://commons.wikimedia.org/wiki/User:Hein waschefort Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 10 / 34
  • 14. Page layout A PostgreSQL’s data file is an array of fixed length blocks called pages. The default size is 8kb. Figure : Index page Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 11 / 34
  • 15. Page layout Each page have an header used to enforce the durability, and the optional page’s checksum. There are some pointers used to track the free space inside the page. Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 12 / 34
  • 16. Tuple layout Just after the header there is a list of pointers to the physical tuples stored in the page’s end. Each tuple is and array of raw data, called datum. The nature of this datum is unknown to the postgres process. The datum becomes the data type when PostgreSQL loads the page in memory. This requires a system catalogue look up. Figure : Tuple structure The tuple’s header is used in the MVCC. Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 13 / 34
  • 17. The magic of the MVCC Any operation in PostgreSQL happens through transactions. By default when a single statement is successfully completed the database commits automatically the transaction. It’s possible to wrap multiple statements in a single transaction using the keywords [BEGIN;]....... [COMMIT; ROLLBACK] The minimal possible level the transaction isolation is READ COMMITTED. PostgreSQL from 9.2 supports the snapshot export to other sessions. Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 14 / 34
  • 18. Consistency The PostgreSQL’s consistency is achieved using the Multi Version Concurrency Control (MVCC). Its logic is theoretically simple. A 4 byte unsigned integer called xid is incremented by 1 and assigned to the current transaction. The committed xid which value is lesser than the current xid are in the past and then visible to the current transaction. The xid greater than the current xid are in the future and then invisible to the current session. The commit status is managed in the $PGDATA using the directory pg clog and pg serial where small 8k files are used to tracks the transaction statuses. The the xid match is performed using the fields t xmin,t xmax in the tuple’s header. Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 15 / 34
  • 19. Tuple’s header t xmin is set to the xid generated at tuple insert t xmax is set to the xid generated at tuple delete t cid is used to track the command’s sequence inside the same transaction t cid basically solves the Halloween problem, where an update operation causes a change in the physical location of a row, potentially allowing the row to be visited more than once during the operation. Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 16 / 34
  • 20. Tuple’s header t xmin is set to the xid generated at tuple insert t xmax is set to the xid generated at tuple delete t cid is used to track the command’s sequence inside the same transaction t cid basically solves the Halloween problem, where an update operation causes a change in the physical location of a row, potentially allowing the row to be visited more than once during the operation. But there’s something missing, isn’t it? Where is the field to store the UPDATE xid? Figure : Tuple structure Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 16 / 34
  • 21. There’s no such thing like an update Well, PostgreSQL actually NEVER performs an update. When an UPDATE statement is issued the updated rows are inserted with t xmin set to the current XID value. The old rows versions are marked as dead writing the t xmax field with the current transaction id. Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 17 / 34
  • 22. Dead tuples and VACUUM A dead tuple is left in place for any transaction that should see it. This adds overhead to any I/O operation. VACUUM clears the dead tuples VACUUM is designed to have the minimal impact on the database normal activity VACUUM removes only dead tuples no longer visible to the open transactions Running VACUUM on the entire cluster at least every 2 billions transactions is compulsory Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 18 / 34
  • 23. Failure is not an option XID is a 4 byte unsigned integer. Every 4 billions transactions the value wraps PostgreSQL uses the modulo − 231 comparison method For each value 2 billions XID are in the future and 2 billions are in the past When a xid’s age becomes too close to 2 billions VACUUM freezes the xmin value to an hardcoded xid forever in the past Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 19 / 34
  • 24. Failure is not an option If for any reason an xid reaches 10 millions transactions from the wraparound failure the database starts emitting scary messages WARNING: database "mydb" must be vacuumed within 177009986 transactions HINT: To avoid a database shutdown, execute a database-wide VACUUM in "mydb". If a xid’s age reaches 1 million transactions from the wraparound failure PostgreSQL shuts down and doesn’t restarts. When this happens the only option to get back the cluster running is to use the single user mode and perform a cluster’s VACUUM. Anyway, the autovacuum daemon, even if is turned off, take care of the problematic relations long before this catastrophic scenario happens. Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 20 / 34
  • 25. Table of contents 1 The Big 2 The Fast 3 The (NOSQL on) Acid 4 Wrap up Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 21 / 34
  • 26. The (NOSQL on) Acid Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 22 / 34
  • 27. JSON JSON - JavaScript Object Notation The version 9.2 adds JSON as native data type The version 9.3 adds the support functions for JSON JSON is stored as text JSON is parsed and validated on the fly The 9.4 adds JSONB (binary) data type Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 23 / 34
  • 28. JSON JSON - Examples From record to JSON postgres=# SELECT row_to_json(ROW(1,’foo’)); row_to_json --------------------- {"f1":1,"f2":"foo"} (1 row) Expanding JSON into key to value elements postgres=# SELECT * from json_each(’{"a":"foo", "b":"bar"}’); key | value -----+------- a | "foo" b | "bar" (2 rows) Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 24 / 34
  • 29. HSTORE HSTORE is a custom data type used to store key to value items Is an extension Data stored as text A shared library does the magic transforming the datum in HSTORE Is similar to JSON without nested elements Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 25 / 34
  • 30. HSTORE HSTORE - Examples From record to HSTORE postgres=# SELECT hstore(ROW(1,2)); hstore ---------------------- "f1"=>"1", "f2"=>"2" (1 row) HSTORE expansion to key to value elements postgres=# SELECT * FROM each(’a=>1,b=>2’); key | value -----+------- a | 1 b | 2 (2 rows) Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 26 / 34
  • 31. JSON and HSTORE There is a subtile difference between HSTORE and JSON. HSTORE is not a native data type. The JSON is a native data type and the conversion happens inside the postgres process. The HSTORE requires the access to the shared library. Because the conversion from the raw datum happens for each tuple loaded in the shared buffer can affect the performance’s overall. Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 27 / 34
  • 32. JSONB Because JSON is parsed and validated on the fly and and this can be a bottleneck. The new JSONB introduced with PostgreSQL 9.4 is parsed, validated and transformed at insert/update’s time. The access is then faster than the plain JSON but the storage cost can be higher. The functions available for JSON are working with JSONB Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 28 / 34
  • 33. Table of contents 1 The Big 2 The Fast 3 The (NOSQL on) Acid 4 Wrap up Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 29 / 34
  • 34. Wrap up Schema less data are useful. They are flexible and powerful. Ignoring the MVCC implementation lead to disasters. The lack of horizontal scalability in PostgreSQL can be a serious problem. An interesting project for a distributed cluster is PostgreSQL XL - http://www.postgres-xl.org/ Unfortunately at the date of writing is still in RC Never forget PostgreSQL is a RDBMS Get a DBA on board Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 30 / 34
  • 35. Questions Questions? Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 31 / 34
  • 36. Contacts Twitter: 4thdoctor scarf PostgreSQL’s blog: http://www.pgdba.co.uk PostgreSQL Book: http://www.slideshare.net/FedericoCampoli/postgresql-dba-01 Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 32 / 34
  • 37. License and copyright This presentation is licensed under the terms of the Creative Commons Attribution NonCommercial ShareAlike 4.0 http://creativecommons.org/licenses/by-nc-sa/4.0/ The elephant photo is copyright by Caitlin - https://www.flickr.com/photos/lizard queen The cheetah photo is copyright by Hein Waschefort - http://commons.wikimedia.org/wiki/User:Hein waschefort The elephant logos are copyright of the PostgreSQL Global Development Group - http://www.postgresql.org/ Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 33 / 34
  • 38. PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid Federico Campoli 21 May 2015 Federico Campoli PostgreSQL, The Big, The Fast and The (NOSQL on ) Acid 21 May 2015 34 / 34