This is an introduction to PostgreSQL that provides a brief overview of PostgreSQL's architecture, features and ecosystem. It was delivered at NYLUG on Nov 24, 2014.
http://www.meetup.com/nylug-meetings/events/180533472/
2. Who Are We?
● Jim Mlodgenski
– jimm@openscg.com
– @jim_mlodgenski
● Co-organizer of
– NYCPUG - www.nycpug.org
● Director, PgUS
– www.postgresql.us
● CTO, OpenSCG
– www.openscg.com
● Jonathan S. Katz
– jonathan@venuebook.com
– @jkatz05
● Co-organizer of
– NYCPUG - www.nycpug.org
● Director, PgUS
– www.postgresql.us
● CTO, VenueBook
– www.venuebook.com
4. History
● The world’s most advanced open source database
● Designed for extensibility and customization
● ANSI/ISO compliant SQL support
● Actively developed for almost 30 years
– University POSTGRES (1986-1993)
– Postgres95 (1994-1995)
– PostgreSQL (1996-2014)
5. Timeline
“Over the past few years, PostgreSQL has become the preferred open source relational database for many enterprise developers and start-ups, powering leading geospatial and mobile applications.”
– Jeff Barr, Chief Evangelist, Amazon Web Services
7. Technology
● Full Featured Database
– Mature Server Side Programming Functionality
– Hot Standby High Availability
– Online Backups
– Point In Time Recovery
– Table Partitioning
– Spatial Functionality
– Full Text Search
8. Security
● Object Level Privileges assigned to “Roles & Users”
● Row Level Security
● Many Authentication mechanisms
– Kerberos
– LDAP
– PAM
– GSSAPI
● Native SSL Support
● Data Level Encryption (AES, 3DES, etc)
● Ability to utilize 3rd party Key Stores in a full PKI infrastructure
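As a minimal sketch of object-level privileges assigned through roles: the table and role names below (venue_bookings, reporting, app_writer, alice) are hypothetical and only serve the illustration.

```sql
-- Hypothetical table used only for this illustration
CREATE TABLE venue_bookings (
    id        serial PRIMARY KEY,
    venue     text,
    booked_on date
);

-- Group roles that carry privileges but cannot log in themselves
CREATE ROLE reporting NOLOGIN;
CREATE ROLE app_writer NOLOGIN;

-- Read-only access for reporting; read/write access for the application
GRANT SELECT ON venue_bookings TO reporting;
GRANT SELECT, INSERT, UPDATE ON venue_bookings TO app_writer;
-- Inserting into a serial column also needs access to its sequence
GRANT USAGE ON SEQUENCE venue_bookings_id_seq TO app_writer;

-- A login role inherits privileges through membership in a group role
CREATE ROLE alice LOGIN;
GRANT reporting TO alice;
```

Granting to group roles rather than to individual users keeps the privilege list short when people join or leave.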
9. Flexibility
● No Vendor Lock-in
– Compliant with the ANSI SQL standard
– Runs on all major platforms using all major languages and middleware
● “BSD-like” license – PostgreSQL License
– Allows businesses to retain the option of commercializing the final product with minimal legal issues
– No fear of “Open Source Viral Infection”
10. Predictability
● Predictable release cycles
– The average span between major releases over the last 10 years is 13 months
● Quick turnaround on patches
– The average span between minor releases over the last 5 years is 3 months
Version  Release Date
7.3      Nov-02
7.4      Nov-03
8.0      Jan-05
8.1      Nov-05
8.2      Dec-06
8.3      Feb-08
8.4      Jul-09
9.0      Aug-10
9.1      Sep-11
9.2      Sep-12
9.3      Sep-13
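The 13-month figure can be checked from the release table itself. This sketch assumes the first day of each month, since the table gives only month and year; the CTE and column names are made up for the example.

```sql
-- Release months from the table; day-of-month is an assumption (always the 1st)
WITH releases(version, released) AS (
    VALUES ('7.3', date '2002-11-01'),
           ('7.4', date '2003-11-01'),
           ('8.0', date '2005-01-01'),
           ('8.1', date '2005-11-01'),
           ('8.2', date '2006-12-01'),
           ('8.3', date '2008-02-01'),
           ('8.4', date '2009-07-01'),
           ('9.0', date '2010-08-01'),
           ('9.1', date '2011-09-01'),
           ('9.2', date '2012-09-01'),
           ('9.3', date '2013-09-01')
)
-- Months from first to last release, divided by the number of gaps
SELECT (extract(year  FROM age(max(released), min(released))) * 12
      + extract(month FROM age(max(released), min(released))))
       / (count(*) - 1) AS avg_months_between_releases
FROM releases;
-- 130 months over 10 gaps = 13
```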
11. Community
● Strong Open Source Community
● Independent & Thriving Development Community
– 10+ committers and ~200 reviewers
– 1,500 contributors and 10,000+ members
● Millions of downloads per year
● PostgreSQL is a meritocracy
– Influence is earned through the merits (usually technical) of the contributor
13. PostgreSQL Success Stories
“…With PostgreSQL we have been successful in growing the databases as the company has grown, both in number of users and in the complexity of services we offer…”
Hannu Krosing – Database Architect, Skype Technologies.
“We manage multiple terabytes of data in more than 50 unique production PostgreSQL databases.”
Cisco uses PostgreSQL as the embedded database in all its “Carrier Sensitive Routing” (CSR) products to store carrier details, rules, contacts and routes, and to perform call routing.
“…Fujitsu is proud of its sponsorship of contributions to PostgreSQL and of its work with the PostgreSQL community. We are committed to helping make PostgreSQL the leading Database Management System…”
Takayuki Nakazawa – Director Database in Software Group.
14. Database 101
● A database stores data
● Clients ( people or applications ) input data into tables ( relations ) in the database and retrieve data from it
● Relational Database Management Systems are responsible for managing the safe-storage of data
● RDBMSs are designed to store data in an A.C.I.D. compliant way ( all or nothing )
– This is done via transactions
15. Database 101 - (ACID)
● Atomic
– Store data in an 'all-or-nothing' approach
● Consistent
– Give me a consistent picture of the data
● Isolated
– Prevent concurrent data access from causing me woe
● Durable
– When I 'COMMIT;' the data, make sure it is safe until I explicitly destroy it
16. Database 101 - (Transactions)
● All or nothing
● A transaction has
– A Beginning ( BEGIN; )
– Work ( one or more SQL statements, e.g. INSERT / UPDATE / DELETE )
– An Ending, which is one of two cases ( in PostgreSQL, END; is a synonym for COMMIT; )
● COMMIT; ( save everything )
● ROLLBACK; ( undo all changes, save nothing )
– Once the transaction has ended, either ALL of the changes between BEGIN; and COMMIT; have been made or NONE of them ( if there is an error, for example )
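The all-or-nothing behavior can be sketched with a small example. The accounts table and its values are hypothetical, chosen only to show one committed and one rolled-back transaction.

```sql
-- Hypothetical table for the illustration
CREATE TABLE accounts (
    id      integer PRIMARY KEY,
    balance numeric NOT NULL
);
INSERT INTO accounts VALUES (1, 100), (2, 50);

-- Transaction 1: both updates are saved together
BEGIN;
UPDATE accounts SET balance = balance - 30 WHERE id = 1;
UPDATE accounts SET balance = balance + 30 WHERE id = 2;
COMMIT;

-- Transaction 2: the delete is undone, nothing is saved
BEGIN;
DELETE FROM accounts;
ROLLBACK;

-- Both rows are still present, with the balances from transaction 1
SELECT id, balance FROM accounts ORDER BY id;
--  1 | 70
--  2 | 80
```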
17. PostgreSQL 101
● PostgreSQL meets all of the requirements to be a fully
ACID-compliant, transactional database.
● The PostgreSQL server manages a database cluster, aka an instance
– An instance serves one ( and only one ) TCP/IP port
– Contains at least one database
– Has an associated data-directory
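These three properties of an instance can be inspected from any SQL session. A short sketch; depending on the server version and your role, showing data_directory may require elevated privileges.

```sql
-- The single TCP/IP port this instance listens on
SHOW port;

-- The data directory backing this instance (may need elevated privileges)
SHOW data_directory;

-- The databases contained in this instance
SELECT datname FROM pg_database ORDER BY datname;
```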
18. Major Features
● Full network client-server architecture
● ACID compliant
● Transactional ( uses WAL / REDO )
● Partitioning
● Tiered storage via tablespaces
● Multiversion Concurrency Control ( readers don't block writers )
● On-line maintenance operations
● Hot ( read-only ) and Warm ( quick-promote ) standby
● Log-based and trigger based replication
● SSL
● Full-text search
● Procedural languages
– PL/pgSQL plus other, custom languages
19. General Limitations
Limit                     Value
Maximum Database Size     Unlimited
Maximum Table Size        32 TB
Maximum Row Size          1.6 TB
Maximum Field Size        1 GB
Maximum Rows / Table      Unlimited
Maximum Columns / Table   250-1600
Maximum Indexes / Table   Unlimited
26. Data Types
● Building blocks of a schema
● Optimized on-disk format for a specific type of data
● PostgreSQL provides:
– Wide array (no pun intended) of basic to complex data types
– Functional interfaces for ease of manipulation
– Ability to extend and create custom data types
27. Number Types
Name              Storage Size  Range
smallint          2 bytes       -32768 to +32767
integer           4 bytes       -2147483648 to +2147483647
bigint            8 bytes       -9223372036854775808 to +9223372036854775807
decimal           variable      up to 131072 digits before the decimal point; up to 16383 digits after the decimal point
numeric           variable      up to 131072 digits before the decimal point; up to 16383 digits after the decimal point
real              4 bytes       6 decimal digits precision
double precision  8 bytes       15 decimal digits precision
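A few one-line queries illustrate the practical difference between the exact and the floating-point types, and what happens at the edge of an integer range. This is a sketch to try in psql, not part of the original slides.

```sql
-- numeric is exact: decimal fractions add up as expected
SELECT 0.1::numeric + 0.2::numeric = 0.3::numeric;                    -- true

-- double precision is binary floating point: the same sum is slightly off
SELECT 0.1::double precision + 0.2::double precision
     = 0.3::double precision;                                         -- false

-- integer types raise an error instead of silently wrapping around
SELECT 32767::smallint + 1::smallint;   -- ERROR:  smallint out of range
SELECT 2147483647 + 1;                  -- ERROR:  integer out of range
```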
28. Character Types
Name Description
varchar(n)  variable-length with limit
char(n)     fixed-length, blank padded
text        variable unlimited length
29. Date/Time Types
Name                         Size      Range                             Resolution
timestamp without time zone  8 bytes   4713 BC to 294276 AD              1 microsecond / 14 digits
timestamp with time zone     8 bytes   4713 BC to 294276 AD              1 microsecond / 14 digits
date                         4 bytes   4713 BC to 5874897 AD             1 day
time without time zone       8 bytes   00:00:00 to 24:00:00              1 microsecond / 14 digits
time with time zone          12 bytes  00:00:00+1459 to 24:00:00-1459    1 microsecond / 14 digits
interval                     12 bytes  -178000000 years to 178000000 years  1 microsecond / 14 digits
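A brief sketch of how these types combine in practice. The dates are arbitrary examples (the date of this talk), and the time zone name assumes the server has the standard time zone database available.

```sql
-- Convert a New York wall-clock time to UTC (EST is UTC-5 in November)
SELECT timestamptz '2014-11-24 18:30 America/New_York' AT TIME ZONE 'UTC';
-- 2014-11-24 23:30:00

-- date + interval yields a timestamp
SELECT date '2014-11-24' + interval '1 month';        -- 2014-12-24 00:00:00

-- date - date yields a whole number of days
SELECT date '2014-12-25' - date '2014-11-24';         -- 31

-- timestamp - timestamp yields an interval
SELECT timestamp '2014-11-24 18:30' - timestamp '2014-11-24 17:00';  -- 01:30:00
```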
30. Specialized Types
Name         Storage Size                               Range
boolean      1 byte                                     false to true
smallserial  2 bytes                                    1 to 32767
serial       4 bytes                                    1 to 2147483647
bigserial    8 bytes                                    1 to 9223372036854775807
bytea        1 to 4 bytes plus size of binary string    variable-length binary string
cidr         7 or 19 bytes                              IPv4 or IPv6 networks
inet         7 or 19 bytes                              IPv4 or IPv6 hosts or networks
macaddr      6 bytes                                    MAC addresses
uuid         16 bytes                                   Universally Unique Identifiers
31. “Schema-less” Types
Name Description
xml     stores XML data and checks the input values for well-formedness
hstore  stores sets of key/value pairs
json    stores an exact copy of the input JSON document
jsonb   stores a decomposed binary format of the input JSON document
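The difference between json and jsonb is easiest to see by casting the same text to both. A small sketch, assuming PostgreSQL 9.4 or later for jsonb; the sample documents are invented.

```sql
-- json keeps the input text as-is: spacing, key order, duplicate keys
SELECT '{"b": 1,   "a": 2, "a": 3}'::json;
-- {"b": 1,   "a": 2, "a": 3}

-- jsonb parses it: keys are reordered and the last duplicate wins
SELECT '{"b": 1,   "a": 2, "a": 3}'::jsonb;
-- {"a": 3, "b": 1}

-- Extract a field as text
SELECT '{"name": "NYLUG", "tags": ["linux", "nyc"]}'::jsonb ->> 'name';
-- NYLUG

-- Containment test, which jsonb can support with a GIN index
SELECT '{"name": "NYLUG", "tags": ["linux", "nyc"]}'::jsonb @> '{"tags": ["nyc"]}';
-- true
```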
32. Range Types
● Represents a range of an element type
– Integers
– Numerics
– Times
– Dates
– And more...
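A few standalone queries sketch how the built-in range constructors and operators behave. Ranges built this way include the lower bound and exclude the upper bound by default; the values are arbitrary.

```sql
-- Containment: is the element inside the range?
SELECT int4range(10, 20) @> 15;     -- true
SELECT int4range(10, 20) @> 20;     -- false (upper bound is exclusive)

-- Overlap: do two ranges share any point?
SELECT numrange(1.5, 2.5) && numrange(2.0, 3.0);          -- true
SELECT daterange('2012-03-12', '2012-03-17')
    && daterange('2012-03-17', '2012-03-18');             -- false (adjacent only)

-- Intersection of two timestamp ranges
SELECT tsrange('2014-11-24 18:30', '2014-11-24 21:00')
     * tsrange('2014-11-24 20:00', '2014-11-24 22:00');
-- ["2014-11-24 20:00:00","2014-11-24 21:00:00")
```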
33. Range Types
CREATE TABLE travel_log (
    id serial PRIMARY KEY,
    name varchar(255),
    trip_range daterange,
    -- reject any row whose trip_range overlaps (&&) an existing row's
    EXCLUDE USING gist (trip_range WITH &&)
);
INSERT INTO travel_log (name, trip_range) VALUES ('Chicago', daterange('2012-03-12', '2012-03-17'));
INSERT INTO travel_log (name, trip_range) VALUES ('Austin', daterange('2012-03-16', '2012-03-18'));
ERROR: conflicting key value violates exclusion constraint "travel_log_trip_range_excl"
DETAIL: Key (trip_range)=([2012-03-16,2012-03-18)) conflicts with existing key (trip_range)=([2012-03-12,2012-03-17)).
34. Indexes
● Enhances database performance
● Enforces some types of constraints
– Uniqueness
– Exclusion
35. Index Types
● B-Tree
● Generalized Inverted Index (GIN)
● Generalized Search Tree (GiST)
● Space-Partitioned Generalized Search Tree (SP-GiST)
Coming Soon...
● Block Range Index (BRIN)
● “VODKA”
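Choosing an index type is done with the USING clause of CREATE INDEX. The sketch below uses a hypothetical docs table; each index relies on the default operator class for its column type.

```sql
-- Hypothetical table for the illustration
CREATE TABLE docs (
    id     serial PRIMARY KEY,   -- PRIMARY KEY is enforced by a B-tree index
    title  text,
    tags   text[],
    during tsrange
);

-- B-tree (the default): equality and ordering comparisons
CREATE INDEX docs_title_idx ON docs USING btree (title);

-- GIN: values with many components, such as arrays
CREATE INDEX docs_tags_idx ON docs USING gin (tags);

-- GiST: overlap and containment queries, e.g. on range types
CREATE INDEX docs_during_idx ON docs USING gist (during);
```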
36. Procedural Languages
● Allows for user-defined functionality to be run within the database
– Used as functions or triggers
● Frequent use cases
– Enhance performance
– Increase security
– Centralize business logic
37. Procedural Language Types
● PL/pgSQL
● PL/Perl
● PL/Tcl
● PL/Python
● More available through extensions...
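As a minimal sketch of a PL/pgSQL function: the function name add_tax and its parameters are hypothetical, chosen only to show the shape of a definition and a call.

```sql
-- Hypothetical function: returns the amount with tax added, rounded to cents
CREATE FUNCTION add_tax(amount numeric, rate numeric)
RETURNS numeric AS $$
BEGIN
    RETURN round(amount * (1 + rate), 2);
END;
$$ LANGUAGE plpgsql IMMUTABLE;

-- Called like any built-in function
SELECT add_tax(100, 0.08875);   -- 108.88
```

The same function body could be attached to a table as a trigger or written in another installed procedural language.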
38. Extensions
● Additional modules that can be plugged into PostgreSQL
● Can be used to add a ton of useful features
– Procedural Languages
– Data Types
– Administration Tools
– Foreign Data Wrappers
● Many found in contrib
● Also www.pgxn.org
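Installing an extension is a single statement once its files are present on the server. This sketch assumes the contrib package that ships hstore is installed; the key/value data is invented.

```sql
-- List extensions available to this server and which ones are installed
SELECT name, default_version, installed_version
FROM pg_available_extensions
ORDER BY name;

-- Install hstore into the current database (assumes contrib is available)
CREATE EXTENSION IF NOT EXISTS hstore;

-- The new data type and its operators are usable right away
SELECT 'os=>linux, db=>postgresql'::hstore -> 'db';   -- postgresql
```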
40. Data Type Extensions
● Hstore
● Case Insensitive Text (citext)
● International Product Numbering Standards (ISN)
● PostGIS (geometry)
● BioPostgres
● SSN
● Email
41. PostGIS
● PostGIS adds Open Geospatial Consortium (OGC) compliant geometry data types and functions to PostgreSQL
● Combined with PostgreSQL, it becomes a best-of-breed spatial and raster database
43. What are Foreign Data Wrappers?
● Used with SQL/MED
– New ANSI SQL:2003 Extension
– Management of External Data
– Standard way of handling remote objects in SQL databases
● Wrappers used by SQL/MED to access remote data sources
● Makes external data sources look like a PostgreSQL table
45. MongoDB FDW
CREATE SERVER mongo_server FOREIGN DATA WRAPPER
mongo_fdw OPTIONS (address '192.168.122.47', port '27017');
CREATE FOREIGN TABLE databases (
_id NAME,
name TEXT
)
SERVER mongo_server
OPTIONS (database 'mydb', collection 'pgData');
test=# select * from databases ;
_id | name
--------------------------+------------
52fd49bfba3ae4ea54afc459 | mongo
52fd49bfba3ae4ea54afc45a | postgresql
52fd49bfba3ae4ea54afc45b | oracle
52fd49bfba3ae4ea54afc45c | mysql
52fd49bfba3ae4ea54afc45d | redis
52fd49bfba3ae4ea54afc45e | db2
(6 rows)
46. WWW FDW
test=# SELECT * FROM www_fdw_geocoder_google
test-# WHERE address = '731 Lexington Ave, New York, NY';
-[ RECORD 1 ]-----+----------------------------------------------
address |
type | street_address
formatted_address | 731 Lexington Avenue, New York, NY 10022, USA
lat | 40.7619363
lng | -73.9681017
location_type | ROOFTOP
47. PL/Proxy
● Developed by Skype
● Allows for scalability and parallelization
● Uses procedural languages and FDWs
48. PostgreSQL Replication
● Replicate to read-only databases using native streaming replication
● All writes go to a master server
● Load balance across the pool of servers
49. PostgreSQL Scalability
● PostgreSQL scales linearly up to 64 cores
● May scale further, but larger hardware is not available to the community
http://rhaas.blogspot.com/2012/04/did-i-say-32-cores-how-about-64.html
50. Getting Help
● Community Mailing Lists
– http://www.postgresql.org/list/
● IRC
– irc://irc.freenode.net/postgresql
● NYC PostgreSQL User Group
– http://www.nycpug.org