2. About @jkshah
✓ appOrbit
• Focus on data management of applications running in containers
✓ VMware
• Led and managed the Postgres and Data Management teams at VMware for various products embedding PostgreSQL in virtualized instances
✓ Sun Microsystems
• Team member of the first published SPECjAppServer 2004 benchmark with PostgreSQL
• Performance of PostgreSQL on Solaris/Sun servers
✓ Working with the PostgreSQL community since 2005
• http://jkshah.blogspot.com/2005/04/profiling-postgresql-using-dtrace-on_22.html
✓ Working with container technologies (Solaris Zones) since 2004
• http://jkshah.blogspot.com/2004/08/db2-working-under-solaris-10-zones_30.html
4. What are Containers?
✓ OS-level virtualization where the kernel allows for multiple isolated user-space instances
[Diagram: four deployment models side by side - an OS on a bare-metal server; multiple OS instances on a hypervisor on a bare-metal server; containers (C) sharing one OS on a bare-metal server; and containers inside an OS on a hypervisor]
6. Disadvantages of Containers
✓ Must share the same kernel version as the host
✓ Cannot run other operating systems natively
✓ Security (still being improved)
✓ Not a complete solution for enterprise needs
7. Where to Use Containers?
✓ Recreating identical environments (cookie-cutter)
✓ Resource grouping of specific processes on a heavily loaded server
✓ Handling multiple versions of software applications
✓ Ephemeral application instances (dev/test)
✓ Production instances (growing every day)
✓ Many more
9. Docker
✓ Quick guide to using a Docker-based container
# docker run --name mycontainer -e POSTGRES_PASSWORD=mysecretpassword -d postgres
# docker exec -ti mycontainer psql -U postgres
# docker stop mycontainer
# docker rm mycontainer
# docker rmi postgres
10. Container Volumes
ü Persists beyond the life of a Docker container
• VOLUME command in Dockerfile or
• Using –v using docker run command
• Automatically created if not already present during docker run
• Not part of docker push/pull operations
• Can select a non-local directory using --volume-driver
• Third party components required to get multi-host support (NFS, etc )
ü Different options using –v
• -v /hostsrc/data:/opt/data:ro # for read only volumes (default rw)
• -v /hostsrc/data:/opt/data:Z # Z – private volume, z – shared volume
• -v /etc/nginx.conf:/etc/nginx.conf # for mounting a single file only
ü Volumes can be shared from another container using --volumes-from on
same host
ü Starting from docker 1.9 gives first class status to Docker Volumes
11. PostgreSQL Container as a DB Server
✓ Maybe you want a standalone database server
• Not all database clients will be on the same host
• Need to limit memory usage
• Need a different layout of how files are distributed
✓ Use the -p option to make the port available even to non-container clients
✓ Use -m to limit memory usage by the DB server (by default it can see and use all host memory)
• Note: this does not set shared_buffers automatically with the library image
docker run --name mycontainer -m 4g -e POSTGRES_PASSWORD=mysecretpassword -v /hostpath/pgdata:/var/lib/postgresql/data -p 5432:5432 -d postgres
12. PostgreSQL in an Enterprise Environment
✓ For real production use, however, we would need
• Bigger shared memory configurations
• A different layout of how files are distributed
• The ability to back up the database
• The ability to set up replication
• etc.
✓ In short, we need a more customized PostgreSQL image
13. Best Practices for a Custom Image
✓ For production installs, customize the Docker image
• Allocate proper memory limits - for example, 8GB
• All page-cache usage shows up as container memory usage
• Bump up shared_buffers and other parameters as required
• Hint: use PostgreSQL 9.3 or later, otherwise you have to run privileged containers
• http://jkshah.blogspot.com/2015/09/is-it-privilege-to-run-container-in.html
• Support multiple volumes in your image
• PITR archives
• Full backup directory
• PostgreSQL extensions
• Set up replication support
• Out-of-the-box replication setup
• Monitoring tool
• Your favorite monitoring agent
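The checklist above could be sketched as a Dockerfile; the base tag, paths, and file names here are illustrative assumptions, not from the talk:

```dockerfile
# Hypothetical custom image following the best-practices checklist
FROM postgres:9.6
# Separate volumes for PITR archives and full backups (paths illustrative)
VOLUME ["/var/lib/postgresql/archive", "/var/lib/postgresql/backup"]
# Bake in tuned settings (shared_buffers etc.) and replication bootstrap
COPY postgresql.conf /etc/postgresql/postgresql.conf
COPY setup-replication.sh /docker-entrypoint-initdb.d/
CMD ["postgres", "-c", "config_file=/etc/postgresql/postgresql.conf"]
```

Remember to run the resulting image with matching -m and -v options so the memory limit and volumes line up with the settings baked into the image.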
15. Planning a High Availability Strategy
✓ Requirements
• Recovery Time Objective (RTO)
• What does 99.99% availability really mean?
• Recovery Point Objective (RPO)
• Zero data loss?
• HA vs. DR requirements
✓ Evaluating a technology
• What's the cost of implementing the technology?
• What's the complexity of implementing and managing the technology?
• What's the downtime potential?
• What's the data-loss exposure?
Availability %           Downtime/Year    Downtime/Month*  Downtime/Week
"Two Nines"   - 99%      3.65 days        7.2 hours        1.69 hours
"Three Nines" - 99.9%    8.76 hours       43.2 minutes     10.1 minutes
"Four Nines"  - 99.99%   52.56 minutes    4.32 minutes     1.01 minutes
"Five Nines"  - 99.999%  5.26 minutes     25.9 seconds     6.06 seconds
* Using a 30-day month
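The table's figures follow from simple arithmetic; a small sketch to reproduce them (the function name is mine):

```python
# Recompute the availability table: allowed downtime per 365-day year,
# per 30-day month (as the slide uses), and per week.
def downtime(availability_pct):
    frac = 1 - availability_pct / 100.0  # fraction of time unavailable
    per_min = lambda days: days * 24 * 60 * frac
    return {
        "per_year_min": per_min(365),
        "per_month_min": per_min(30),
        "per_week_min": per_min(7),
    }

d = downtime(99.99)  # "Four Nines"
print(round(d["per_year_min"], 2))   # 52.56 minutes / year
print(round(d["per_month_min"], 2))  # 4.32 minutes / month
print(round(d["per_week_min"], 2))   # 1.01 minutes / week
```

Note how each extra "nine" cuts the allowed downtime by a factor of ten, which is why four and five nines usually require automated failover rather than a human paging workflow.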
16. Simplified View of HA PostgreSQL
✓ Easy to set up
✓ Handles infrastructure problems
✓ Exploits storage features
✓ Exploits replication features
[Diagram: applications connect through a DNS name to a PostgreSQL instance somewhere in a cloud/data center]
17. Causes of Downtime
✓ Planned downtime
• Software upgrades (OS patches, SQL Server cumulative updates)
• Hardware/BIOS upgrades
✓ Unplanned downtime
• Data center failure (natural disasters, fire)
• Server failure (failed CPU, bad network card)
• I/O subsystem failure (disk failure, controller failure)
• Software/data corruption (application bugs, OS binary corruption)
• User error (shut down a SQL service, dropped a table)
18. Typical Plan of Action
✓ Minimize the reasons that lead to downtime
✓ Faster recovery time (balanced checkpoints)
✓ Proxies for fast switching between the production and DR copies
✓ Shared storage for HA
✓ PostgreSQL synchronous replication to go beyond that
19. HA PostgreSQL with Shared Storage
✓ Ability to leverage hardware snapshots/restore
✓ Automated failover using OS clustering software
✓ Block-level replication for DR
✓ Distributed shared storage getting popular
[Diagram: applications reach the Site 1 database nodes through a virtual IP, DNS, pgPool, or pgBouncer; the nodes sit on shared storage]
20. PostgreSQL Replication
✓ Single master, multiple slaves
✓ Cascading slaves also possible
✓ Mechanism based on WAL (Write-Ahead Log)
✓ Multiple modes and multiple recovery paths
• Warm standby
• Asynchronous hot standby
• Synchronous hot standby
✓ Slaves can optionally serve read operations
• Good for read scaling
✓ Node failover and reconnection possible
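The modes above map to a handful of settings; a minimal sketch for the 9.x era the talk covers, with illustrative host and role names (leaving synchronous_standby_names empty gives asynchronous replication; a warm standby is simply a standby without hot_standby enabled):

```conf
# postgresql.conf on the master (values illustrative)
wal_level = hot_standby              # generate WAL usable by standbys
max_wal_senders = 5
synchronous_standby_names = 'standby1'   # empty => asynchronous mode

# postgresql.conf on a standby
hot_standby = on                     # allow read-only queries

# recovery.conf on a standby (pre-PostgreSQL 10 style)
standby_mode = 'on'
primary_conninfo = 'host=master port=5432 user=replicator application_name=standby1'
```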
21. HA PostgreSQL with Sync Replication
✓ Synchronous replication within the data center
✓ Low downtime (lower than shared-storage HA)
✓ Automated failover for hardware issues, including storage
[Diagram: applications reach the Site 1 nodes through a virtual IP, DNS, pgPool, or pgBouncer; the nodes replicate synchronously]
22. PostgreSQL Replication
✓ In-core replication replicates well, but:
• No automated failover
• "Failback" (pg_rewind - thank goodness)
• Load-balanced IP address is up to you
• Bring your own proxy (HAProxy? pgbouncer? pgpool?)
• No way to preserve connections
23. Just PostgreSQL?
✓ More projects needed
• pgPool-II / HAProxy / pgbouncer
• repmgr, etc.
✓ Some customers currently prefer cloud DBaaS
• Heroku
• Amazon RDS
✓ Some end up preferring an enterprise version of DBaaS
• appOrbit :)
24. Modern HA Projects
✓ Patroni / Governor
• https://github.com/zalando/patroni (Python)
• Docker container
• etcd
• HAProxy
✓ Stolon
• https://github.com/sorintlab/stolon (Golang)
• Docker
• etcd / Consul
• Custom proxy
27. Basic Container-Based HA Architecture
✓ Need a distributed store for configuration status
• Consul
• ZooKeeper
• etcd
✓ PostgreSQL cluster peers (self-managing)
• Each peer determines its local instance status and updates the configuration store
• The master regularly updates its status; failing that, it is considered failed
• If the master fails, an election based on least lag picks a new leader, who takes over
• The other standbys now follow the new master
• Potentially a third party can even re-provision the dead master as a slave
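The election logic above can be sketched without any real cluster. This is a toy simulation under stated assumptions: the `Store` class stands in for etcd/Consul/ZooKeeper (a key with a TTL models the master's lease), and the lag values are made up; it is not Patroni's or Stolon's actual implementation.

```python
# Toy model of lease-based leader election for a PostgreSQL cluster.
class Store:
    """In-memory stand-in for a distributed KV store with TTLs."""
    def __init__(self):
        self.data = {}  # key -> (value, expires_at)

    def put(self, key, value, ttl, now):
        self.data[key] = (value, now + ttl)

    def get(self, key, now):
        item = self.data.get(key)
        if item is None or item[1] <= now:
            return None  # missing, or the lease expired
        return item[0]

def elect(store, standbys, now):
    """If the master's lease has expired, promote the standby with the
    least replication lag and record it as the new master."""
    if store.get("master", now) is not None:
        return None  # master still heartbeating; nothing to do
    winner = min(standbys, key=lambda s: s["lag_bytes"])
    store.put("master", winner["name"], ttl=5, now=now)
    return winner["name"]

store = Store()
store.put("master", "pg1", ttl=5, now=0)   # pg1 heartbeats at t=0
standbys = [{"name": "pg2", "lag_bytes": 1024},
            {"name": "pg3", "lag_bytes": 16}]

print(elect(store, standbys, now=3))   # lease still valid -> None
print(elect(store, standbys, now=10))  # lease expired -> pg3 (least lag)
```

Real systems layer consensus, fencing, and pg_rewind on top of this basic loop, but the TTL-key-plus-least-lag-election core is the same idea.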
28. Some New Trends in the Container World
✓ Binaries and data often separated
• One lives in the container image, the other in volumes
✓ pg_xlog no longer deployed on separate volumes
• Underlying storage technologies lead to inconsistent point-in-time restores, leaving the DB unusable
✓ No new tablespaces
• Hard to get easy replication setups done on the fly
• Could lead to lost data if new tablespaces are not on volumes
✓ Replication set up by automation rather than manually by admins
29. Some New Trends in the Container World
✓ Adoption of microservices
• Leading to lots of smaller databases, one per microservice
✓ Faster updates
• Schema changes sometimes need to be backward compatible
✓ Repeatable deployments
• Need to redeploy at a moment's notice
30. Deployment of a PostgreSQL "Cluster"
✓ Can be made self-healing
✓ Integrate with pg_rewind to reuse the old master as a slave
✓ Integrate with shared storage to leverage snapshots when creating new slaves
[Diagram: applications connect through a virtual IP to Instances 1-3 of a PostgreSQL cluster backed by shared storage]
31. But Wait, I Have Multiple DB Servers
✓ I need my clusters to grow dynamically (read scaling)
✓ I also want things to self-heal as much as possible
32. Kubernetes
✓ Production-grade container orchestrator
✓ Horizontal scaling
• Set up rules to scale slaves
✓ ConfigMap
• postgresql.conf
• pg_hba.conf
✓ Secrets
• Usernames and passwords
• Certificates
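A ConfigMap carrying the two files above might look like this; the object name and the settings inside are illustrative, not from the talk:

```yaml
# Hypothetical ConfigMap holding PostgreSQL configuration files,
# mountable into the container as a volume or projected files.
apiVersion: v1
kind: ConfigMap
metadata:
  name: pg-config
data:
  postgresql.conf: |
    shared_buffers = 2GB
    max_connections = 200
  pg_hba.conf: |
    host all all 0.0.0.0/0 md5
```

Credentials and certificates would go into a Secret object of the same shape rather than a ConfigMap.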
33. Kubernetes
✓ Persistent-storage features evolving
• Plugins for storage drivers
✓ External services
• Services are accessible from all nodes
• Shared-storage plugins make your stateful containers HA too
• A powerful combination along with PostgreSQL replication
• Can spin up fast slaves for multi-TB databases
35. But Wait... Need to Support Multi-Geo
✓ It could be a DR strategy
✓ It could be a compliance requirement
✓ Service discovery is now getting complicated
36. Consul
• Service Discovery
• Failure Detection
• Multi Data Center
• DNS Query Interface
{
  "service": {
    "name": "mypostgresql",
    "tags": ["master"],
    "address": "127.0.0.1",
    "port": 5432,
    "enableTagOverride": false
  }
}
nslookup master.mypostgresql.service.domain
nslookup mypostgresql.service.domain
37. Service Discovery
✓ Uniform DNS name for your database
✓ Cloud-agnostic naming
✓ Certificates created using DNS names you own
✓ No single point of failure
38. PostgreSQL Enhancement
✓ SRV records from the nameserver
• https://en.wikipedia.org/wiki/SRV_record
• Provide IP:port
✓ PostgreSQL libpq client enhancement
• Support service discovery using SRV records
• A service name is passed
• libpq looks up the SRV record from the nameserver
• Connects to the port provided by the SRV record
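The lookup flow above can be sketched in a few lines. This is a toy illustration, not the proposed libpq patch: the record text is a made-up example, and a real client would query a nameserver via a resolver rather than parse a string.

```python
# Sketch of SRV-based service discovery for a database client.
def parse_srv(record):
    """Parse an SRV answer of the form 'priority weight port target'
    (RFC 2782) into its fields."""
    priority, weight, port, target = record.split()
    return {"priority": int(priority), "weight": int(weight),
            "port": int(port), "target": target.rstrip(".")}

# Hypothetical answer for _postgresql._tcp.mypostgresql.service.domain
rec = parse_srv("10 60 5432 pg1.mypostgresql.service.domain.")
conninfo = "host=%s port=%d" % (rec["target"], rec["port"])
print(conninfo)  # host=pg1.mypostgresql.service.domain port=5432
```

With multiple answers, a client would sort by priority and weight before picking a target, which is how SRV records also give you basic failover ordering for free.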
39. Summary
✓ PostgreSQL "cluster" deployments are the wave of change
✓ Containers are one of the technologies, but not the whole solution
40. Your Feedback is Important!
✓ We'd like to understand your use of Postgres for HA/DR.
✓ If interested:
• Twitter: @jkshah
• Email: jignesh@apporbit.com
41. Thanks.
Questions?
Follow me on Twitter: @jkshah
Blog: http://jkshah.blogspot.com
Full copies of your applications at the push of a button
We are HIRING!!!