SlideShare una empresa de Scribd logo
1 de 33
Descargar para leer sin conexión
Big Bad PostgreSQL: A Case Study


                 Moving a
                 “large,”
          “complicated,” and
           mission-critical
           datawarehouse
              from Oracle
            to PostgreSQL
           for cost control.



    1
Monday, August 2, 2010
Who am I? @postwait on twitter


                         Author of “Scalable Internet Architectures”
                         Pearson, ISBN: 067232699X

                         CEO of OmniTI
                         We build scalable and secure web applications

                         I am an Engineer
                         A practitioner of academic computing.
                         IEEE member and Senior ACM member.
                         On the Editorial Board of ACM’s Queue magazine.

                         I work on/with a lot of Open Source software:
                         Apache, perl, Linux, Solaris, PostgreSQL,
                         Varnish, Spread, Reconnoiter, etc.

                         I have experience.
                         I’ve had the unique opportunity to watch a great many catastrophes.
                         I enjoy immersing myself in the pathology of architecture failures.




Monday, August 2, 2010
Overall Architecture


                                                    Oracle 8i
                                                                                          OLTP instance:
                                          0.5 TB              0.25 TB
                                                                                          drives the site
                                          Hitachi              JBOD


                                                                  OLTP




       Log import and                                                                           Oracle 8i




       processing
                                                              Oracle 8i

                                                                                                    0.75 TB
                                                                                                     JBOD
                           MySQL
                         log importer               0.5 TB                1.5 TB
                                                    Hitachi                MTI              OLTP warm backup


                                                                                                               Warm spare
                               1.2 TB
                               SATA                             Datawarehouse
                               RAID


                           Log Importer


                                                                           MySQL 4.1




                                                                                 1.2 TB
                                                                               IDE RAID


                                                                            Data Exporter




                                 bulk selects / data exports
Monday, August 2, 2010
Database Situation

         •      The problems:
               •  The database is growing.
               •  The OLTP and ODS/warehouse are too slow.
               •  A lot of application code against the OLTP system.
               •  Minimal application code against the ODS system.
         •      Oracle:
               •  Licensed per processor.
               •  Really, really, really expensive on a large scale.
         •      PostgreSQL:
               •  No licensing costs.
               •  Good support for complex queries.


Monday, August 2, 2010
Database Choices



                  •      Must keep Oracle on OLTP
                    •      Complex, Oracle-specific web application.
                    •      Need more processors.
                  •      ODS: Oracle not required.
                    •      Complex queries from limited sources.
                    •      Needs more space and power.
                  •      Result:
                    •      Move ODS Oracle licenses to OLTP
                    •      Run PostgreSQL on ODS




Monday, August 2, 2010
PostgreSQL gotchas



                    •    For an OLTP system that does thousands of
                         updates per second, vacuuming is a hassle.

                    •    No upgrades?!

                         •   pg_migrator is coming along.

                    •    Less community experience with large
                         databases.

                    •    Replication features less evolved

                         •   though PostgreSQL 9.0 ups the ante



Monday, August 2, 2010
PostgreSQL ♥ ODS




                  •      Mostly inserts.

                  •      Updates/Deletes controlled, not real-time.

                  •      pl/perl (leverage DBI/DBD for remote
                         database connectivity).

                  •      Monster queries.

                  •      Extensible.




Monday, August 2, 2010
Choosing Linux



                    •    Popular, liked, good community support.

                    •    Chronic problems:

                         •   kernel panics

                         •   filesystems remounting read-only

                         •   filesystems don’t support snapshots

                         •   LVM is clunky on enterprise storage

                         •   20 outages in 4 months



Monday, August 2, 2010
Choosing Solaris 10

                    •    Switched to Solaris 10

                         •   No crashes, better system-level tools.

                             •   prstat, iostat, vmstat, smf, fault-
                                 management.

                         •   ZFS

                             •   snapshots (persistent), BLI backups.

                         •   Excellent support for enterprise storage.

                         •   DTrace.

                         •   Free (too).


Monday, August 2, 2010
Oracle features we need




                    •    Partitioning

                    •    Statistics and Aggregations

                         •   rank over partition, lead, lag, etc.

                    •    Large selects (100GB)

                    •    Autonomous transactions

                    •    Replication from Oracle (to Oracle)




Monday, August 2, 2010
Partitioning



                 For large data sets:
                   pgods=# select count(1) from ods.ods_tblpick_super;
                      count
                   ------------
                    3247286017
                   (1 row)




                    • Next biggest tables: billions
                    • Allows us to cluster data over specific ranges
                      (by date in our case)
                    • Simple, cheap archiving and removal of data.
                    • Can put ranges used less often in different
                         tablespaces (slower, cheaper storage)


Monday, August 2, 2010
Partitioning PostgreSQL style



           •       PostgreSQL doesn’t support partition...

           •       It supports inheritance... (what’s this?)

                 •       some crazy object-relation paradigm.

           •       We can use it to implement partitioning:

                 •       One master table with no rows.

                 •       Child tables that have our partition constraints.

                 •       Rules on the master table for insert/update/delete.



Monday, August 2, 2010
Partitioning PostgreSQL realized

             •       Cheaply add new empty partitions

             •       Cheaply remove old partitions

             •       Migrate less-often-accessed partitions to slower
                     storage

             •       Different indexes strategies per partition

             •       PostgreSQL >8.1 supports constraint checking on
                     inherited tables.
                   •     smarter planning

                   •     smarter executing



Monday, August 2, 2010
RANK OVER PARTITION



                    • In Oracle:
              select userid, email from (
              !   !    select u.userid, u.email,
              !   !    row_number() over
                               (partition by u.email order by userid desc) as position
              !   !    from (...)) where position = 1



                    • In PostgreSQL:
              FOR v_row IN select u.userid, u.email from (...) order by email, userid desc
              LOOP
              !    IF v_row.email != v_last_email THEN
              !    !   RETURN NEXT v_row;
              !    !   v_last_email := v_row.email;
              !    !   v_rownum := v_rownum + 1;
              !    END IF;
              END LOOP;

                                 With 8.4, we have windowing functions

Monday, August 2, 2010
Large SELECTs


                 • Application code does:
              select u.*, b.browser, m.lastmess
                from ods.ods_users u,
                     ods.ods_browsers b,
                     ( select userid, min(senddate) as senddate
                          from ods.ods_maillog
                      group by userid ) m,
                     ods.ods_maillog l
               where u.userid = b.userid
                 and u.userid = m.userid
                 and u.userid = l.userid
                 and l.senddate = m.senddate;




                    •    The width of these rows is about 2k

                    •    50 million row return set

                    •    > 100 GB of data

Monday, August 2, 2010
The Large SELECT Problem


                 •       libpq will buffer the entire result in memory.

                         •   This affects language bindings (DBD::Pg).

                         •   This is an utterly deficient default behavior.

                 •       This can be avoided by using cursors

                         •   Requires the app to be PostgreSQL specific.

                         •   You open a cursor.

                         •   Then FETCH the row count you desire.




Monday, August 2, 2010
Big SELECTs the Postgres way



           The previous “big” query becomes:
              DECLARE CURSOR bigdump FOR
              select u.*, b.browser, m.lastmess
                from ods.ods_users u,
                     ods.ods_browsers b,
                     ( select userid, min(senddate) as senddate
                          from ods.ods_maillog
                      group by userid ) m,
                     ods.ods_maillog l
               where u.userid = b.userid
                 and u.userid = m.userid
                 and u.userid = l.userid
                 and l.senddate = m.senddate;


           Then, in a loop:
              FETCH FORWARD 10000 FROM bigdump;




Monday, August 2, 2010
Autonomous Transactions



         • In Oracle we have over 2000 custom stored procedures.
         • During these procedures, we like to:
           • COMMIT incrementally
                    Useful for long transactions (update/delete) that
                    need not be atomic -- incremental COMMITs.

               • start a new top-level txn that can COMMIT
                    Useful for logging progress in a stored procedure so
                    that you know how far you progessed and how long
                    each step took even if it rolls back.



Monday, August 2, 2010
PostgreSQL shortcoming




                    •    PostgreSQL simply does not support
                         Autonomous transactions and to quote core
                         developers “that would be hard.”

                    •    When in doubt, use brute force.

                    •    Use pl/perl to use DBD::Pg to connect to
                         ourselves (a new backend) and execute a new
                         top-level transaction.




Monday, August 2, 2010
Replication


             • Cross vendor database replication isn’t too difficult.
             • Helps a lot when you can do it inside the database.
             • Using dbi-link (based on pl/perl and DBI) we can.
               • We can connect to any remote database.
               • INSERT into local tables directly from remote
                         SELECT statements.
                         [snapshots]
                   • LOOP over remote SELECT statements and
                         process them row-by-row.
                         [replaying remote DML logs]


Monday, August 2, 2010
Replication (really)



          •      Through a combination of snapshotting and DML
                 replay we:

                •        replicate over into over 2000 tables in PostgreSQL
                         from Oracle

                     •     snapshot replication of 200

                     •     DML replay logs for 1800

          •      PostgreSQL to Oracle is a bit harder

                •        out-of-band export and imports



Monday, August 2, 2010
New Architecture


            •      Master: Sun v890 and Hitachi AMS + warm standby
                     running Oracle
                     (1TB)

            •      Logs: several customs
                     running MySQL instances
                     (2TB each)

            •      ODS BI: 2x Sun v40
                     running PostgreSQL 8.3
                     (6TB on Sun JBODs on ZFS each)

            •      ODS archive: 2x custom
                     running PostgreSQL 8.3
                     (14TB internal storage on ZFS each)




Monday, August 2, 2010
PostgreSQL is Lacking




             •      pg_dump is too intrusive.

             •      Poor system-level instrumentation.

             •      Poor methods to determine specific contention.

             •      It relies on the operating system’s filesystem cache.
                    (which make PostgreSQL inconsistent across it’s
                    supported OS base)




Monday, August 2, 2010
Enter Solaris

        •       Solaris is a UNIX from Sun Microsystems.

        •       Is it different than other UNIX/UNIX-like systems?

              •          Mostly it isn’t different (hence the term UNIX)

              •          It does have extremely strong ABI backward
                         compatibility.

              •          It’s stable and works well on large machines.

        •       Solaris 10 shakes things up a bit:
              •          DTrace

              •          ZFS

              •          Zones


Monday, August 2, 2010
Solaris / ZFS


                    •    ZFS: Zettaback Filesystem.

                         •   264 snapshots, 248 files/directory, 264 bytes/filesystem,
                             278 (256 ZiB) bytes in a pool, 264 devices/pool, 264 pools/system

                    •    Extremely cheap differential backups.

                         •   I have a 5 TB database, I need a backup!

                    •    No rollback in your database? What is this? MySQL?

                    •    No rollback in your filesystem?

                         •   ZFS has snapshots, rollback, clone and promote.

                         •   OMG! Life altering features.

                    •    Caveat: ZFS is slower than alternatives, by about 10% with tuning.




Monday, August 2, 2010
Solaris / Zones



                    •    Zones: Virtual Environments.

                    •    Shared kernel.

                    •    Can share filesystems.

                    •    Segregated processes and privileges.

                    •    No big deal for databases, right?


                                          But Wait!


Monday, August 2, 2010
Solaris / ZFS + Zones = Magic Juju
             https://labs.omniti.com/trac/pgsoltools/browser/trunk/pitr_clone/clonedb_startclone.sh

    •       ZFS snapshot, clone, delegate to zone, boot and run.

    •       When done, halt zone, destroy clone.

    •       We get a point-in-time copy of our PostgreSQL database:

          •      read-write,

          •      low disk-space requirements,

          •      NO LOCKS! Welcome back pg_dump,
                 you don’t suck (as much) anymore.

          •      Fast snapshot to usable copy time:

                •        On our 20 GB database: 1 minute.

                •        On our 1.2 TB database: 2 minutes.

Monday, August 2, 2010
ZFS: how I saved my soul.

        •      Database crash. Bad. 1.2 TB of data... busted.
               The reason Robert Treat looks a bit older than he
               should.

        •      xlogs corrupted. catalog indexes corrupted.

        •      Fault? PostgreSQL bug? Bad memory? Who knows?

        •      Trial & error on a 1.2 TB data set is a cruel experience.

             •       In real-life, most recovery actions are destructive
                     actions.

             •       PostgreSQL is no different.

        •      Rollback to last checkpoint (ZFS), hack postgres code,
               try, fail, repeat.

Monday, August 2, 2010
Let DTrace open your eyes

       •       DTrace: Dynamic Tracing

       •       Dynamically instrument “stuff” in the system:
             •      system calls (like strace/truss/ktrace).

             •      process/scheduler activity (on/off cpu, semaphores, conditions).

             •      see signals sent and received.

             •      trace kernel functions, networking.

             •      watch I/O down to the disk.

             •      user-space processes, each function... each machine instruction!

             •      Add probes into apps where it makes sense to you.



Monday, August 2, 2010
Can you see what I see?

                    •    There is EXPLAIN... when that isn’t enough...

                    •    There is EXPLAIN ANALYZE... when that isn’t enough.

                    •    There is DTrace.

                         ; dtrace -q -n ‘
                         postgresql*:::statement-start
                         {
                            self->query = copyinstr(arg0);
                            self->ok=1;
                         }
                         io:::start
                         /self->ok/
                         {
                            @[self->query,
                              args[0]->b_flags & B_READ ? "read" : "write",
                              args[1]->dev_statname] = sum(args[0]->b_bcount);
                         }’
                         dtrace: description 'postgres*:::statement-start' matched 14 probes
                         ^C

                         select count(1) from c2w_ods.tblusers where zipcode between 10000 and 11000;
                             read sd1 16384
                         select division, sum(amount), avg(amount) from ods.billings where txn_timestamp
                         between ‘2006-01-01 00:00:00’ and ‘2006-04-01 00:00:00’ group by division;
                             read sd2 71647232




Monday, August 2, 2010
OmniTI Labs / pgtreats

                    •    https://labs.omniti.com/labs/pgtreats

                         •   Where we stick out PostgreSQL goodies...

                         •   like pg_file_stress


                         FILENAME/DBOBJECT                             READS                    WRITES
                                                             #   min    avg    max      #   min    avg   max
            alldata1__idx_remove_domain_external             1    12     12     12    398     0      0     0
            slowdata1__pg_rewrite                            1    12     12     12      0     0      0     0
            slowdata1__pg_class_oid_index                    1     0      0      0      0     0      0     0
            slowdata1__pg_attribute                          2     0      0      0      0     0      0     0
            alldata1__mv_users                               0     0      0      0      4     0      0     0
            slowdata1__pg_statistic                          1     0      0      0      0     0      0     0
            slowdata1__pg_index                              1     0      0      0      0     0      0     0
            slowdata1__pg_index_indexrelid_index             1     0      0      0      0     0      0     0
            alldata1__remove_domain_external                 0     0      0      0    502     0      0     0
            alldata1__promo_15_tb_full_2                    19     0      0      0     11     0      0     0
            slowdata1__pg_class_relname_nsp_index            2     0      0      0      0     0      0     0
            alldata1__promo_177intaoltest_tb                 0     0      0      0   1053     0      0     0
            slowdata1__pg_attribute_relid_attnum_index       2     0      0      0      0     0      0     0
            alldata1__promo_15_tb_full_2_pk                  2     0      0      0      0     0      0     0
            alldata1__all_mailable_2                      1403     0      0    423      0     0      0     0
            alldata1__mv_users_pkey                          0     0      0      0      4     0      0     0




Monday, August 2, 2010
Results




                    •    Move ODS Oracle licenses to OLTP

                    •    Run PostgreSQL on ODS

                    •    Save $800k in license costs.

                    •    Spend $100k in labor costs.

                    •    Learn a lot.




Monday, August 2, 2010
Thanks!



                    •    Thank you.

                    •    http://omniti.com/does/postgresql

                    •    We’re hiring, but only if you love:

                         •   lots of data on lots of disks on lots of big boxes

                         •   smart people

                         •   hard problems

                         •   more than one database technology (including PostgreSQL)

                         •   responsibility




Monday, August 2, 2010

Más contenido relacionado

Destacado

Ignite: Improving Performance on Federal Contracts Using Scrum & Agile
Ignite: Improving Performance on Federal Contracts Using Scrum & AgileIgnite: Improving Performance on Federal Contracts Using Scrum & Agile
Ignite: Improving Performance on Federal Contracts Using Scrum & AgileJoshua L. Davis
 
The Enterprise Guide to Drupal for Gov 2.0
The Enterprise Guide to Drupal for Gov 2.0The Enterprise Guide to Drupal for Gov 2.0
The Enterprise Guide to Drupal for Gov 2.0Joshua L. Davis
 
The Open Source Movement
The Open Source MovementThe Open Source Movement
The Open Source MovementJoshua L. Davis
 
Homeland Open Security Technologies (HOST)
Homeland Open Security Technologies (HOST)Homeland Open Security Technologies (HOST)
Homeland Open Security Technologies (HOST)Joshua L. Davis
 
Ignite: Hackin' Excel with Ruby
Ignite: Hackin' Excel with RubyIgnite: Hackin' Excel with Ruby
Ignite: Hackin' Excel with RubyJoshua L. Davis
 
OSSIM and OMAR in the DoD/IC
OSSIM and OMAR in the DoD/ICOSSIM and OMAR in the DoD/IC
OSSIM and OMAR in the DoD/ICJoshua L. Davis
 
Senior Leaders Adapting to Social Technologies
Senior Leaders Adapting to Social TechnologiesSenior Leaders Adapting to Social Technologies
Senior Leaders Adapting to Social TechnologiesJoshua L. Davis
 
The Next Generation Open IDS Engine Suricata and Emerging Threats
The Next Generation Open IDS Engine Suricata and Emerging ThreatsThe Next Generation Open IDS Engine Suricata and Emerging Threats
The Next Generation Open IDS Engine Suricata and Emerging ThreatsJoshua L. Davis
 
Barcamp: Open Source and Security
Barcamp: Open Source and SecurityBarcamp: Open Source and Security
Barcamp: Open Source and SecurityJoshua L. Davis
 
Innovation Through “Trusted” Open Source Solutions
Innovation Through “Trusted” Open Source SolutionsInnovation Through “Trusted” Open Source Solutions
Innovation Through “Trusted” Open Source SolutionsJoshua L. Davis
 
Advancing open source geospatial software for the do d ic edward pickle openg...
Advancing open source geospatial software for the do d ic edward pickle openg...Advancing open source geospatial software for the do d ic edward pickle openg...
Advancing open source geospatial software for the do d ic edward pickle openg...Joshua L. Davis
 
Mil-OSS @ 47th Annual AOC Convention
Mil-OSS @ 47th Annual AOC ConventionMil-OSS @ 47th Annual AOC Convention
Mil-OSS @ 47th Annual AOC ConventionJoshua L. Davis
 
DISA's Open Source Corporate Management Information System (OSCMIS)
DISA's Open Source Corporate Management Information System (OSCMIS)DISA's Open Source Corporate Management Information System (OSCMIS)
DISA's Open Source Corporate Management Information System (OSCMIS)Joshua L. Davis
 

Destacado (14)

Ignite: Improving Performance on Federal Contracts Using Scrum & Agile
Ignite: Improving Performance on Federal Contracts Using Scrum & AgileIgnite: Improving Performance on Federal Contracts Using Scrum & Agile
Ignite: Improving Performance on Federal Contracts Using Scrum & Agile
 
Ignite: YSANAOYOA
Ignite: YSANAOYOAIgnite: YSANAOYOA
Ignite: YSANAOYOA
 
The Enterprise Guide to Drupal for Gov 2.0
The Enterprise Guide to Drupal for Gov 2.0The Enterprise Guide to Drupal for Gov 2.0
The Enterprise Guide to Drupal for Gov 2.0
 
The Open Source Movement
The Open Source MovementThe Open Source Movement
The Open Source Movement
 
Homeland Open Security Technologies (HOST)
Homeland Open Security Technologies (HOST)Homeland Open Security Technologies (HOST)
Homeland Open Security Technologies (HOST)
 
Ignite: Hackin' Excel with Ruby
Ignite: Hackin' Excel with RubyIgnite: Hackin' Excel with Ruby
Ignite: Hackin' Excel with Ruby
 
OSSIM and OMAR in the DoD/IC
OSSIM and OMAR in the DoD/ICOSSIM and OMAR in the DoD/IC
OSSIM and OMAR in the DoD/IC
 
Senior Leaders Adapting to Social Technologies
Senior Leaders Adapting to Social TechnologiesSenior Leaders Adapting to Social Technologies
Senior Leaders Adapting to Social Technologies
 
The Next Generation Open IDS Engine Suricata and Emerging Threats
The Next Generation Open IDS Engine Suricata and Emerging ThreatsThe Next Generation Open IDS Engine Suricata and Emerging Threats
The Next Generation Open IDS Engine Suricata and Emerging Threats
 
Barcamp: Open Source and Security
Barcamp: Open Source and SecurityBarcamp: Open Source and Security
Barcamp: Open Source and Security
 
Innovation Through “Trusted” Open Source Solutions
Innovation Through “Trusted” Open Source SolutionsInnovation Through “Trusted” Open Source Solutions
Innovation Through “Trusted” Open Source Solutions
 
Advancing open source geospatial software for the do d ic edward pickle openg...
Advancing open source geospatial software for the do d ic edward pickle openg...Advancing open source geospatial software for the do d ic edward pickle openg...
Advancing open source geospatial software for the do d ic edward pickle openg...
 
Mil-OSS @ 47th Annual AOC Convention
Mil-OSS @ 47th Annual AOC ConventionMil-OSS @ 47th Annual AOC Convention
Mil-OSS @ 47th Annual AOC Convention
 
DISA's Open Source Corporate Management Information System (OSCMIS)
DISA's Open Source Corporate Management Information System (OSCMIS)DISA's Open Source Corporate Management Information System (OSCMIS)
DISA's Open Source Corporate Management Information System (OSCMIS)
 

Similar a Big Bad PostgreSQL: BI on a Budget

Gaelyk - SpringOne2GX - 2010 - Guillaume Laforge
Gaelyk - SpringOne2GX - 2010 - Guillaume LaforgeGaelyk - SpringOne2GX - 2010 - Guillaume Laforge
Gaelyk - SpringOne2GX - 2010 - Guillaume LaforgeGuillaume Laforge
 
Oracleonoracle dec112012
Oracleonoracle dec112012Oracleonoracle dec112012
Oracleonoracle dec112012patmisasi
 
Alluxio - Scalable Filesystem Metadata Services
Alluxio - Scalable Filesystem Metadata ServicesAlluxio - Scalable Filesystem Metadata Services
Alluxio - Scalable Filesystem Metadata ServicesAlluxio, Inc.
 
mogpres
mogpresmogpres
mogpresxlight
 
Long and winding road - 2014
Long and winding road  - 2014Long and winding road  - 2014
Long and winding road - 2014Connor McDonald
 
MySQL overview
MySQL overviewMySQL overview
MySQL overviewMarco Tusa
 
Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration ...
Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration ...Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration ...
Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration ...Alluxio, Inc.
 
Oracle en Entel Summit 2010
Oracle en Entel Summit 2010Oracle en Entel Summit 2010
Oracle en Entel Summit 2010Entel
 
All Your IOPS Are Belong To Us - A Pinteresting Case Study in MySQL Performan...
All Your IOPS Are Belong To Us - A Pinteresting Case Study in MySQL Performan...All Your IOPS Are Belong To Us - A Pinteresting Case Study in MySQL Performan...
All Your IOPS Are Belong To Us - A Pinteresting Case Study in MySQL Performan...Ernie Souhrada
 
Exadata 12c New Features RMOUG
Exadata 12c New Features RMOUGExadata 12c New Features RMOUG
Exadata 12c New Features RMOUGFuad Arshad
 
Scalable Filesystem Metadata Services with RocksDB
Scalable Filesystem Metadata Services with RocksDBScalable Filesystem Metadata Services with RocksDB
Scalable Filesystem Metadata Services with RocksDBAlluxio, Inc.
 
Scalable and High available Distributed File System Metadata Service Using gR...
Scalable and High available Distributed File System Metadata Service Using gR...Scalable and High available Distributed File System Metadata Service Using gR...
Scalable and High available Distributed File System Metadata Service Using gR...Alluxio, Inc.
 
Analysis Software Benchmark
Analysis Software BenchmarkAnalysis Software Benchmark
Analysis Software BenchmarkAkira Shibata
 
PDoolan Oracle Overview
PDoolan Oracle OverviewPDoolan Oracle Overview
PDoolan Oracle OverviewPeter Doolan
 
Introduction to TokuDB v7.5 and Read Free Replication
Introduction to TokuDB v7.5 and Read Free ReplicationIntroduction to TokuDB v7.5 and Read Free Replication
Introduction to TokuDB v7.5 and Read Free ReplicationTim Callaghan
 
Vizuri Exadata East Coast Users Conference
Vizuri Exadata East Coast Users ConferenceVizuri Exadata East Coast Users Conference
Vizuri Exadata East Coast Users ConferenceIsaac Christoffersen
 
Sanger HPC infrastructure Report (2007)
Sanger HPC infrastructure  Report (2007)Sanger HPC infrastructure  Report (2007)
Sanger HPC infrastructure Report (2007)Guy Coates
 
Sanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticiansSanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticiansPeter Clapham
 

Similar a Big Bad PostgreSQL: BI on a Budget (20)

Gaelyk - SpringOne2GX - 2010 - Guillaume Laforge
Gaelyk - SpringOne2GX - 2010 - Guillaume LaforgeGaelyk - SpringOne2GX - 2010 - Guillaume Laforge
Gaelyk - SpringOne2GX - 2010 - Guillaume Laforge
 
Oracleonoracle dec112012
Oracleonoracle dec112012Oracleonoracle dec112012
Oracleonoracle dec112012
 
Alluxio - Scalable Filesystem Metadata Services
Alluxio - Scalable Filesystem Metadata ServicesAlluxio - Scalable Filesystem Metadata Services
Alluxio - Scalable Filesystem Metadata Services
 
mogpres
mogpresmogpres
mogpres
 
Long and winding road - 2014
Long and winding road  - 2014Long and winding road  - 2014
Long and winding road - 2014
 
MySQL overview
MySQL overviewMySQL overview
MySQL overview
 
Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration ...
Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration ...Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration ...
Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration ...
 
Oracle en Entel Summit 2010
Oracle en Entel Summit 2010Oracle en Entel Summit 2010
Oracle en Entel Summit 2010
 
All Your IOPS Are Belong To Us - A Pinteresting Case Study in MySQL Performan...
All Your IOPS Are Belong To Us - A Pinteresting Case Study in MySQL Performan...All Your IOPS Are Belong To Us - A Pinteresting Case Study in MySQL Performan...
All Your IOPS Are Belong To Us - A Pinteresting Case Study in MySQL Performan...
 
Exadata 12c New Features RMOUG
Exadata 12c New Features RMOUGExadata 12c New Features RMOUG
Exadata 12c New Features RMOUG
 
Scalable Filesystem Metadata Services with RocksDB
Scalable Filesystem Metadata Services with RocksDBScalable Filesystem Metadata Services with RocksDB
Scalable Filesystem Metadata Services with RocksDB
 
Scalable and High available Distributed File System Metadata Service Using gR...
Scalable and High available Distributed File System Metadata Service Using gR...Scalable and High available Distributed File System Metadata Service Using gR...
Scalable and High available Distributed File System Metadata Service Using gR...
 
Analysis Software Benchmark
Analysis Software BenchmarkAnalysis Software Benchmark
Analysis Software Benchmark
 
PDoolan Oracle Overview
PDoolan Oracle OverviewPDoolan Oracle Overview
PDoolan Oracle Overview
 
mogpres
mogpresmogpres
mogpres
 
Introduction to TokuDB v7.5 and Read Free Replication
Introduction to TokuDB v7.5 and Read Free ReplicationIntroduction to TokuDB v7.5 and Read Free Replication
Introduction to TokuDB v7.5 and Read Free Replication
 
Vizuri Exadata East Coast Users Conference
Vizuri Exadata East Coast Users ConferenceVizuri Exadata East Coast Users Conference
Vizuri Exadata East Coast Users Conference
 
Sanger HPC infrastructure Report (2007)
Sanger HPC infrastructure  Report (2007)Sanger HPC infrastructure  Report (2007)
Sanger HPC infrastructure Report (2007)
 
Sanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticiansSanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticians
 
Flexible compute
Flexible computeFlexible compute
Flexible compute
 

Más de Joshua L. Davis

Ignite: Devops - Why Should You Care
Ignite: Devops - Why Should You CareIgnite: Devops - Why Should You Care
Ignite: Devops - Why Should You CareJoshua L. Davis
 
Using the Joomla CMI in the Army Hosting Environment
Using the Joomla CMI in the Army Hosting EnvironmentUsing the Joomla CMI in the Army Hosting Environment
Using the Joomla CMI in the Army Hosting EnvironmentJoshua L. Davis
 
Open Source Software (OSS/FLOSS) and Security
Open Source Software (OSS/FLOSS) and SecurityOpen Source Software (OSS/FLOSS) and Security
Open Source Software (OSS/FLOSS) and SecurityJoshua L. Davis
 
Importance of WS-Addressing and WS-Reliability in DoD Enterprises
Importance of WS-Addressing and WS-Reliability in DoD EnterprisesImportance of WS-Addressing and WS-Reliability in DoD Enterprises
Importance of WS-Addressing and WS-Reliability in DoD EnterprisesJoshua L. Davis
 
OZONE & OWF: A Community-wide GOTS initiative and its transition to GOSS
OZONE & OWF: A Community-wide GOTS initiative and its transition to GOSSOZONE & OWF: A Community-wide GOTS initiative and its transition to GOSS
OZONE & OWF: A Community-wide GOTS initiative and its transition to GOSSJoshua L. Davis
 
Title TBD: "18 hundred seconds"
Title TBD: "18 hundred seconds"Title TBD: "18 hundred seconds"
Title TBD: "18 hundred seconds"Joshua L. Davis
 
Reaching It's Potential: How to Make Government-Developed OSS A Major Player
Reaching It's Potential: How to Make Government-Developed OSS A Major PlayerReaching It's Potential: How to Make Government-Developed OSS A Major Player
Reaching It's Potential: How to Make Government-Developed OSS A Major PlayerJoshua L. Davis
 
USIP Open Simulation Platform
USIP Open Simulation PlatformUSIP Open Simulation Platform
USIP Open Simulation PlatformJoshua L. Davis
 
CONNECT: An Open Source Platform for Promoting Military Health
CONNECT: An Open Source Platform for Promoting Military HealthCONNECT: An Open Source Platform for Promoting Military Health
CONNECT: An Open Source Platform for Promoting Military HealthJoshua L. Davis
 
CompanyCommand & PlatoonLeader Forums and MilSuite
CompanyCommand & PlatoonLeader Forums and MilSuiteCompanyCommand & PlatoonLeader Forums and MilSuite
CompanyCommand & PlatoonLeader Forums and MilSuiteJoshua L. Davis
 
.org to .com: Going from Project to Product
.org to .com: Going from Project to Product.org to .com: Going from Project to Product
.org to .com: Going from Project to ProductJoshua L. Davis
 
Cyber Challenges in a Hierarchical Culture
Cyber Challenges in a Hierarchical CultureCyber Challenges in a Hierarchical Culture
Cyber Challenges in a Hierarchical CultureJoshua L. Davis
 
An Approach to Building & Maintaining a STIG'D RHEL Server
An Approach to Building & Maintaining a STIG'D RHEL ServerAn Approach to Building & Maintaining a STIG'D RHEL Server
An Approach to Building & Maintaining a STIG'D RHEL ServerJoshua L. Davis
 
How and Why Python is Used in the Model of Real-World Battlefield Scenarios
How and Why Python is Used in the Model of Real-World Battlefield ScenariosHow and Why Python is Used in the Model of Real-World Battlefield Scenarios
How and Why Python is Used in the Model of Real-World Battlefield ScenariosJoshua L. Davis
 

Más de Joshua L. Davis (16)

Ignite: Devops - Why Should You Care
Ignite: Devops - Why Should You CareIgnite: Devops - Why Should You Care
Ignite: Devops - Why Should You Care
 
Using the Joomla CMI in the Army Hosting Environment
Using the Joomla CMI in the Army Hosting EnvironmentUsing the Joomla CMI in the Army Hosting Environment
Using the Joomla CMI in the Army Hosting Environment
 
Open Source Software (OSS/FLOSS) and Security
Open Source Software (OSS/FLOSS) and SecurityOpen Source Software (OSS/FLOSS) and Security
Open Source Software (OSS/FLOSS) and Security
 
SOSCOE Overview
SOSCOE OverviewSOSCOE Overview
SOSCOE Overview
 
milSuite
milSuitemilSuite
milSuite
 
Importance of WS-Addressing and WS-Reliability in DoD Enterprises
Importance of WS-Addressing and WS-Reliability in DoD EnterprisesImportance of WS-Addressing and WS-Reliability in DoD Enterprises
Importance of WS-Addressing and WS-Reliability in DoD Enterprises
 
OZONE & OWF: A Community-wide GOTS initiative and its transition to GOSS
OZONE & OWF: A Community-wide GOTS initiative and its transition to GOSSOZONE & OWF: A Community-wide GOTS initiative and its transition to GOSS
OZONE & OWF: A Community-wide GOTS initiative and its transition to GOSS
 
Title TBD: "18 hundred seconds"
Title TBD: "18 hundred seconds"Title TBD: "18 hundred seconds"
Title TBD: "18 hundred seconds"
 
Reaching It's Potential: How to Make Government-Developed OSS A Major Player
Reaching It's Potential: How to Make Government-Developed OSS A Major PlayerReaching It's Potential: How to Make Government-Developed OSS A Major Player
Reaching It's Potential: How to Make Government-Developed OSS A Major Player
 
USIP Open Simulation Platform
USIP Open Simulation PlatformUSIP Open Simulation Platform
USIP Open Simulation Platform
 
CONNECT: An Open Source Platform for Promoting Military Health
CONNECT: An Open Source Platform for Promoting Military HealthCONNECT: An Open Source Platform for Promoting Military Health
CONNECT: An Open Source Platform for Promoting Military Health
 
CompanyCommand & PlatoonLeader Forums and MilSuite
CompanyCommand & PlatoonLeader Forums and MilSuiteCompanyCommand & PlatoonLeader Forums and MilSuite
CompanyCommand & PlatoonLeader Forums and MilSuite
 
.org to .com: Going from Project to Product
.org to .com: Going from Project to Product.org to .com: Going from Project to Product
.org to .com: Going from Project to Product
 
Cyber Challenges in a Hierarchical Culture
Cyber Challenges in a Hierarchical CultureCyber Challenges in a Hierarchical Culture
Cyber Challenges in a Hierarchical Culture
 
An Approach to Building & Maintaining a STIG'D RHEL Server
An Approach to Building & Maintaining a STIG'D RHEL ServerAn Approach to Building & Maintaining a STIG'D RHEL Server
An Approach to Building & Maintaining a STIG'D RHEL Server
 
How and Why Python is Used in the Model of Real-World Battlefield Scenarios
How and Why Python is Used in the Model of Real-World Battlefield ScenariosHow and Why Python is Used in the Model of Real-World Battlefield Scenarios
How and Why Python is Used in the Model of Real-World Battlefield Scenarios
 

Último

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 

Último (20)

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 

Big Bad PostgreSQL: BI on a Budget

  • 1. Big Bad PostgreSQL: A Case Study Moving a “large,” “complicated,” and mission-critical datawarehouse from Oracle to PostgreSQL for cost control. 1 Monday, August 2, 2010
  • 2. Who am I? @postwait on twitter Author of “Scalable Internet Architectures” Pearson, ISBN: 067232699X CEO of OmniTI We build scalable and secure web applications I am an Engineer A practitioner of academic computing. IEEE member and Senior ACM member. On the Editorial Board of ACM’s Queue magazine. I work on/with a lot of Open Source software: Apache, perl, Linux, Solaris, PostgreSQL, Varnish, Spread, Reconnoiter, etc. I have experience. I’ve had the unique opportunity to watch a great many catastrophes. I enjoy immersing myself in the pathology of architecture failures. Monday, August 2, 2010
  • 3. Overall Architecture Oracle 8i OLTP instance: 0.5 TB 0.25 TB drives the site Hitachi JBOD OLTP Log import and Oracle 8i processing Oracle 8i 0.75 TB JBOD MySQL log importer 0.5 TB 1.5 TB Hitachi MTI OLTP warm backup Warm spare 1.2 TB SATA Datawarehouse RAID Log Importer MySQL 4.1 1.2 TB IDE RAID Data Exporter bulk selects / data exports Monday, August 2, 2010
  • 4. Database Situation • The problems: • The database is growing. • The OLTP and ODS/warehouse are too slow. • A lot of application code against the OLTP system. • Minimal application code against the ODS system. • Oracle: • Licensed per processor. • Really, really, really expensive on a large scale. • PostgreSQL: • No licensing costs. • Good support for complex queries. Monday, August 2, 2010
  • 5. Database Choices • Must keep Oracle on OLTP • Complex, Oracle-specific web application. • Need more processors. • ODS: Oracle not required. • Complex queries from limited sources. • Needs more space and power. • Result: • Move ODS Oracle licenses to OLTP • Run PostgreSQL on ODS Monday, August 2, 2010
  • 6. PostgreSQL gotchas • For an OLTP system that does thousands of updates per second, vacuuming is a hassle. • No upgrades?! • pg_migrator is coming along. • Less community experience with large databases. • Replication features less evolved • though PostgreSQL 9.0 ups the ante Monday, August 2, 2010
  • 7. PostgreSQL ♥ ODS • Mostly inserts. • Updates/Deletes controlled, not real-time. • pl/perl (leverage DBI/DBD for remote database connectivity). • Monster queries. • Extensible. Monday, August 2, 2010
  • 8. Choosing Linux • Popular, liked, good community support. • Chronic problems: • kernel panics • filesystems remounting read-only • filesystems don’t support snapshots • LVM is clunky on enterprise storage • 20 outages in 4 months Monday, August 2, 2010
  • 9. Choosing Solaris 10 • Switched to Solaris 10 • No crashes, better system-level tools. • prstat, iostat, vmstat, smf, fault- management. • ZFS • snapshots (persistent), BLI backups. • Excellent support for enterprise storage. • DTrace. • Free (too). Monday, August 2, 2010
  • 10. Oracle features we need • Partitioning • Statistics and Aggregations • rank over partition, lead, lag, etc. • Large selects (100GB) • Autonomous transactions • Replication from Oracle (to Oracle) Monday, August 2, 2010
  • 11. Partitioning For large data sets: pgods=# select count(1) from ods.ods_tblpick_super; count ------------ 3247286017 (1 row) • Next biggest tables: billions • Allows us to cluster data over specific ranges (by date in our case) • Simple, cheap archiving and removal of data. • Can put ranges used less often in different tablespaces (slower, cheaper storage) Monday, August 2, 2010
  • 12. Partitioning PostgreSQL style • PostgreSQL doesn’t support partition... • It supports inheritance... (what’s this?) • some crazy object-relation paradigm. • We can use it to implement partitioning: • One master table with no rows. • Child tables that have our partition constraints. • Rules on the master table for insert/update/delete. Monday, August 2, 2010
  • 13. Partitioning PostgreSQL realized • Cheaply add new empty partitions • Cheaply remove old partitions • Migrate less-often-accessed partitions to slower storage • Different indexes strategies per partition • PostgreSQL >8.1 supports constraint checking on inherited tables. • smarter planning • smarter executing Monday, August 2, 2010
  • 14. RANK OVER PARTITION • In Oracle: select userid, email from ( ! ! select u.userid, u.email, ! ! row_number() over (partition by u.email order by userid desc) as position ! ! from (...)) where position = 1 • In PostgreSQL: FOR v_row IN select u.userid, u.email from (...) order by email, userid desc LOOP ! IF v_row.email != v_last_email THEN ! ! RETURN NEXT v_row; ! ! v_last_email := v_row.email; ! ! v_rownum := v_rownum + 1; ! END IF; END LOOP; With 8.4, we have windowing functions Monday, August 2, 2010
  • 15. Large SELECTs • Application code does: select u.*, b.browser, m.lastmess from ods.ods_users u, ods.ods_browsers b, ( select userid, min(senddate) as senddate from ods.ods_maillog group by userid ) m, ods.ods_maillog l where u.userid = b.userid and u.userid = m.userid and u.userid = l.userid and l.senddate = m.senddate; • The width of these rows is about 2k • 50 million row return set • > 100 GB of data Monday, August 2, 2010
  • 16. The Large SELECT Problem • libpq will buffer the entire result in memory. • This affects language bindings (DBD::Pg). • This is an utterly deficient default behavior. • This can be avoided by using cursors • Requires the app to be PostgreSQL specific. • You open a cursor. • Then FETCH the row count you desire. Monday, August 2, 2010
  • 17. Big SELECTs the Postgres way The previous “big” query becomes: DECLARE CURSOR bigdump FOR select u.*, b.browser, m.lastmess from ods.ods_users u, ods.ods_browsers b, ( select userid, min(senddate) as senddate from ods.ods_maillog group by userid ) m, ods.ods_maillog l where u.userid = b.userid and u.userid = m.userid and u.userid = l.userid and l.senddate = m.senddate; Then, in a loop: FETCH FORWARD 10000 FROM bigdump; Monday, August 2, 2010
  • 18. Autonomous Transactions • In Oracle we have over 2000 custom stored procedures. • During these procedures, we like to: • COMMIT incrementally Useful for long transactions (update/delete) that need not be atomic -- incremental COMMITs. • start a new top-level txn that can COMMIT Useful for logging progress in a stored procedure so that you know how far you progessed and how long each step took even if it rolls back. Monday, August 2, 2010
  • 19. PostgreSQL shortcoming • PostgreSQL simply does not support Autonomous transactions and to quote core developers “that would be hard.” • When in doubt, use brute force. • Use pl/perl to use DBD::Pg to connect to ourselves (a new backend) and execute a new top-level transaction. Monday, August 2, 2010
  • 20. Replication • Cross vendor database replication isn’t too difficult. • Helps a lot when you can do it inside the database. • Using dbi-link (based on pl/perl and DBI) we can. • We can connect to any remote database. • INSERT into local tables directly from remote SELECT statements. [snapshots] • LOOP over remote SELECT statements and process them row-by-row. [replaying remote DML logs] Monday, August 2, 2010
  • 21. Replication (really) • Through a combination of snapshotting and DML replay we: • replicate over into over 2000 tables in PostgreSQL from Oracle • snapshot replication of 200 • DML replay logs for 1800 • PostgreSQL to Oracle is a bit harder • out-of-band export and imports Monday, August 2, 2010
  • 22. New Architecture • Master: Sun v890 and Hitachi AMS + warm standby running Oracle (1TB) • Logs: several customs running MySQL instances (2TB each) • ODS BI: 2x Sun v40 running PostgreSQL 8.3 (6TB on Sun JBODs on ZFS each) • ODS archive: 2x custom running PostgreSQL 8.3 (14TB internal storage on ZFS each) Monday, August 2, 2010
  • 23. PostgreSQL is Lacking • pg_dump is too intrusive. • Poor system-level instrumentation. • Poor methods to determine specific contention. • It relies on the operating system’s filesystem cache. (which make PostgreSQL inconsistent across it’s supported OS base) Monday, August 2, 2010
  • 24. Enter Solaris • Solaris is a UNIX from Sun Microsystems. • Is it different than other UNIX/UNIX-like systems? • Mostly it isn’t different (hence the term UNIX) • It does have extremely strong ABI backward compatibility. • It’s stable and works well on large machines. • Solaris 10 shakes things up a bit: • DTrace • ZFS • Zones Monday, August 2, 2010
  • 25. Solaris / ZFS • ZFS: Zettaback Filesystem. • 264 snapshots, 248 files/directory, 264 bytes/filesystem, 278 (256 ZiB) bytes in a pool, 264 devices/pool, 264 pools/system • Extremely cheap differential backups. • I have a 5 TB database, I need a backup! • No rollback in your database? What is this? MySQL? • No rollback in your filesystem? • ZFS has snapshots, rollback, clone and promote. • OMG! Life altering features. • Caveat: ZFS is slower than alternatives, by about 10% with tuning. Monday, August 2, 2010
  • 26. Solaris / Zones • Zones: Virtual Environments. • Shared kernel. • Can share filesystems. • Segregated processes and privileges. • No big deal for databases, right? But Wait! Monday, August 2, 2010
  • 27. Solaris / ZFS + Zones = Magic Juju https://labs.omniti.com/trac/pgsoltools/browser/trunk/pitr_clone/clonedb_startclone.sh • ZFS snapshot, clone, delegate to zone, boot and run. • When done, halt zone, destroy clone. • We get a point-in-time copy of our PostgreSQL database: • read-write, • low disk-space requirements, • NO LOCKS! Welcome back pg_dump, you don’t suck (as much) anymore. • Fast snapshot to usable copy time: • On our 20 GB database: 1 minute. • On our 1.2 TB database: 2 minutes. Monday, August 2, 2010
  • 28. ZFS: how I saved my soul. • Database crash. Bad. 1.2 TB of data... busted. The reason Robert Treat looks a bit older than he should. • xlogs corrupted. catalog indexes corrupted. • Fault? PostgreSQL bug? Bad memory? Who knows? • Trial & error on a 1.2 TB data set is a cruel experience. • In real-life, most recovery actions are destructive actions. • PostgreSQL is no different. • Rollback to last checkpoint (ZFS), hack postgres code, try, fail, repeat. Monday, August 2, 2010
  • 29. Let DTrace open your eyes • DTrace: Dynamic Tracing • Dynamically instrument “stuff” in the system: • system calls (like strace/truss/ktrace). • process/scheduler activity (on/off cpu, semaphores, conditions). • see signals sent and received. • trace kernel functions, networking. • watch I/O down to the disk. • user-space processes, each function... each machine instruction! • Add probes into apps where it makes sense to you. Monday, August 2, 2010
  • 30. Can you see what I see? • There is EXPLAIN... when that isn’t enough... • There is EXPLAIN ANALYZE... when that isn’t enough. • There is DTrace. ; dtrace -q -n ‘ postgresql*:::statement-start { self->query = copyinstr(arg0); self->ok=1; } io:::start /self->ok/ { @[self->query, args[0]->b_flags & B_READ ? "read" : "write", args[1]->dev_statname] = sum(args[0]->b_bcount); }’ dtrace: description 'postgres*:::statement-start' matched 14 probes ^C select count(1) from c2w_ods.tblusers where zipcode between 10000 and 11000; read sd1 16384 select division, sum(amount), avg(amount) from ods.billings where txn_timestamp between ‘2006-01-01 00:00:00’ and ‘2006-04-01 00:00:00’ group by division; read sd2 71647232 Monday, August 2, 2010
  • 31. OmniTI Labs / pgtreats • https://labs.omniti.com/labs/pgtreats • Where we stick out PostgreSQL goodies... • like pg_file_stress FILENAME/DBOBJECT READS WRITES # min avg max # min avg max alldata1__idx_remove_domain_external 1 12 12 12 398 0 0 0 slowdata1__pg_rewrite 1 12 12 12 0 0 0 0 slowdata1__pg_class_oid_index 1 0 0 0 0 0 0 0 slowdata1__pg_attribute 2 0 0 0 0 0 0 0 alldata1__mv_users 0 0 0 0 4 0 0 0 slowdata1__pg_statistic 1 0 0 0 0 0 0 0 slowdata1__pg_index 1 0 0 0 0 0 0 0 slowdata1__pg_index_indexrelid_index 1 0 0 0 0 0 0 0 alldata1__remove_domain_external 0 0 0 0 502 0 0 0 alldata1__promo_15_tb_full_2 19 0 0 0 11 0 0 0 slowdata1__pg_class_relname_nsp_index 2 0 0 0 0 0 0 0 alldata1__promo_177intaoltest_tb 0 0 0 0 1053 0 0 0 slowdata1__pg_attribute_relid_attnum_index 2 0 0 0 0 0 0 0 alldata1__promo_15_tb_full_2_pk 2 0 0 0 0 0 0 0 alldata1__all_mailable_2 1403 0 0 423 0 0 0 0 alldata1__mv_users_pkey 0 0 0 0 4 0 0 0 Monday, August 2, 2010
  • 32. Results • Move ODS Oracle licenses to OLTP • Run PostgreSQL on ODS • Save $800k in license costs. • Spend $100k in labor costs. • Learn a lot. Monday, August 2, 2010
  • 33. Thanks! • Thank you. • http://omniti.com/does/postgresql • We’re hiring, but only if you love: • lots of data on lots of disks on lots of big boxes • smart people • hard problems • more than one database technology (including PostgreSQL) • responsibility Monday, August 2, 2010