Big Bad PostgreSQL: A Case Study


                 Moving a “large,” “complicated,” and mission-critical
                 data warehouse from Oracle to PostgreSQL for cost control.



Monday, August 2, 2010
Who am I? @postwait on twitter


                         Author of “Scalable Internet Architectures”
                         Pearson, ISBN: 067232699X

                         CEO of OmniTI
                         We build scalable and secure web applications

                         I am an Engineer
                         A practitioner of academic computing.
                         IEEE member and Senior ACM member.
                         On the Editorial Board of ACM’s Queue magazine.

                         I work on/with a lot of Open Source software:
                         Apache, perl, Linux, Solaris, PostgreSQL,
                         Varnish, Spread, Reconnoiter, etc.

                         I have experience.
                         I’ve had the unique opportunity to watch a great many catastrophes.
                         I enjoy immersing myself in the pathology of architecture failures.




Overall Architecture


       [Diagram: original architecture]

         •  OLTP instance (drives the site): Oracle 8i,
            0.5 TB Hitachi + 0.25 TB JBOD
         •  OLTP warm backup (warm spare): Oracle 8i, 0.75 TB JBOD
         •  Datawarehouse: Oracle 8i, 0.5 TB Hitachi + 1.5 TB MTI
         •  Log import and processing (log importer): MySQL,
            1.2 TB SATA RAID
         •  Data exporter (bulk selects / data exports): MySQL 4.1,
            1.2 TB IDE RAID
Database Situation

         •      The problems:
               •  The database is growing.
               •  The OLTP and ODS/warehouse are too slow.
               •  A lot of application code against the OLTP system.
               •  Minimal application code against the ODS system.
         •      Oracle:
               •  Licensed per processor.
               •  Really, really, really expensive on a large scale.
         •      PostgreSQL:
               •  No licensing costs.
               •  Good support for complex queries.


Database Choices



                  •      Must keep Oracle on OLTP
                    •      Complex, Oracle-specific web application.
                    •      Need more processors.
                  •      ODS: Oracle not required.
                    •      Complex queries from limited sources.
                    •      Needs more space and power.
                  •      Result:
                    •      Move ODS Oracle licenses to OLTP
                    •      Run PostgreSQL on ODS




PostgreSQL gotchas



                    •    For an OLTP system that does thousands of
                         updates per second, vacuuming is a hassle.

                    •    No upgrades?!

                         •   pg_migrator is coming along.

                    •    Less community experience with large
                         databases.

                    •    Replication features less evolved

                         •   though PostgreSQL 9.0 ups the ante



PostgreSQL ♥ ODS




                  •      Mostly inserts.

                  •      Updates/Deletes controlled, not real-time.

                  •      pl/perl (leverage DBI/DBD for remote
                         database connectivity).

                  •      Monster queries.

                  •      Extensible.




Choosing Linux



                    •    Popular, liked, good community support.

                    •    Chronic problems:

                         •   kernel panics

                         •   filesystems remounting read-only

                         •   filesystems don’t support snapshots

                         •   LVM is clunky on enterprise storage

                         •   20 outages in 4 months



Choosing Solaris 10

                    •    Switched to Solaris 10

                         •   No crashes, better system-level tools.

                             •   prstat, iostat, vmstat, smf, fault-
                                 management.

                         •   ZFS

                             •   snapshots (persistent), BLI backups.

                         •   Excellent support for enterprise storage.

                         •   DTrace.

                         •   Free (too).


Oracle features we need




                    •    Partitioning

                    •    Statistics and Aggregations

                         •   rank over partition, lead, lag, etc.

                    •    Large selects (100GB)

                    •    Autonomous transactions

                    •    Replication from Oracle (to Oracle)




Partitioning



                 For large data sets:
                   pgods=# select count(1) from ods.ods_tblpick_super;
                      count
                   ------------
                    3247286017
                   (1 row)




                    • Next biggest tables: billions
                    • Allows us to cluster data over specific ranges
                      (by date in our case)
                    • Simple, cheap archiving and removal of data.
                    • Can put ranges used less often in different
                         tablespaces (slower, cheaper storage)


Partitioning PostgreSQL style



           •       PostgreSQL doesn’t support partitioning...

           •       It supports inheritance... (what’s this?)

                 •       some crazy object-relation paradigm.

           •       We can use it to implement partitioning:

                 •       One master table with no rows.

                 •       Child tables that have our partition constraints.

                 •       Rules on the master table for insert/update/delete.
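
A minimal sketch of the inheritance scheme above, written as a small Python helper that only generates the DDL text for one monthly child table and its insert rule. The parent table `ods.events`, the column `event_date`, and both function names are hypothetical and are not taken from the real ODS schema. The rule follows the general pattern shown in the PostgreSQL partitioning documentation.

```python
from datetime import date


def month_bounds(year, month):
    """Return the first day of the month and the first day of the next month."""
    start = date(year, month, 1)
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start, end


def partition_ddl(parent, column, year, month):
    """Build DDL for one monthly child table plus its insert rule on the parent.

    `parent` and `column` are illustrative names, not the real schema.
    """
    start, end = month_bounds(year, month)
    schema, _, table = parent.rpartition(".")
    child_table = f"{table}_{year:04d}_{month:02d}"
    child = f"{schema}.{child_table}" if schema else child_table

    def in_range(col):
        # Half-open date range, so adjacent partitions never overlap.
        return f"{col} >= DATE '{start}' AND {col} < DATE '{end}'"

    create = (
        f"CREATE TABLE {child} (\n"
        f"    CHECK ({in_range(column)})\n"
        f") INHERITS ({parent});"
    )
    rule = (
        f"CREATE RULE {child_table}_insert AS\n"
        f"    ON INSERT TO {parent}\n"
        f"    WHERE ({in_range('NEW.' + column)})\n"
        f"    DO INSTEAD INSERT INTO {child} VALUES (NEW.*);"
    )
    return [create, rule]


if __name__ == "__main__":
    for statement in partition_ddl("ods.events", "event_date", 2010, 8):
        print(statement)
```

The CHECK constraint on each child is what lets the planner skip partitions whose date range cannot match a query, provided constraint exclusion is enabled.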



Partitioning PostgreSQL realized

             •       Cheaply add new empty partitions

             •       Cheaply remove old partitions

             •       Migrate less-often-accessed partitions to slower
                     storage

             •       Different index strategies per partition

             •       PostgreSQL >= 8.1 supports constraint exclusion on
                     inherited tables.
                   •     smarter planning

                   •     smarter executing



RANK OVER PARTITION



                    • In Oracle:
              select userid, email from (
                  select u.userid, u.email,
                         row_number() over
                           (partition by u.email order by userid desc) as position
                  from (...)) where position = 1



                    • In PostgreSQL:
              FOR v_row IN select u.userid, u.email from (...) order by email, userid desc
              LOOP
                  -- IS DISTINCT FROM also matches the first row, when v_last_email is still NULL
                  IF v_row.email IS DISTINCT FROM v_last_email THEN
                      RETURN NEXT v_row;
                      v_last_email := v_row.email;
                      v_rownum := v_rownum + 1;
                  END IF;
              END LOOP;

                                 With 8.4, we have windowing functions
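
The sorted-scan trick behind the PostgreSQL loop can be sketched outside the database as well. The Python below is only an illustration of the idea (sort by email, then by userid descending, and keep the first row of each email group). The function name and sample rows are made up, and it assumes every email is a non-null string and every userid an integer.

```python
def newest_user_per_email(users):
    """Keep the row with the highest userid for each email.

    Mirrors the loop on the slide: sort by (email, userid desc), then emit a
    row only when its email differs from the previous row's email.
    """
    ordered = sorted(users, key=lambda u: (u["email"], -u["userid"]))
    kept = []
    last_email = None  # no email seen yet
    for row in ordered:
        if row["email"] != last_email:
            kept.append(row)
            last_email = row["email"]
    return kept


if __name__ == "__main__":
    sample = [
        {"userid": 1, "email": "a@example.com"},
        {"userid": 7, "email": "a@example.com"},
        {"userid": 3, "email": "b@example.com"},
    ]
    print(newest_user_per_email(sample))
```

This is the same result that `row_number() ... = 1` produces per partition, at the cost of one full sort and one pass over the rows.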

Large SELECTs


                 • Application code does:
              select u.*, b.browser, l.lastmess
                from ods.ods_users u,
                     ods.ods_browsers b,
                     ( select userid, min(senddate) as senddate
                          from ods.ods_maillog
                      group by userid ) m,
                     ods.ods_maillog l
               where u.userid = b.userid
                 and u.userid = m.userid
                 and u.userid = l.userid
                 and l.senddate = m.senddate;




                    •    The width of these rows is about 2k

                    •    50 million row return set

                    •    > 100 GB of data
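
A quick back-of-the-envelope check of the figures above. The slide only says the rows are "about 2k" wide, so both the decimal and the binary reading are shown; the variable names are just for illustration.

```python
ROWS = 50_000_000

# "About 2k" per row: compute both the decimal and the binary reading.
ROW_BYTES_DECIMAL = 2 * 1000
ROW_BYTES_BINARY = 2 * 1024

total_decimal = ROWS * ROW_BYTES_DECIMAL
total_binary = ROWS * ROW_BYTES_BINARY

if __name__ == "__main__":
    print(total_decimal / 10**9, "GB")
    print(total_binary / 10**9, "GB")
```

Either way the result set is on the order of 100 GB, far more than a client should try to hold in memory at once.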

The Large SELECT Problem


                 •       libpq will buffer the entire result in memory.

                         •   This affects language bindings (DBD::Pg).

                         •   This is an utterly deficient default behavior.

                 •       This can be avoided by using cursors

                         •   Requires the app to be PostgreSQL specific.

                         •   You open a cursor.

                         •   Then FETCH the row count you desire.




Big SELECTs the Postgres way



           The previous “big” query becomes:
              DECLARE bigdump CURSOR FOR
              select u.*, b.browser, l.lastmess
                from ods.ods_users u,
                     ods.ods_browsers b,
                     ( select userid, min(senddate) as senddate
                          from ods.ods_maillog
                      group by userid ) m,
                     ods.ods_maillog l
               where u.userid = b.userid
                 and u.userid = m.userid
                 and u.userid = l.userid
                 and l.senddate = m.senddate;


           Then, in a loop:
              FETCH FORWARD 10000 FROM bigdump;
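
The fetch loop can be sketched on the client side as follows. This is an assumption-laden illustration, not the code used in the project: it uses Python's DB-API `fetchmany` with an in-memory `sqlite3` database purely as a stand-in so that it runs without a server, and the table `t` and both function names are hypothetical. Against PostgreSQL the same loop shape would sit on top of a server-side cursor, as in the `DECLARE` / `FETCH` statements above.

```python
import sqlite3


def fetch_in_batches(cursor, batch_size):
    """Yield lists of at most batch_size rows until the cursor is exhausted."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


def demo_batch_sizes(total_rows, batch_size):
    """Load total_rows dummy rows, read them back in batches, return batch sizes."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (id INTEGER)")
        conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(total_rows)])
        cur = conn.execute("SELECT id FROM t ORDER BY id")
        return [len(batch) for batch in fetch_in_batches(cur, batch_size)]
    finally:
        conn.close()


if __name__ == "__main__":
    print(demo_batch_sizes(25, 10))
```

The point of the pattern is that the client only ever holds one batch in memory, regardless of how large the full result set is.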




Autonomous Transactions



         • In Oracle we have over 2000 custom stored procedures.
         • During these procedures, we like to:
           • COMMIT incrementally
                    Useful for long transactions (update/delete) that
                    need not be atomic -- incremental COMMITs.

               • start a new top-level txn that can COMMIT
                    Useful for logging progress in a stored procedure so
                    that you know how far you progressed and how long
                    each step took even if it rolls back.



PostgreSQL shortcoming




                    •    PostgreSQL simply does not support
                         Autonomous transactions and to quote core
                         developers “that would be hard.”

                    •    When in doubt, use brute force.

                    •    Use pl/perl to use DBD::Pg to connect to
                         ourselves (a new backend) and execute a new
                         top-level transaction.




Replication


             • Cross vendor database replication isn’t too difficult.
             • Helps a lot when you can do it inside the database.
             • Using dbi-link (based on pl/perl and DBI) we can.
               • We can connect to any remote database.
               • INSERT into local tables directly from remote
                         SELECT statements.
                         [snapshots]
                   • LOOP over remote SELECT statements and
                         process them row-by-row.
                         [replaying remote DML logs]


Replication (really)



          •      Through a combination of snapshotting and DML
                 replay we:

                •        replicate into over 2000 tables in PostgreSQL
                         from Oracle

                     •     snapshot replication of 200

                     •     DML replay logs for 1800

          •      PostgreSQL to Oracle is a bit harder

                •        out-of-band export and imports



New Architecture


            •      Master: Sun v890 and Hitachi AMS + warm standby
                     running Oracle
                     (1TB)

            •      Logs: several custom machines
                     running MySQL instances
                     (2TB each)

            •      ODS BI: 2x Sun v40
                     running PostgreSQL 8.3
                     (6TB on Sun JBODs on ZFS each)

            •      ODS archive: 2x custom
                     running PostgreSQL 8.3
                     (14TB internal storage on ZFS each)




PostgreSQL is Lacking




             •      pg_dump is too intrusive.

             •      Poor system-level instrumentation.

             •      Poor methods to determine specific contention.

             •      It relies on the operating system’s filesystem cache
                    (which makes PostgreSQL inconsistent across its
                    supported OS base).




Enter Solaris

        •       Solaris is a UNIX from Sun Microsystems.

        •       Is it different than other UNIX/UNIX-like systems?

              •          Mostly it isn’t different (hence the term UNIX)

              •          It does have extremely strong ABI backward
                         compatibility.

              •          It’s stable and works well on large machines.

        •       Solaris 10 shakes things up a bit:
              •          DTrace

              •          ZFS

              •          Zones


Solaris / ZFS


                    •    ZFS: Zettabyte File System.

                         •   2^64 snapshots, 2^48 files/directory, 2^64 bytes/filesystem,
                             2^78 bytes (256 ZiB) in a pool, 2^64 devices/pool, 2^64 pools/system

                    •    Extremely cheap differential backups.

                         •   I have a 5 TB database, I need a backup!

                    •    No rollback in your database? What is this? MySQL?

                    •    No rollback in your filesystem?

                         •   ZFS has snapshots, rollback, clone and promote.

                         •   OMG! Life altering features.

                    •    Caveat: ZFS is slower than alternatives, by about 10% with tuning.
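
The pool limit quoted on this slide is 2 to the 78th power bytes, given as 256 ZiB. A short check of that conversion, with variable names chosen only for this illustration:

```python
ZIB = 2**70  # bytes in one zebibyte

pool_limit_bytes = 2**78
pool_limit_zib = pool_limit_bytes // ZIB

# 2**64 is the figure quoted for snapshots, devices per pool and pools per system.
limit_64 = 2**64

if __name__ == "__main__":
    print(pool_limit_zib)
    print(limit_64)
```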




Solaris / Zones



                    •    Zones: Virtual Environments.

                    •    Shared kernel.

                    •    Can share filesystems.

                    •    Segregated processes and privileges.

                    •    No big deal for databases, right?


                                          But Wait!


Solaris / ZFS + Zones = Magic Juju
             https://labs.omniti.com/trac/pgsoltools/browser/trunk/pitr_clone/clonedb_startclone.sh

    •       ZFS snapshot, clone, delegate to zone, boot and run.

    •       When done, halt zone, destroy clone.

    •       We get a point-in-time copy of our PostgreSQL database:

          •      read-write,

          •      low disk-space requirements,

          •      NO LOCKS! Welcome back pg_dump,
                 you don’t suck (as much) anymore.

          •      Fast snapshot to usable copy time:

                •        On our 20 GB database: 1 minute.

                •        On our 1.2 TB database: 2 minutes.
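
The snapshot, clone, delegate, boot cycle above can be written down as an ordered command plan. The sketch below is not the linked script: it only builds the command lines and runs nothing, the dataset, snapshot and zone names are hypothetical, and it leaves out everything PostgreSQL-specific (such as making sure the snapshot is a recoverable copy of the database) as well as removing the dataset entry from the zone configuration afterwards.

```python
def clone_plan(dataset, snapshot, clone, zone):
    """Return (start, stop) command lists for one snapshot -> clone -> zone cycle.

    Every argument is an illustrative name; nothing is executed here.
    """
    snap = f"{dataset}@{snapshot}"
    start = [
        ["zfs", "snapshot", snap],                 # point-in-time snapshot
        ["zfs", "clone", snap, clone],             # writable clone of the snapshot
        ["zonecfg", "-z", zone, f"add dataset; set name={clone}; end"],  # delegate
        ["zoneadm", "-z", zone, "boot"],           # boot the zone and run
    ]
    stop = [
        ["zoneadm", "-z", zone, "halt"],           # when done, halt the zone
        ["zfs", "destroy", clone],                 # then destroy the clone
        ["zfs", "destroy", snap],                  # and the snapshot behind it
    ]
    return start, stop


if __name__ == "__main__":
    start, stop = clone_plan("tank/pgdata", "clone1", "tank/pgclone1", "pgclone")
    for argv in start + stop:
        print(" ".join(argv))
```

Because the clone shares unchanged blocks with the snapshot, the copy needs little extra disk space and is usable almost immediately, which matches the timings quoted above.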

ZFS: how I saved my soul.

        •      Database crash. Bad. 1.2 TB of data... busted.
               The reason Robert Treat looks a bit older than he
               should.

        •      xlogs corrupted. catalog indexes corrupted.

        •      Fault? PostgreSQL bug? Bad memory? Who knows?

        •      Trial & error on a 1.2 TB data set is a cruel experience.

             •       In real life, most recovery actions are destructive
                     actions.

             •       PostgreSQL is no different.

        •      Rollback to last checkpoint (ZFS), hack postgres code,
               try, fail, repeat.

Let DTrace open your eyes

       •       DTrace: Dynamic Tracing

       •       Dynamically instrument “stuff” in the system:
             •      system calls (like strace/truss/ktrace).

             •      process/scheduler activity (on/off cpu, semaphores, conditions).

             •      see signals sent and received.

             •      trace kernel functions, networking.

             •      watch I/O down to the disk.

             •      user-space processes, each function... each machine instruction!

             •      Add probes into apps where it makes sense to you.



Can you see what I see?

                    •    There is EXPLAIN... when that isn’t enough...

                    •    There is EXPLAIN ANALYZE... when that isn’t enough.

                    •    There is DTrace.

                         ; dtrace -q -n '
                         postgresql*:::statement-start
                         {
                            self->query = copyinstr(arg0);
                            self->ok=1;
                         }
                         io:::start
                         /self->ok/
                         {
                            @[self->query,
                              args[0]->b_flags & B_READ ? "read" : "write",
                              args[1]->dev_statname] = sum(args[0]->b_bcount);
                         }'
                         dtrace: description 'postgresql*:::statement-start' matched 14 probes
                         ^C

                         select count(1) from c2w_ods.tblusers where zipcode between 10000 and 11000;
                             read sd1 16384
                         select division, sum(amount), avg(amount) from ods.billings where txn_timestamp
                         between '2006-01-01 00:00:00' and '2006-04-01 00:00:00' group by division;
                             read sd2 71647232




OmniTI Labs / pgtreats

                    •    https://labs.omniti.com/labs/pgtreats

                         •   Where we stick our PostgreSQL goodies...

                         •   like pg_file_stress


                         FILENAME/DBOBJECT                             READS                    WRITES
                                                             #   min    avg    max      #   min    avg   max
            alldata1__idx_remove_domain_external             1    12     12     12    398     0      0     0
            slowdata1__pg_rewrite                            1    12     12     12      0     0      0     0
            slowdata1__pg_class_oid_index                    1     0      0      0      0     0      0     0
            slowdata1__pg_attribute                          2     0      0      0      0     0      0     0
            alldata1__mv_users                               0     0      0      0      4     0      0     0
            slowdata1__pg_statistic                          1     0      0      0      0     0      0     0
            slowdata1__pg_index                              1     0      0      0      0     0      0     0
            slowdata1__pg_index_indexrelid_index             1     0      0      0      0     0      0     0
            alldata1__remove_domain_external                 0     0      0      0    502     0      0     0
            alldata1__promo_15_tb_full_2                    19     0      0      0     11     0      0     0
            slowdata1__pg_class_relname_nsp_index            2     0      0      0      0     0      0     0
            alldata1__promo_177intaoltest_tb                 0     0      0      0   1053     0      0     0
            slowdata1__pg_attribute_relid_attnum_index       2     0      0      0      0     0      0     0
            alldata1__promo_15_tb_full_2_pk                  2     0      0      0      0     0      0     0
            alldata1__all_mailable_2                      1403     0      0    423      0     0      0     0
            alldata1__mv_users_pkey                          0     0      0      0      4     0      0     0




Results




                    •    Move ODS Oracle licenses to OLTP

                    •    Run PostgreSQL on ODS

                    •    Save $800k in license costs.

                    •    Spend $100k in labor costs.

                    •    Learn a lot.




Thanks!



                    •    Thank you.

                    •    http://omniti.com/does/postgresql

                    •    We’re hiring, but only if you love:

                         •   lots of data on lots of disks on lots of big boxes

                         •   smart people

                         •   hard problems

                         •   more than one database technology (including PostgreSQL)

                         •   responsibility





 
Flexible compute
Flexible computeFlexible compute
Flexible compute
 

Más de Joshua L. Davis

Ignite: Devops - Why Should You Care
Ignite: Devops - Why Should You CareIgnite: Devops - Why Should You Care
Ignite: Devops - Why Should You CareJoshua L. Davis
 
Using the Joomla CMI in the Army Hosting Environment
Using the Joomla CMI in the Army Hosting EnvironmentUsing the Joomla CMI in the Army Hosting Environment
Using the Joomla CMI in the Army Hosting EnvironmentJoshua L. Davis
 
Open Source Software (OSS/FLOSS) and Security
Open Source Software (OSS/FLOSS) and SecurityOpen Source Software (OSS/FLOSS) and Security
Open Source Software (OSS/FLOSS) and SecurityJoshua L. Davis
 
Importance of WS-Addressing and WS-Reliability in DoD Enterprises
Importance of WS-Addressing and WS-Reliability in DoD EnterprisesImportance of WS-Addressing and WS-Reliability in DoD Enterprises
Importance of WS-Addressing and WS-Reliability in DoD EnterprisesJoshua L. Davis
 
OZONE & OWF: A Community-wide GOTS initiative and its transition to GOSS
OZONE & OWF: A Community-wide GOTS initiative and its transition to GOSSOZONE & OWF: A Community-wide GOTS initiative and its transition to GOSS
OZONE & OWF: A Community-wide GOTS initiative and its transition to GOSSJoshua L. Davis
 
Title TBD: "18 hundred seconds"
Title TBD: "18 hundred seconds"Title TBD: "18 hundred seconds"
Title TBD: "18 hundred seconds"Joshua L. Davis
 
Reaching It's Potential: How to Make Government-Developed OSS A Major Player
Reaching It's Potential: How to Make Government-Developed OSS A Major PlayerReaching It's Potential: How to Make Government-Developed OSS A Major Player
Reaching It's Potential: How to Make Government-Developed OSS A Major PlayerJoshua L. Davis
 
USIP Open Simulation Platform
USIP Open Simulation PlatformUSIP Open Simulation Platform
USIP Open Simulation PlatformJoshua L. Davis
 
CONNECT: An Open Source Platform for Promoting Military Health
CONNECT: An Open Source Platform for Promoting Military HealthCONNECT: An Open Source Platform for Promoting Military Health
CONNECT: An Open Source Platform for Promoting Military HealthJoshua L. Davis
 
CompanyCommand & PlatoonLeader Forums and MilSuite
CompanyCommand & PlatoonLeader Forums and MilSuiteCompanyCommand & PlatoonLeader Forums and MilSuite
CompanyCommand & PlatoonLeader Forums and MilSuiteJoshua L. Davis
 
.org to .com: Going from Project to Product
.org to .com: Going from Project to Product.org to .com: Going from Project to Product
.org to .com: Going from Project to ProductJoshua L. Davis
 
Cyber Challenges in a Hierarchical Culture
Cyber Challenges in a Hierarchical CultureCyber Challenges in a Hierarchical Culture
Cyber Challenges in a Hierarchical CultureJoshua L. Davis
 
An Approach to Building & Maintaining a STIG'D RHEL Server
An Approach to Building & Maintaining a STIG'D RHEL ServerAn Approach to Building & Maintaining a STIG'D RHEL Server
An Approach to Building & Maintaining a STIG'D RHEL ServerJoshua L. Davis
 
How and Why Python is Used in the Model of Real-World Battlefield Scenarios
How and Why Python is Used in the Model of Real-World Battlefield ScenariosHow and Why Python is Used in the Model of Real-World Battlefield Scenarios
How and Why Python is Used in the Model of Real-World Battlefield ScenariosJoshua L. Davis
 

Más de Joshua L. Davis (16)

Ignite: Devops - Why Should You Care
Ignite: Devops - Why Should You CareIgnite: Devops - Why Should You Care
Ignite: Devops - Why Should You Care
 
Using the Joomla CMI in the Army Hosting Environment
Using the Joomla CMI in the Army Hosting EnvironmentUsing the Joomla CMI in the Army Hosting Environment
Using the Joomla CMI in the Army Hosting Environment
 
Open Source Software (OSS/FLOSS) and Security
Open Source Software (OSS/FLOSS) and SecurityOpen Source Software (OSS/FLOSS) and Security
Open Source Software (OSS/FLOSS) and Security
 
SOSCOE Overview
SOSCOE OverviewSOSCOE Overview
SOSCOE Overview
 
milSuite
milSuitemilSuite
milSuite
 
Importance of WS-Addressing and WS-Reliability in DoD Enterprises
Importance of WS-Addressing and WS-Reliability in DoD EnterprisesImportance of WS-Addressing and WS-Reliability in DoD Enterprises
Importance of WS-Addressing and WS-Reliability in DoD Enterprises
 
OZONE & OWF: A Community-wide GOTS initiative and its transition to GOSS
OZONE & OWF: A Community-wide GOTS initiative and its transition to GOSSOZONE & OWF: A Community-wide GOTS initiative and its transition to GOSS
OZONE & OWF: A Community-wide GOTS initiative and its transition to GOSS
 
Title TBD: "18 hundred seconds"
Title TBD: "18 hundred seconds"Title TBD: "18 hundred seconds"
Title TBD: "18 hundred seconds"
 
Reaching It's Potential: How to Make Government-Developed OSS A Major Player
Reaching It's Potential: How to Make Government-Developed OSS A Major PlayerReaching It's Potential: How to Make Government-Developed OSS A Major Player
Reaching It's Potential: How to Make Government-Developed OSS A Major Player
 
USIP Open Simulation Platform
USIP Open Simulation PlatformUSIP Open Simulation Platform
USIP Open Simulation Platform
 
CONNECT: An Open Source Platform for Promoting Military Health
CONNECT: An Open Source Platform for Promoting Military HealthCONNECT: An Open Source Platform for Promoting Military Health
CONNECT: An Open Source Platform for Promoting Military Health
 
CompanyCommand & PlatoonLeader Forums and MilSuite
CompanyCommand & PlatoonLeader Forums and MilSuiteCompanyCommand & PlatoonLeader Forums and MilSuite
CompanyCommand & PlatoonLeader Forums and MilSuite
 
.org to .com: Going from Project to Product
.org to .com: Going from Project to Product.org to .com: Going from Project to Product
.org to .com: Going from Project to Product
 
Cyber Challenges in a Hierarchical Culture
Cyber Challenges in a Hierarchical CultureCyber Challenges in a Hierarchical Culture
Cyber Challenges in a Hierarchical Culture
 
An Approach to Building & Maintaining a STIG'D RHEL Server
An Approach to Building & Maintaining a STIG'D RHEL ServerAn Approach to Building & Maintaining a STIG'D RHEL Server
An Approach to Building & Maintaining a STIG'D RHEL Server
 
How and Why Python is Used in the Model of Real-World Battlefield Scenarios
How and Why Python is Used in the Model of Real-World Battlefield ScenariosHow and Why Python is Used in the Model of Real-World Battlefield Scenarios
How and Why Python is Used in the Model of Real-World Battlefield Scenarios
 

Último

Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 

Último (20)

Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 

Big Bad PostgreSQL: BI on a Budget

  • 1. Big Bad PostgreSQL: A Case Study
      • Moving a “large,” “complicated,” and mission-critical datawarehouse from Oracle to PostgreSQL for cost control.
      • Monday, August 2, 2010
  • 2. Who am I? @postwait on Twitter
      • Author of “Scalable Internet Architectures” (Pearson, ISBN: 067232699X).
      • CEO of OmniTI: we build scalable and secure web applications.
      • I am an engineer: a practitioner of academic computing, IEEE member and Senior ACM member, on the Editorial Board of ACM’s Queue magazine.
      • I work on/with a lot of open source software: Apache, perl, Linux, Solaris, PostgreSQL, Varnish, Spread, Reconnoiter, etc.
      • I have experience: I’ve had the unique opportunity to watch a great many catastrophes. I enjoy immersing myself in the pathology of architecture failures.
  • 3. Overall Architecture
      • [Architecture diagram; only its labels survive.]
      • Components: Oracle 8i OLTP instance (drives the site); OLTP warm backup / warm spare; Oracle 8i datawarehouse; MySQL 4.1 log importer (log import and processing); data exporter (bulk selects / data exports).
      • Storage labels: 0.5 TB Hitachi, 0.25 TB JBOD, 0.75 TB JBOD, 0.5 TB Hitachi, 1.5 TB MTI, 1.2 TB SATA RAID, 1.2 TB IDE RAID.
  • 4. Database Situation
      • The problems:
          • The database is growing.
          • The OLTP and ODS/warehouse are too slow.
          • A lot of application code against the OLTP system.
          • Minimal application code against the ODS system.
      • Oracle:
          • Licensed per processor.
          • Really, really, really expensive on a large scale.
      • PostgreSQL:
          • No licensing costs.
          • Good support for complex queries.
  • 5. Database Choices
      • Must keep Oracle on OLTP
          • Complex, Oracle-specific web application.
          • Need more processors.
      • ODS: Oracle not required.
          • Complex queries from limited sources.
          • Needs more space and power.
      • Result:
          • Move ODS Oracle licenses to OLTP
          • Run PostgreSQL on ODS
  • 6. PostgreSQL gotchas
      • For an OLTP system that does thousands of updates per second, vacuuming is a hassle.
      • No upgrades?!
          • pg_migrator is coming along.
      • Less community experience with large databases.
      • Replication features less evolved
          • though PostgreSQL 9.0 ups the ante
  • 7. PostgreSQL ♥ ODS
      • Mostly inserts.
      • Updates/Deletes controlled, not real-time.
      • pl/perl (leverage DBI/DBD for remote database connectivity).
      • Monster queries.
      • Extensible.
  • 8. Choosing Linux
      • Popular, liked, good community support.
      • Chronic problems:
          • kernel panics
          • filesystems remounting read-only
          • filesystems don’t support snapshots
          • LVM is clunky on enterprise storage
      • 20 outages in 4 months
  • 9. Choosing Solaris 10
      • Switched to Solaris 10
          • No crashes, better system-level tools.
              • prstat, iostat, vmstat, smf, fault-management.
          • ZFS
              • snapshots (persistent), BLI backups.
          • Excellent support for enterprise storage.
          • DTrace.
          • Free (too).
  • 10. Oracle features we need
      • Partitioning
      • Statistics and Aggregations
          • rank over partition, lead, lag, etc.
      • Large selects (100GB)
      • Autonomous transactions
      • Replication from Oracle (to Oracle)
  • 11. Partitioning
      • For large data sets:

        pgods=# select count(1) from ods.ods_tblpick_super;
           count
        ------------
         3247286017
        (1 row)

      • Next biggest tables: billions
      • Allows us to cluster data over specific ranges (by date in our case)
      • Simple, cheap archiving and removal of data.
      • Can put ranges used less often in different tablespaces (slower, cheaper storage)
  • 12. Partitioning PostgreSQL style
      • PostgreSQL doesn’t support partitioning...
      • It supports inheritance... (what’s this?)
          • some crazy object-relational paradigm.
      • We can use it to implement partitioning:
          • One master table with no rows.
          • Child tables that have our partition constraints.
          • Rules on the master table for insert/update/delete.
  • 13. Partitioning PostgreSQL realized
      • Cheaply add new empty partitions
      • Cheaply remove old partitions
      • Migrate less-often-accessed partitions to slower storage
      • Different index strategies per partition
      • PostgreSQL >8.1 supports constraint checking on inherited tables.
          • smarter planning
          • smarter executing
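
The inheritance recipe from slides 12 and 13 can be sketched as a small generator of DDL statements. This is a minimal illustration, not the schema used in the talk: the table `ods.events`, the column `event_date`, and the monthly granularity are assumptions made up for the example. It emits one child table with a CHECK constraint and one insert rule on the master table per month.

```python
from datetime import date


def month_bounds(year, month):
    """Return the first day of the month and the first day of the next month."""
    start = date(year, month, 1)
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start, end


def range_predicate(column, start, end, prefix=""):
    """Half-open date range test, e.g. for a CHECK constraint or a rule's WHERE."""
    col = f"{prefix}{column}"
    return f"{col} >= DATE '{start}' AND {col} < DATE '{end}'"


def partition_ddl(parent, column, year, month):
    """Build DDL for one monthly child partition of `parent` (hypothetical names)."""
    start, end = month_bounds(year, month)
    child = f"{parent}_{year:04d}_{month:02d}"
    # Rule names are not schema-qualified, so drop the schema part.
    rule = child.split(".")[-1] + "_ins"
    return [
        # Child table inherits all columns; the CHECK is the partition constraint.
        f"CREATE TABLE {child} (CHECK ({range_predicate(column, start, end)})) "
        f"INHERITS ({parent});",
        # Rule on the master table redirects matching inserts to the child.
        f"CREATE RULE {rule} AS ON INSERT TO {parent} "
        f"WHERE ({range_predicate(column, start, end, prefix='NEW.')}) "
        f"DO INSTEAD INSERT INTO {child} VALUES (NEW.*);",
    ]


if __name__ == "__main__":
    for statement in partition_ddl("ods.events", "event_date", 2010, 8):
        print(statement)
```

Removing an old partition is then a single `DROP TABLE` on the child, which is what makes archiving cheap. The planner only skips irrelevant children when constraint exclusion is enabled.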
  • 14. RANK OVER PARTITION
      • In Oracle:

        select userid, email from (
          select u.userid, u.email,
                 row_number() over (partition by u.email order by userid desc) as position
          from (...))
        where position = 1

      • In PostgreSQL:

        FOR v_row IN select u.userid, u.email from (...) order by email, userid desc
        LOOP
          IF v_row.email != v_last_email THEN
            RETURN NEXT v_row;
            v_last_email := v_row.email;
            v_rownum := v_rownum + 1;
          END IF;
        END LOOP;

      • With 8.4, we have windowing functions.
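
The loop on slide 14 is a "sort, then emit on key change" scan. The sketch below transcribes that logic into Python to make the equivalence with `row_number() ... where position = 1` easier to see; the function name and the `(userid, email)` tuple layout are assumptions for illustration. One detail worth noting: in PL/pgSQL an uninitialized `v_last_email` is NULL, so the `!=` test only works if the variable is given an initial value or the comparison is written with `IS DISTINCT FROM`. Python's `None` has no such pitfall.

```python
def first_per_email(rows):
    """Keep, for each email, the row with the highest userid.

    `rows` is an iterable of (userid, email) tuples with integer userids.
    """
    # Same ordering as the slide: email ascending, userid descending.
    ordered = sorted(rows, key=lambda r: (r[1], -r[0]))
    result = []
    last_email = None
    for userid, email in ordered:
        # The first row seen for each email is the one with the largest userid.
        if email != last_email:
            result.append((userid, email))
            last_email = email
    return result


if __name__ == "__main__":
    sample = [(1, "a@example.com"), (3, "a@example.com"), (2, "b@example.com")]
    print(first_per_email(sample))
```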
  • 15. Large SELECTs
      • Application code does:

        select u.*, b.browser, m.lastmess
          from ods.ods_users u,
               ods.ods_browsers b,
               ( select userid, min(senddate) as senddate
                   from ods.ods_maillog
                  group by userid ) m,
               ods.ods_maillog l
         where u.userid = b.userid
           and u.userid = m.userid
           and u.userid = l.userid
           and l.senddate = m.senddate;

      • The width of these rows is about 2k
      • 50 million row return set
      • > 100 GB of data
  • 16. The Large SELECT Problem
      • libpq will buffer the entire result in memory.
          • This affects language bindings (DBD::Pg).
          • This is an utterly deficient default behavior.
      • This can be avoided by using cursors
          • Requires the app to be PostgreSQL specific.
          • You open a cursor.
          • Then FETCH the row count you desire.
  • 17. Big SELECTs the Postgres way
      • The previous “big” query becomes (inside a transaction):

        DECLARE bigdump CURSOR FOR
        select u.*, b.browser, m.lastmess
          from ods.ods_users u,
               ods.ods_browsers b,
               ( select userid, min(senddate) as senddate
                   from ods.ods_maillog
                  group by userid ) m,
               ods.ods_maillog l
         where u.userid = b.userid
           and u.userid = m.userid
           and u.userid = l.userid
           and l.senddate = m.senddate;

      • Then, in a loop:

        FETCH FORWARD 10000 FROM bigdump;
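
The DECLARE / FETCH loop from slide 17 might look like this from client code. This is a sketch under stated assumptions: `cur` is any DB-API 2.0 cursor on a PostgreSQL connection (from a driver such as psycopg2, which is not imported here), and the helper name `stream_cursor` is made up for the example. The cursor must be used inside a transaction, because a PostgreSQL cursor declared without WITH HOLD only lives until the transaction ends.

```python
def stream_cursor(cur, query, name="bigdump", batch=10000):
    """Yield rows of `query` in batches through a server-side cursor.

    `query` and `name` are interpolated into SQL as-is, so they must come
    from trusted code, never from user input.
    """
    cur.execute(f"DECLARE {name} CURSOR FOR {query}")
    try:
        while True:
            # Only `batch` rows are held in client memory at any time.
            cur.execute(f"FETCH FORWARD {batch} FROM {name}")
            rows = cur.fetchall()
            if not rows:
                break
            yield from rows
    finally:
        # Runs when the result is exhausted or the generator is closed early.
        cur.execute(f"CLOSE {name}")
```

With rows about 2k wide, a batch of 10000 keeps roughly 20 MB in the client instead of the full 100 GB result.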
  • 18. Autonomous Transactions
      • In Oracle we have over 2000 custom stored procedures.
      • During these procedures, we like to:
          • COMMIT incrementally. Useful for long transactions (update/delete) that need not be atomic -- incremental COMMITs.
          • start a new top-level txn that can COMMIT. Useful for logging progress in a stored procedure so that you know how far you progressed and how long each step took even if it rolls back.
  • 19. PostgreSQL shortcoming
      • PostgreSQL simply does not support autonomous transactions, and to quote core developers, “that would be hard.”
      • When in doubt, use brute force.
      • Use pl/perl to use DBD::Pg to connect to ourselves (a new backend) and execute a new top-level transaction.
  • 20. Replication
      • Cross-vendor database replication isn’t too difficult.
      • Helps a lot when you can do it inside the database.
      • Using dbi-link (based on pl/perl and DBI) we can.
          • We can connect to any remote database.
          • INSERT into local tables directly from remote SELECT statements. [snapshots]
          • LOOP over remote SELECT statements and process them row-by-row. [replaying remote DML logs]
  • 21. Replication (really)
      • Through a combination of snapshotting and DML replay we:
          • replicate into over 2000 tables in PostgreSQL from Oracle
              • snapshot replication of 200
              • DML replay logs for 1800
      • PostgreSQL to Oracle is a bit harder
          • out-of-band export and imports
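
The two replication styles named on slides 20 and 21 differ in what crosses the wire: a snapshot recopies the whole remote table, while DML replay applies an ordered log of changes. The sketch below contrasts them on in-memory data. It is purely illustrative: the log format `(operation, primary_key, row)` and both function names are assumptions, not the format dbi-link or the talk's system used.

```python
def snapshot_copy(remote_rows, key):
    """Snapshot style: rebuild the local table from a full remote SELECT."""
    return {row[key]: dict(row) for row in remote_rows}


def replay_dml(table, log):
    """Replay style: apply an ordered change log to `table` in place.

    `table` maps primary key to a row dict; `log` yields
    (operation, primary_key, row) tuples (a hypothetical format).
    """
    for op, pk, row in log:
        if op == "INSERT":
            table[pk] = dict(row)
        elif op == "UPDATE":
            # Merge only the changed columns into the existing row.
            table.setdefault(pk, {}).update(row)
        elif op == "DELETE":
            table.pop(pk, None)
        else:
            raise ValueError(f"unknown DML operation: {op!r}")
    return table


if __name__ == "__main__":
    local = snapshot_copy([{"userid": 1, "email": "a@example.com"}], key="userid")
    replay_dml(local, [
        ("UPDATE", 1, {"email": "b@example.com"}),
        ("INSERT", 2, {"userid": 2, "email": "c@example.com"}),
        ("DELETE", 2, None),
    ])
    print(local)
```

Snapshots suit small or slowly changing tables; replay keeps large, busy tables current without recopying them, which fits the 200 / 1800 split on the slide.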
  • 22. New Architecture
      • Master: Sun v890 and Hitachi AMS + warm standby running Oracle (1TB)
      • Logs: several custom machines running MySQL instances (2TB each)
      • ODS BI: 2x Sun v40 running PostgreSQL 8.3 (6TB on Sun JBODs on ZFS each)
      • ODS archive: 2x custom machines running PostgreSQL 8.3 (14TB internal storage on ZFS each)
  • 23. PostgreSQL is Lacking
      • pg_dump is too intrusive.
      • Poor system-level instrumentation.
      • Poor methods to determine specific contention.
      • It relies on the operating system’s filesystem cache (which makes PostgreSQL inconsistent across its supported OS base).
  • 24. Enter Solaris
      • Solaris is a UNIX from Sun Microsystems.
      • Is it different than other UNIX/UNIX-like systems?
          • Mostly it isn’t different (hence the term UNIX)
          • It does have extremely strong ABI backward compatibility.
          • It’s stable and works well on large machines.
      • Solaris 10 shakes things up a bit:
          • DTrace
          • ZFS
          • Zones
  • 25. Solaris / ZFS
      • ZFS: Zettabyte Filesystem.
          • 2^64 snapshots, 2^48 files/directory, 2^64 bytes/filesystem, 2^78 bytes (256 ZiB) in a pool, 2^64 devices/pool, 2^64 pools/system
      • Extremely cheap differential backups.
          • I have a 5 TB database, I need a backup!
      • No rollback in your database? What is this? MySQL?
      • No rollback in your filesystem?
          • ZFS has snapshots, rollback, clone and promote.
          • OMG! Life altering features.
      • Caveat: ZFS is slower than alternatives, by about 10% with tuning.
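
The ZFS limits on slide 25 are powers of two, and the "256 ZiB" figure follows directly from 2^78 = 2^8 × 2^70. A short check of that arithmetic, with a helper name (`human_binary`) invented for the example:

```python
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"]


def human_binary(n_bytes):
    """Format a byte count in binary units (each step is a factor of 1024).

    Uses integer division, so the result is exact for powers of two and
    rounded down otherwise.
    """
    value = n_bytes
    index = 0
    while value >= 1024 and index < len(UNITS) - 1:
        value //= 1024
        index += 1
    return f"{value} {UNITS[index]}"


if __name__ == "__main__":
    print("pool limit:      ", human_binary(2 ** 78))
    print("filesystem limit:", human_binary(2 ** 64))
```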
  • 26. Solaris / Zones
      • Zones: Virtual Environments.
          • Shared kernel.
          • Can share filesystems.
          • Segregated processes and privileges.
      • No big deal for databases, right? But wait!
  • 27. Solaris / ZFS + Zones = Magic Juju
      • https://labs.omniti.com/trac/pgsoltools/browser/trunk/pitr_clone/clonedb_startclone.sh
      • ZFS snapshot, clone, delegate to zone, boot and run.
      • When done, halt zone, destroy clone.
      • We get a point-in-time copy of our PostgreSQL database:
          • read-write,
          • low disk-space requirements,
          • NO LOCKS! Welcome back pg_dump, you don’t suck (as much) anymore.
      • Fast snapshot to usable copy time:
          • On our 20 GB database: 1 minute.
          • On our 1.2 TB database: 2 minutes.
  • 28. ZFS: how I saved my soul.
      • Database crash. Bad. 1.2 TB of data... busted. The reason Robert Treat looks a bit older than he should.
      • xlogs corrupted. catalog indexes corrupted.
      • Fault? PostgreSQL bug? Bad memory? Who knows?
      • Trial & error on a 1.2 TB data set is a cruel experience.
          • In real life, most recovery actions are destructive actions.
          • PostgreSQL is no different.
      • Rollback to last checkpoint (ZFS), hack postgres code, try, fail, repeat.
  • 29. Let DTrace open your eyes
      • DTrace: Dynamic Tracing
      • Dynamically instrument “stuff” in the system:
          • system calls (like strace/truss/ktrace).
          • process/scheduler activity (on/off cpu, semaphores, conditions).
          • see signals sent and received.
          • trace kernel functions, networking.
          • watch I/O down to the disk.
          • user-space processes, each function... each machine instruction!
      • Add probes into apps where it makes sense to you.
  • 30. Can you see what I see?
      • There is EXPLAIN... when that isn’t enough...
      • There is EXPLAIN ANALYZE... when that isn’t enough...
      • There is DTrace.

        ; dtrace -q -n '
        postgresql*:::statement-start
        {
          self->query = copyinstr(arg0);
          self->ok = 1;
        }
        io:::start
        /self->ok/
        {
          @[self->query,
            args[0]->b_flags & B_READ ? "read" : "write",
            args[1]->dev_statname] = sum(args[0]->b_bcount);
        }'
        dtrace: description 'postgres*:::statement-start' matched 14 probes
        ^C

        select count(1) from c2w_ods.tblusers where zipcode between 10000 and 11000;
          read sd1 16384
        select division, sum(amount), avg(amount) from ods.billings where txn_timestamp between '2006-01-01 00:00:00' and '2006-04-01 00:00:00' group by division;
          read sd2 71647232
  • 31. OmniTI Labs / pgtreats
      • https://labs.omniti.com/labs/pgtreats
      • Where we stick our PostgreSQL goodies...
      • like pg_file_stress

        FILENAME/DBOBJECT                           READS                WRITES
                                                        #  min  avg  max     #  min  avg  max
        alldata1__idx_remove_domain_external            1   12   12   12   398    0    0    0
        slowdata1__pg_rewrite                           1   12   12   12     0    0    0    0
        slowdata1__pg_class_oid_index                   1    0    0    0     0    0    0    0
        slowdata1__pg_attribute                         2    0    0    0     0    0    0    0
        alldata1__mv_users                              0    0    0    0     4    0    0    0
        slowdata1__pg_statistic                         1    0    0    0     0    0    0    0
        slowdata1__pg_index                             1    0    0    0     0    0    0    0
        slowdata1__pg_index_indexrelid_index            1    0    0    0     0    0    0    0
        alldata1__remove_domain_external                0    0    0    0   502    0    0    0
        alldata1__promo_15_tb_full_2                   19    0    0    0    11    0    0    0
        slowdata1__pg_class_relname_nsp_index           2    0    0    0     0    0    0    0
        alldata1__promo_177intaoltest_tb                0    0    0    0  1053    0    0    0
        slowdata1__pg_attribute_relid_attnum_index      2    0    0    0     0    0    0    0
        alldata1__promo_15_tb_full_2_pk                 2    0    0    0     0    0    0    0
        alldata1__all_mailable_2                     1403    0    0  423     0    0    0    0
        alldata1__mv_users_pkey                         0    0    0    0     4    0    0    0
  • 32. Results
      • Move ODS Oracle licenses to OLTP
      • Run PostgreSQL on ODS
      • Save $800k in license costs.
      • Spend $100k in labor costs.
      • Learn a lot.
  • 33. Thanks!
      • Thank you.
      • http://omniti.com/does/postgresql
      • We’re hiring, but only if you love:
          • lots of data on lots of disks on lots of big boxes
          • smart people
          • hard problems
          • more than one database technology (including PostgreSQL)
          • responsibility