SlideShare una empresa de Scribd logo
1 de 37
Descargar para leer sin conexión
The Right Read Optimization
is Actually Write Optimization
            Leif Walsh
        leif@tokutek.com




                           ®
The Right Read Optimization is Write Optimization

    Situation: I have some data.
     • I want to learn things about the world, so I put it in MySQL,
       and start querying it.
     • To learn more, I go out and get more data.

    New Situation: I have a lot of data.
     • My queries start to slow down, and I can’t run them all.
      ‣ I also happen to still be collecting data.

    Goal: Execute queries in real time against large,
    growing data sets.
     • We need to do some read optimization.

    Let’s see some ways to optimize reads.
                                                     Leif Walsh -- Write Optimization
2                                                                                       ®
The Right Read Optimization is Write Optimization

   Select via Index              Select via Table Scan
select d where 270 ≤ a ≤ 538   select d where 270 ≤ e ≤ 538




      key   value                    key   value

        a   b c d e                    a   b c d e




   An index with the right key lets you examine less data.
The Right Read Optimization is Write Optimization

    Selecting via an index can be slow, if it is coupled
    with point queries.

    select d where 270 ≤ b ≤ 538
                        main table                 index




        key   value                  key   value             key            value
         a    b c d e                 b     a                      c           a




                                                   Leif Walsh -- Write Optimization
4                                                                                     ®
The Right Read Optimization is Write Optimization

 Covering indexes can speed up queries.
   • Key contains all columns necessary to answer query.
select d where 270 ≤ b ≤ 538
                   main table                 covering index




  key   value                   key   value            key     value
    a    b c d e                 bd    a                   c    a




No need to do point queries if you have a covering index.
The Right Read Optimization is Write Optimization

    Indexes do read optimization.
     • Index instead of table scan.
     • Covering indexing instead of regular indexing.
     • See Zardosht’s “Understanding Indexing” talk for more.
      ‣ Avoid post-retrieval sorting in GROUP BY and ORDER BY queries.
      ‣ http://vimeo.com/26454091

    Queries run much faster with the proper indexes.
    The right read optimization is good indexing!
     • But, different queries need different indexes.
     • Typically you need lots of indexes for a single table.

    Optimizing reads with indexes slows down
    insertions.
                                                               Leif Walsh -- Write Optimization
6                                                                                                 ®
The Right Read Optimization is Write Optimization

    The case for write optimization is indexed insertion
    performance.
     • “I'm trying to create indexes on a table with 308 million rows. It took
       ~20 minutes to load the table but 10 days to build indexes on it.”
      ‣ MySQL bug #9544
     • “Select queries were slow until I added an index onto the timestamp
       field... Adding the index really helped our reporting, BUT now the
       inserts are taking forever.”
      ‣ Comment on mysqlperformanceblog.com
     • “They indexed their tables, they indexed them well, / And lo, did the
       queries run quick! / But that wasn’t the last of their troubles, to tell– /
       Their insertions, like molasses, ran thick.”
      ‣ Not Lewis Carroll

    Now, our problem is to optimize writes.
     • We need to understand how writes work in indexes.

                                                    Leif Walsh -- Write Optimization
7                                                                                      ®
B-tree Basics
B-trees are Fast at Sequential Inserts
    Sequential inserts in B-trees have near-optimal data
    locality.




       These B-tree nodes reside in   Insertions are into
                 memory               this leaf node



     • One disk I/O per leaf (which contains many inserts).
     • Sequential disk I/O.
     • Performance is disk-bandwidth limited.


                                                    Leif Walsh -- Write Optimization
9                                                                                      ®
B-Trees Are Slow at Ad Hoc Inserts
     High entropy inserts (e.g., random) in B-trees
     have poor data locality.
                                                          These B-tree nodes reside in
                                                                    memory




      • Most nodes are not in main memory.
      • Most insertions require a random disk I/O.
      • Performance is disk-seek limited.
      • ≤ 100 inserts/sec/disk (≤ 0.05% of disk bandwidth).


                                              Leif Walsh -- Write Optimization
10                                                                                       ®
Good Indexing is Hard With B-trees
     With multiple indexes, B-tree indexes are slow.
      • Secondary indexes are not built sequentially.
       ‣ If they have the same sort order as the primary key, why bother storing them?
      • For read optimization, we would like multiple secondary
        indexes per table.
      • So inserts become multiple random B-tree insertions.
      • That’s slow, so we can’t keep up with incoming data.

     We can’t run queries well without good indexes,
     but we can’t keep good indexes in B-trees.




                                                                 Leif Walsh -- Write Optimization
11                                                                                                  ®
The Right Read Optimization is Write Optimization

     People often don’t use enough indexes.
     They use simplistic schema.
      • Sequential inserts via an autoincrement key.
      • Few indexes, few covering indexes.         key value
                                Autoincrement key
                             (effectively a timestamp)                  t          a b c d e


     Then insertions are fast but queries are slow.
     Adding sophisticated indexes helps queries.
      • B-trees cannot afford to maintain them.

     If we speed up inserts, we can maintain the right
     indexes, and speed up queries.
                                                    Leif Walsh -- Write Optimization
12                                                                                             ®
13
The Right Read Optimization is Write Optimization

     Read Optimization Techniques
     Write Optimization is Necessary for Read
     Optimization
     Write Optimization Techniques
      • Insert Batching
       ‣ OLAP
      • Bureaucratic Insert Batching
       ‣ LSM Trees
      • How the Post Office Does Write Optimization
       ‣ Fractal Trees




                                         Leif Walsh -- Write Optimization
14                                                                          ®
Reformulating The Problem
     Random insertions into a B-tree are slow
     because:
      • Disk seeks are very slow.
      • B-trees incur a disk seek for every insert.

     Here is another way to think about it:
      • B-trees only accomplish one insert per disk seek.

     A simpler problem:
      • Can we get B-trees to do more useful work per disk seek?




                                              Leif Walsh -- Write Optimization
15                                                                               ®
Insert Batching
Insert Batching
     Recall that sequential insertions are faster than
     random insertions.
      • The argument before holds for empty trees.
      • But even for existing trees, you can bunch of a set of
        insertions (say, a day’s worth) and:
       ‣ Sort them
       ‣ Insert them in sorted order
      • Inserting batches in sorted order is faster when you end up
        with multiple insertions in the same leaf.
      • This happens a lot in practice, so batch-sort-and-insert is
        standard practice.



                                             Leif Walsh -- Write Optimization
17                                                                              ®
Insert Batching Example
     Here’s a typical B-tree scenario:
      • 1 billion 160-byte rows = 160GB
      • 16KB page size
      • 16GB main memory available

     That means:
      • Each leaf contains 100 rows.
      • There are 10 million leaves.
      • At most (16GB / 160GB) = 10% of the leaves fit in RAM.
       ‣ So most leaf accesses require a disk seek.




                                                      Leif Walsh -- Write Optimization
18                                                                                       ®
Insert Batching Example
     Back of the envelope analysis:
      • Let’s batch 16GB of data (100 million rows).
       ‣ Then sort them and insert them into the B-tree.
      • That’s 10% of our total data size, and each leaf has 100 rows,
        so each leaf has about 10 row modifications headed for it.
      • Each disk seek accomplishes 10 inserts (instead of just one).
      • So we get about 10x throughput.

     But we had to batch a lot of rows to get there.
      • Since these are stored unindexed on disk, we can’t query
        them.
      • If we had 10 billion rows (1.6TB), we would have had to save
        1 billion inserts just to get 10x insertion speed.

                                                           Leif Walsh -- Write Optimization
19                                                                                            ®
Insert Batching Results
     OLAP is insert batching.
      • The key is to batch a constant fraction of your DB size.
       ‣ Otherwise, the math doesn’t work out right.

     Advantages
      • Get plenty of throughput from a very simple idea.
        ‣ 10x in our example, more if you have bigger leaves.

     Disadvantages
      • Data latency: data arrives for insertion, but isn’t available to
        queries until the batch is inserted.
       ‣ The bigger the DB, the bigger the batches need to be, and the more latency you experience.




                                                                Leif Walsh -- Write Optimization
20                                                                                                    ®
Learning From OLAP’s Disadvantages
     We got latency because:
      • Our data didn’t get indexed right away, it just sat on disk.
      • Without an index, we can’t query that data.

     We could index the buffer.
      • But we need to make sure we don’t lose the speed boost.




                                              Leif Walsh -- Write Optimization
21                                                                               ®
Learning From OLAP’s Disadvantages
     Let’s try it:
      • One main B-tree on disk.
      • Another smaller B-tree, as the buffer.
       ‣ Maximum size is a constant fraction of the main B-tree’s size.
      • Inserts go first to the small B-tree.
      • When the small B-tree is big enough, merge it with the larger
        B-tree.
      • Queries need to be done on both trees, but at least all the
        data can be queried immediately.

     It looks like we solved the latency problem.



                                                                  Leif Walsh -- Write Optimization
22                                                                                                   ®
If At First You Don’t Succeed, Recurse
     We didn’t maintain our speed boost.
      • At first, the smaller B-tree fits in memory, so inserts are fast.
      • When your DB grows, the smaller tree must grow too.
       ‣ Otherwise, you lose the benefit of batching – remember, you need a constant fraction like 10%.
      • Eventually, even the small B-tree is too big for memory.
      • Now we can’t insert into the small B-tree fast enough.

     Try the same trick again:
      • Stick an insert buffer in front of the small B-tree.
      • But now you get latency, so index the new buffer.
      • ...

     This brings us to our next write optimization.
                                                                Leif Walsh -- Write Optimization
23                                                                                                       ®
LSM Trees
LSM Trees
     Generalizing the OLAP technique:
      • Maintain a hierarchy of B-trees: B0, B1, B2, ...
       ‣ Bk is the insert buffer for Bk+1.
      • The maximum size of Bk+1 is twice that of Bk.
       ‣ “Twice” is a simple choice but it’s not fixed.
      • When Bk gets full, merge it down to Bk+1, and empty Bk.
      • These merges can cascade down multiple levels.

     This is called a Log-Structured Merge Tree.




                                                         Leif Walsh -- Write Optimization
25                                                                                          ®
LSM Trees
     Visualizing the LSM Tree
        • B-trees are a bit like arrays, the way we use them here.
          ‣ If we simplify things a tiny bit, all we do is merge B-trees, which is fast.
          ‣ Merging sorted arrays is fast too (mergesort uses this).



                 20         21                      22                              23

        • Bk’s maximum size is 2k.
        • The first few levels* are just in memory.




     * If memory size is M, that’s log2(M) levels



                                                                         Leif Walsh -- Write Optimization
26                                                                                                          ®
LSM Tree Demonstration
LSM Tree Insertion Performance
     LSM Trees use I/O efficiently.
      • Each merge is 50% of the receiving tree’s size.
      • So each disk seek done during a merge accomplishes half as
        many inserts as fit in a page (that’s a lot).
       ‣ In our earlier example, that’s 50 inserts per disk seek.
      • But there are log2(n) - log2(M) levels on disk, so each insert
        needs to get written that many times.
       ‣ That would be ~3 times.
      • Overall, we win because the boost we get from batching our
        inserts well overwhelms the pain of writing data multiple times.
       ‣ Our database would get about a 16x throughput boost.

     LSM Trees have very good insertion performance.


                                                                    Leif Walsh -- Write Optimization
28                                                                                                     ®
LSM Tree Query Performance
     LSM Trees do a full B-tree search once per level.
      • B-tree searches are pretty fast, but they do incur at least one
        disk seek.
      • LSM trees do lots of searches, and each one costs at least
        one disk seek.

     Queries in LSM trees are much slower than in B-
     trees.
      • Asymptotically, they’re a factor of log(n) slower.




                                              Leif Walsh -- Write Optimization
29                                                                               ®
LSM Tree Results
     Advantages
      • Data is available for query immediately.
      • Insertions are very fast.

     Disadvantages
      • Queries take a nasty hit.

     LSM trees are almost what we need.
      • They can keep up with large data sets with multiple
        secondary indexes and high insertion rates.
      • But the indexes you keep aren’t as effective for queries. We
        lost some of our read optimization.


                                             Leif Walsh -- Write Optimization
30                                                                              ®
Fractal   Tree®   Indexes
Getting the Best of Both Worlds
     LSM Trees have one big structure per level.
      • But that means you have to do a global search in each level.

     B-trees have many smaller structures in each
     level.
      • So on each level, you only do a small amount of work.

     A Fractal   Tree ®   Index is the best of both worlds.
      • Topologically, it looks like a B-tree, so searches are fast.
      • But it also buffers like an LSM Tree, so inserts are fast.




                                               Leif Walsh -- Write Optimization
32                                                                                ®
Building a Fractal                    Tree ®              Index
     Start with a B-tree.
     Put an unindexed buffer (of size B) at each node.
      • These buffers are small, so they don’t introduce data latency.

     Insertions go to the root node’s buffer.
     When a buffer gets full, flush it down the tree.
      • Move its elements to the buffers on the child nodes.
      • This may cause some child buffers to flush.

     Searches look at each buffer going to a leaf.
      • But they can ignore all the rest of the data at that depth in
        the tree.

                                              Leif Walsh -- Write Optimization
33                                                                                       ®
Fractal     Tree®        Index Insertion Performance
     Cost to flush a buffer: O(1).
     Cost to flush a buffer, per element: O(1/B).
      • We move B elements when we flush a buffer.

     # of flushes per element: O(log(N)).
      • That’s just the height of the tree – when the element gets to a leaf node, it’s
        done moving.

     Cost to flush an element all the way down:
     O(log(N)) * O(1/B) = O(log(N) / B).
      • (Full cost to insert an element)
      • By comparison, B-tree insertions are O(logB(N)) =
        O(log(N) / log(B)).

     Fractal Tree Indexes have very good insertion performance.
      • As good as LSM Trees.

                                                       Leif Walsh -- Write Optimization
34                                                                                        ®
Fractal          Tree®          Index Query Performance
     Fractal Tree searches are the same as B-tree
     searches.
      • Takes a little more CPU to look at the buffers, but the same
        # of disk seeks.
       ‣ There are some choices to make here, about caching and expected workloads, but they don’t
        affect the asymptotic performance.


     So Fractal Trees have great query performance.




                                                              Leif Walsh -- Write Optimization
35                                                                                                   ®
Fractal               Tree®        Index Results
     Advantages
      • Insertion performance is great.
       ‣   We can keep all the indexes we need.

      • Query performance is great.
       ‣   Our indexes are as effective as they would be with B-trees.

     Disadvantages
      • Introduces more dependence between tree nodes.
       ‣   Concurrency is harder.

      • Insert/search imbalance: inserts are a lot cheaper than searches, only as
        long as inserts don’t require a search first.
       ‣   Watch out for uniqueness checks.


     Other benefits
      • Can afford to increase the block size.
       ‣   Better compression, no fragmentation.

      • Can play tricks with “messages” that update multiple rows.
       ‣   HCAD, HI, HOT (online DDL).


                                                                         Leif Walsh -- Write Optimization
36                                                                                                          ®
Thanks!
Come see our booth and our lightning talk

            leif@tokutek.com




                               ®

Más contenido relacionado

La actualidad más candente

Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...Databricks
 
A Deep Dive into Query Execution Engine of Spark SQL
A Deep Dive into Query Execution Engine of Spark SQLA Deep Dive into Query Execution Engine of Spark SQL
A Deep Dive into Query Execution Engine of Spark SQLDatabricks
 
Using Queryable State for Fun and Profit
Using Queryable State for Fun and ProfitUsing Queryable State for Fun and Profit
Using Queryable State for Fun and ProfitFlink Forward
 
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark Summit
 
Physical Plans in Spark SQL
Physical Plans in Spark SQLPhysical Plans in Spark SQL
Physical Plans in Spark SQLDatabricks
 
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/AvroThe Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/AvroDatabricks
 
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital KediaTuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital KediaDatabricks
 
Building a SIMD Supported Vectorized Native Engine for Spark SQL
Building a SIMD Supported Vectorized Native Engine for Spark SQLBuilding a SIMD Supported Vectorized Native Engine for Spark SQL
Building a SIMD Supported Vectorized Native Engine for Spark SQLDatabricks
 
Spark SQL: Another 16x Faster After Tungsten: Spark Summit East talk by Brad ...
Spark SQL: Another 16x Faster After Tungsten: Spark Summit East talk by Brad ...Spark SQL: Another 16x Faster After Tungsten: Spark Summit East talk by Brad ...
Spark SQL: Another 16x Faster After Tungsten: Spark Summit East talk by Brad ...Spark Summit
 
Parquet - Data I/O - Philadelphia 2013
Parquet - Data I/O - Philadelphia 2013Parquet - Data I/O - Philadelphia 2013
Parquet - Data I/O - Philadelphia 2013larsgeorge
 
Apache Calcite: One Frontend to Rule Them All
Apache Calcite: One Frontend to Rule Them AllApache Calcite: One Frontend to Rule Them All
Apache Calcite: One Frontend to Rule Them AllMichael Mior
 
Rethinking State Management in Cloud-Native Streaming Systems
Rethinking State Management in Cloud-Native Streaming SystemsRethinking State Management in Cloud-Native Streaming Systems
Rethinking State Management in Cloud-Native Streaming SystemsYingjun Wu
 
Optimizing S3 Write-heavy Spark workloads
Optimizing S3 Write-heavy Spark workloadsOptimizing S3 Write-heavy Spark workloads
Optimizing S3 Write-heavy Spark workloadsdatamantra
 
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...Spark Summit
 
Redis + Structured Streaming—A Perfect Combination to Scale-Out Your Continuo...
Redis + Structured Streaming—A Perfect Combination to Scale-Out Your Continuo...Redis + Structured Streaming—A Perfect Combination to Scale-Out Your Continuo...
Redis + Structured Streaming—A Perfect Combination to Scale-Out Your Continuo...Databricks
 
Continuous Data Replication into Cloud Storage with Oracle GoldenGate
Continuous Data Replication into Cloud Storage with Oracle GoldenGateContinuous Data Replication into Cloud Storage with Oracle GoldenGate
Continuous Data Replication into Cloud Storage with Oracle GoldenGateMichael Rainey
 
Fine Tuning and Enhancing Performance of Apache Spark Jobs
Fine Tuning and Enhancing Performance of Apache Spark JobsFine Tuning and Enhancing Performance of Apache Spark Jobs
Fine Tuning and Enhancing Performance of Apache Spark JobsDatabricks
 
More mastering the art of indexing
More mastering the art of indexingMore mastering the art of indexing
More mastering the art of indexingYoshinori Matsunobu
 
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Databricks
 

La actualidad más candente (20)

Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
 
A Deep Dive into Query Execution Engine of Spark SQL
A Deep Dive into Query Execution Engine of Spark SQLA Deep Dive into Query Execution Engine of Spark SQL
A Deep Dive into Query Execution Engine of Spark SQL
 
Using Queryable State for Fun and Profit
Using Queryable State for Fun and ProfitUsing Queryable State for Fun and Profit
Using Queryable State for Fun and Profit
 
Apache Spark Architecture
Apache Spark ArchitectureApache Spark Architecture
Apache Spark Architecture
 
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
 
Physical Plans in Spark SQL
Physical Plans in Spark SQLPhysical Plans in Spark SQL
Physical Plans in Spark SQL
 
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/AvroThe Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
 
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital KediaTuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
 
Building a SIMD Supported Vectorized Native Engine for Spark SQL
Building a SIMD Supported Vectorized Native Engine for Spark SQLBuilding a SIMD Supported Vectorized Native Engine for Spark SQL
Building a SIMD Supported Vectorized Native Engine for Spark SQL
 
Spark SQL: Another 16x Faster After Tungsten: Spark Summit East talk by Brad ...
Spark SQL: Another 16x Faster After Tungsten: Spark Summit East talk by Brad ...Spark SQL: Another 16x Faster After Tungsten: Spark Summit East talk by Brad ...
Spark SQL: Another 16x Faster After Tungsten: Spark Summit East talk by Brad ...
 
Parquet - Data I/O - Philadelphia 2013
Parquet - Data I/O - Philadelphia 2013Parquet - Data I/O - Philadelphia 2013
Parquet - Data I/O - Philadelphia 2013
 
Apache Calcite: One Frontend to Rule Them All
Apache Calcite: One Frontend to Rule Them AllApache Calcite: One Frontend to Rule Them All
Apache Calcite: One Frontend to Rule Them All
 
Rethinking State Management in Cloud-Native Streaming Systems
Rethinking State Management in Cloud-Native Streaming SystemsRethinking State Management in Cloud-Native Streaming Systems
Rethinking State Management in Cloud-Native Streaming Systems
 
Optimizing S3 Write-heavy Spark workloads
Optimizing S3 Write-heavy Spark workloadsOptimizing S3 Write-heavy Spark workloads
Optimizing S3 Write-heavy Spark workloads
 
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
 
Redis + Structured Streaming—A Perfect Combination to Scale-Out Your Continuo...
Redis + Structured Streaming—A Perfect Combination to Scale-Out Your Continuo...Redis + Structured Streaming—A Perfect Combination to Scale-Out Your Continuo...
Redis + Structured Streaming—A Perfect Combination to Scale-Out Your Continuo...
 
Continuous Data Replication into Cloud Storage with Oracle GoldenGate
Continuous Data Replication into Cloud Storage with Oracle GoldenGateContinuous Data Replication into Cloud Storage with Oracle GoldenGate
Continuous Data Replication into Cloud Storage with Oracle GoldenGate
 
Fine Tuning and Enhancing Performance of Apache Spark Jobs
Fine Tuning and Enhancing Performance of Apache Spark JobsFine Tuning and Enhancing Performance of Apache Spark Jobs
Fine Tuning and Enhancing Performance of Apache Spark Jobs
 
More mastering the art of indexing
More mastering the art of indexingMore mastering the art of indexing
More mastering the art of indexing
 
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
 

Destacado

Evaluating ha alternatives my sql tutorial2
Evaluating ha alternatives my sql tutorial2Evaluating ha alternatives my sql tutorial2
Evaluating ha alternatives my sql tutorial2james tong
 
数据库系统设计漫谈
数据库系统设计漫谈数据库系统设计漫谈
数据库系统设计漫谈james tong
 
RocksDB detail
RocksDB detailRocksDB detail
RocksDB detailMIJIN AN
 
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookTech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookThe Hive
 
我们的MySQL
我们的MySQL我们的MySQL
我们的MySQLJinrong Ye
 
程序猿都该知道的MySQL秘籍
程序猿都该知道的MySQL秘籍程序猿都该知道的MySQL秘籍
程序猿都该知道的MySQL秘籍Jinrong Ye
 
MySQL数据库设计、优化
MySQL数据库设计、优化MySQL数据库设计、优化
MySQL数据库设计、优化Jinrong Ye
 

Destacado (9)

Evaluating ha alternatives my sql tutorial2
Evaluating ha alternatives my sql tutorial2Evaluating ha alternatives my sql tutorial2
Evaluating ha alternatives my sql tutorial2
 
数据库系统设计漫谈
数据库系统设计漫谈数据库系统设计漫谈
数据库系统设计漫谈
 
LSMの壁
LSMの壁LSMの壁
LSMの壁
 
RocksDB detail
RocksDB detailRocksDB detail
RocksDB detail
 
LSM Trees
LSM TreesLSM Trees
LSM Trees
 
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookTech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
 
我们的MySQL
我们的MySQL我们的MySQL
我们的MySQL
 
程序猿都该知道的MySQL秘籍
程序猿都该知道的MySQL秘籍程序猿都该知道的MySQL秘籍
程序猿都该知道的MySQL秘籍
 
MySQL数据库设计、优化
MySQL数据库设计、优化MySQL数据库设计、优化
MySQL数据库设计、优化
 

Similar a The right read optimization is actually write optimization

Power of the Log: LSM & Append Only Data Structures
Power of the Log: LSM & Append Only Data StructuresPower of the Log: LSM & Append Only Data Structures
Power of the Log: LSM & Append Only Data Structuresconfluent
 
The Power of the Log
The Power of the LogThe Power of the Log
The Power of the LogBen Stopford
 
What Drove Wordnik Non-Relational?
What Drove Wordnik Non-Relational?What Drove Wordnik Non-Relational?
What Drove Wordnik Non-Relational?DATAVERSITY
 
Mongodb - Scaling write performance
Mongodb - Scaling write performanceMongodb - Scaling write performance
Mongodb - Scaling write performanceDaum DNA
 
Top 10 Application Problems
Top 10 Application ProblemsTop 10 Application Problems
Top 10 Application ProblemsAppDynamics
 
MongoDB: Scaling write performance | Devon 2012
MongoDB: Scaling write performance | Devon 2012MongoDB: Scaling write performance | Devon 2012
MongoDB: Scaling write performance | Devon 2012Daum DNA
 
Information retrieval dynamic indexing
Information retrieval dynamic indexingInformation retrieval dynamic indexing
Information retrieval dynamic indexingNadia Nahar
 
Data Modeling for NoSQL
Data Modeling for NoSQLData Modeling for NoSQL
Data Modeling for NoSQLTony Tam
 
Inside Wordnik's Architecture
Inside Wordnik's ArchitectureInside Wordnik's Architecture
Inside Wordnik's ArchitectureTony Tam
 
What every developer should know about database scalability, PyCon 2010
What every developer should know about database scalability, PyCon 2010What every developer should know about database scalability, PyCon 2010
What every developer should know about database scalability, PyCon 2010jbellis
 
PostgreSQL worst practices, version PGConf.US 2017 by Ilya Kosmodemiansky
PostgreSQL worst practices, version PGConf.US 2017 by Ilya KosmodemianskyPostgreSQL worst practices, version PGConf.US 2017 by Ilya Kosmodemiansky
PostgreSQL worst practices, version PGConf.US 2017 by Ilya KosmodemianskyPostgreSQL-Consulting
 
Make Life Suck Less (Building Scalable Systems)
Make Life Suck Less (Building Scalable Systems)Make Life Suck Less (Building Scalable Systems)
Make Life Suck Less (Building Scalable Systems)guest0f8e278
 
Fractal tree-technology-and-the-art-of-indexing
Fractal tree-technology-and-the-art-of-indexingFractal tree-technology-and-the-art-of-indexing
Fractal tree-technology-and-the-art-of-indexingAtner Yegorov
 
Optimizing Hive Queries
Optimizing Hive QueriesOptimizing Hive Queries
Optimizing Hive QueriesOwen O'Malley
 
Data Storage Tips for Optimal Spark Performance-(Vida Ha, Databricks)
Data Storage Tips for Optimal Spark Performance-(Vida Ha, Databricks)Data Storage Tips for Optimal Spark Performance-(Vida Ha, Databricks)
Data Storage Tips for Optimal Spark Performance-(Vida Ha, Databricks)Spark Summit
 
Hekaton introduction for .Net developers
Hekaton introduction for .Net developersHekaton introduction for .Net developers
Hekaton introduction for .Net developersShy Engelberg
 
Handling Massive Writes
Handling Massive WritesHandling Massive Writes
Handling Massive WritesLiran Zelkha
 
ZFS by PWR 2013
ZFS by PWR 2013ZFS by PWR 2013
ZFS by PWR 2013pwrsoft
 

Similar a The right read optimization is actually write optimization (20)

Power of the Log: LSM & Append Only Data Structures
Power of the Log: LSM & Append Only Data StructuresPower of the Log: LSM & Append Only Data Structures
Power of the Log: LSM & Append Only Data Structures
 
The Power of the Log
The Power of the LogThe Power of the Log
The Power of the Log
 
What Drove Wordnik Non-Relational?
What Drove Wordnik Non-Relational?What Drove Wordnik Non-Relational?
What Drove Wordnik Non-Relational?
 
Mongodb - Scaling write performance
Mongodb - Scaling write performanceMongodb - Scaling write performance
Mongodb - Scaling write performance
 
Top 10 Application Problems
Top 10 Application ProblemsTop 10 Application Problems
Top 10 Application Problems
 
MongoDB: Scaling write performance | Devon 2012
MongoDB: Scaling write performance | Devon 2012MongoDB: Scaling write performance | Devon 2012
MongoDB: Scaling write performance | Devon 2012
 
Information retrieval dynamic indexing
Information retrieval dynamic indexingInformation retrieval dynamic indexing
Information retrieval dynamic indexing
 
Data Modeling for NoSQL
Data Modeling for NoSQLData Modeling for NoSQL
Data Modeling for NoSQL
 
Five steps perform_2013
Five steps perform_2013Five steps perform_2013
Five steps perform_2013
 
Inside Wordnik's Architecture
Inside Wordnik's ArchitectureInside Wordnik's Architecture
Inside Wordnik's Architecture
 
Tunning overview
Tunning overviewTunning overview
Tunning overview
 
What every developer should know about database scalability, PyCon 2010
What every developer should know about database scalability, PyCon 2010What every developer should know about database scalability, PyCon 2010
What every developer should know about database scalability, PyCon 2010
 
PostgreSQL worst practices, version PGConf.US 2017 by Ilya Kosmodemiansky
PostgreSQL worst practices, version PGConf.US 2017 by Ilya KosmodemianskyPostgreSQL worst practices, version PGConf.US 2017 by Ilya Kosmodemiansky
PostgreSQL worst practices, version PGConf.US 2017 by Ilya Kosmodemiansky
 
Make Life Suck Less (Building Scalable Systems)
Make Life Suck Less (Building Scalable Systems)Make Life Suck Less (Building Scalable Systems)
Make Life Suck Less (Building Scalable Systems)
 
Fractal tree-technology-and-the-art-of-indexing
Fractal tree-technology-and-the-art-of-indexingFractal tree-technology-and-the-art-of-indexing
Fractal tree-technology-and-the-art-of-indexing
 
Optimizing Hive Queries
Optimizing Hive QueriesOptimizing Hive Queries
Optimizing Hive Queries
 
Data Storage Tips for Optimal Spark Performance-(Vida Ha, Databricks)
Data Storage Tips for Optimal Spark Performance-(Vida Ha, Databricks)Data Storage Tips for Optimal Spark Performance-(Vida Ha, Databricks)
Data Storage Tips for Optimal Spark Performance-(Vida Ha, Databricks)
 
Hekaton introduction for .Net developers
Hekaton introduction for .Net developersHekaton introduction for .Net developers
Hekaton introduction for .Net developers
 
Handling Massive Writes
Handling Massive WritesHandling Massive Writes
Handling Massive Writes
 
ZFS by PWR 2013
ZFS by PWR 2013ZFS by PWR 2013
ZFS by PWR 2013
 

Más de james tong

Migrating from MySQL to PostgreSQL
Migrating from MySQL to PostgreSQLMigrating from MySQL to PostgreSQL
Migrating from MySQL to PostgreSQLjames tong
 
Oracle 性能优化
Oracle 性能优化Oracle 性能优化
Oracle 性能优化james tong
 
Cap 理论与实践
Cap 理论与实践Cap 理论与实践
Cap 理论与实践james tong
 
Scalable system operations presentation
Scalable system operations presentationScalable system operations presentation
Scalable system operations presentationjames tong
 
Benchmarks, performance, scalability, and capacity what s behind the numbers...
Benchmarks, performance, scalability, and capacity  what s behind the numbers...Benchmarks, performance, scalability, and capacity  what s behind the numbers...
Benchmarks, performance, scalability, and capacity what s behind the numbers...james tong
 
Stability patterns presentation
Stability patterns presentationStability patterns presentation
Stability patterns presentationjames tong
 
My sql ssd-mysqluc-2012
My sql ssd-mysqluc-2012My sql ssd-mysqluc-2012
My sql ssd-mysqluc-2012james tong
 
Troubleshooting mysql-tutorial
Troubleshooting mysql-tutorialTroubleshooting mysql-tutorial
Troubleshooting mysql-tutorialjames tong
 
Understanding performance through_measurement
Understanding performance through_measurementUnderstanding performance through_measurement
Understanding performance through_measurementjames tong
 
我对后端优化的一点想法 (2012)
我对后端优化的一点想法 (2012)我对后端优化的一点想法 (2012)
我对后端优化的一点想法 (2012)james tong
 
设计可扩展的Oracle应用
设计可扩展的Oracle应用设计可扩展的Oracle应用
设计可扩展的Oracle应用james tong
 
我对后端优化的一点想法.pptx
我对后端优化的一点想法.pptx我对后端优化的一点想法.pptx
我对后端优化的一点想法.pptxjames tong
 
Enqueue Lock介绍.ppt
Enqueue Lock介绍.pptEnqueue Lock介绍.ppt
Enqueue Lock介绍.pptjames tong
 
Oracle数据库体系结构简介.ppt
Oracle数据库体系结构简介.pptOracle数据库体系结构简介.ppt
Oracle数据库体系结构简介.pptjames tong
 
Cassandra简介.ppt
Cassandra简介.pptCassandra简介.ppt
Cassandra简介.pptjames tong
 

Más de james tong (15)

Migrating from MySQL to PostgreSQL
Migrating from MySQL to PostgreSQLMigrating from MySQL to PostgreSQL
Migrating from MySQL to PostgreSQL
 
Oracle 性能优化
Oracle 性能优化Oracle 性能优化
Oracle 性能优化
 
Cap 理论与实践
Cap 理论与实践Cap 理论与实践
Cap 理论与实践
 
Scalable system operations presentation
Scalable system operations presentationScalable system operations presentation
Scalable system operations presentation
 
Benchmarks, performance, scalability, and capacity what s behind the numbers...
Benchmarks, performance, scalability, and capacity  what s behind the numbers...Benchmarks, performance, scalability, and capacity  what s behind the numbers...
Benchmarks, performance, scalability, and capacity what s behind the numbers...
 
Stability patterns presentation
Stability patterns presentationStability patterns presentation
Stability patterns presentation
 
My sql ssd-mysqluc-2012
My sql ssd-mysqluc-2012My sql ssd-mysqluc-2012
My sql ssd-mysqluc-2012
 
Troubleshooting mysql-tutorial
Troubleshooting mysql-tutorialTroubleshooting mysql-tutorial
Troubleshooting mysql-tutorial
 
Understanding performance through_measurement
Understanding performance through_measurementUnderstanding performance through_measurement
Understanding performance through_measurement
 
我对后端优化的一点想法 (2012)
我对后端优化的一点想法 (2012)我对后端优化的一点想法 (2012)
我对后端优化的一点想法 (2012)
 
设计可扩展的Oracle应用
设计可扩展的Oracle应用设计可扩展的Oracle应用
设计可扩展的Oracle应用
 
我对后端优化的一点想法.pptx
我对后端优化的一点想法.pptx我对后端优化的一点想法.pptx
我对后端优化的一点想法.pptx
 
Enqueue Lock介绍.ppt
Enqueue Lock介绍.pptEnqueue Lock介绍.ppt
Enqueue Lock介绍.ppt
 
Oracle数据库体系结构简介.ppt
Oracle数据库体系结构简介.pptOracle数据库体系结构简介.ppt
Oracle数据库体系结构简介.ppt
 
Cassandra简介.ppt
Cassandra简介.pptCassandra简介.ppt
Cassandra简介.ppt
 

Último

Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 

Último (20)

Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 

The right read optimization is actually write optimization

  • 1. The Right Read Optimization is Actually Write Optimization Leif Walsh leif@tokutek.com ®
  • 2. The Right Read Optimization is Write Optimization Situation: I have some data. • I want to learn things about the world, so I put it in MySQL, and start querying it. • To learn more, I go out and get more data. New Situation: I have a lot of data. • My queries start to slow down, and I can’t run them all. ‣ I also happen to still be collecting data. Goal: Execute queries in real time against large, growing data sets. • We need to do some read optimization. Let’s see some ways to optimize reads. Leif Walsh -- Write Optimization 2 ®
  • 3. The Right Read Optimization is Write Optimization Select via Index Select via Table Scan select d where 270 ≤ a ≤ 538 select d where 270 ≤ e ≤ 538 key value key value a b c d e a b c d e An index with the right key lets you examine less data.
  • 4. The Right Read Optimization is Write Optimization Selecting via an index can be slow, if it is coupled with point queries. select d where 270 ≤ b ≤ 538 main table index key value key value key value a b c d e b a c a Leif Walsh -- Write Optimization 4 ®
  • 5. The Right Read Optimization is Write Optimization Covering indexes can speed up queries. • Key contains all columns necessary to answer query. select d where 270 ≤ b ≤ 538 main table covering index key value key value key value a b c d e bd a c a No need to do point queries if you have a covering index.
  • 6. The Right Read Optimization is Write Optimization Indexes do read optimization. • Index instead of table scan. • Covering indexing instead of regular indexing. • See Zardosht’s “Understanding Indexing” talk for more. ‣ Avoid post-retrieval sorting in GROUP BY and ORDER BY queries. ‣ http://vimeo.com/26454091 Queries run much faster with the proper indexes. The right read optimization is good indexing! • But, different queries need different indexes. • Typically you need lots of indexes for a single table. Optimizing reads with indexes slows down insertions. Leif Walsh -- Write Optimization 6 ®
  • 7. The Right Read Optimization is Write Optimization The case for write optimization is indexed insertion performance. • “I'm trying to create indexes on a table with 308 million rows. It took ~20 minutes to load the table but 10 days to build indexes on it.” ‣ MySQL bug #9544 • “Select queries were slow until I added an index onto the timestamp field... Adding the index really helped our reporting, BUT now the inserts are taking forever.” ‣ Comment on mysqlperformanceblog.com • “They indexed their tables, they indexed them well, / And lo, did the queries run quick! / But that wasn’t the last of their troubles, to tell– / Their insertions, like molasses, ran thick.” ‣ Not Lewis Carroll Now, our problem is to optimize writes. • We need to understand how writes work in indexes. Leif Walsh -- Write Optimization 7 ®
  • 9. B-trees are Fast at Sequential Inserts Sequential inserts in B-trees have near-optimal data locality. These B-tree nodes reside in Insertions are into memory this leaf node • One disk I/O per leaf (which contains many inserts). • Sequential disk I/O. • Performance is disk-bandwidth limited. Leif Walsh -- Write Optimization 9 ®
  • 10. B-Trees Are Slow at Ad Hoc Inserts High entropy inserts (e.g., random) in B-trees have poor data locality. These B-tree nodes reside in memory • Most nodes are not in main memory. • Most insertions require a random disk I/O. • Performance is disk-seek limited. • ≤ 100 inserts/sec/disk (≤ 0.05% of disk bandwidth). Leif Walsh -- Write Optimization 10 ®
  • 11. Good Indexing is Hard With B-trees With multiple indexes, B-tree indexes are slow. • Secondary indexes are not built sequentially. ‣ If they have the same sort order as the primary key, why bother storing them? • For read optimization, we would like multiple secondary indexes per table. • So inserts become multiple random B-tree insertions. • That’s slow, so we can’t keep up with incoming data. We can’t run queries well without good indexes, but we can’t keep good indexes in B-trees. Leif Walsh -- Write Optimization 11 ®
  • 12. The Right Read Optimization is Write Optimization People often don’t use enough indexes. They use simplistic schema. • Sequential inserts via an autoincrement key. • Few indexes, few covering indexes. key value Autoincrement key (effectively a timestamp) t a b c d e Then insertions are fast but queries are slow. Adding sophisticated indexes helps queries. • B-trees cannot afford to maintain them. If we speed up inserts, we can maintain the right indexes, and speed up queries. Leif Walsh -- Write Optimization 12 ®
  • 13. 13
  • 14. The Right Read Optimization is Write Optimization Read Optimization Techniques Write Optimization is Necessary for Read Optimization Write Optimization Techniques • Insert Batching ‣ OLAP • Bureaucratic Insert Batching ‣ LSM Trees • How the Post Office Does Write Optimization ‣ Fractal Trees Leif Walsh -- Write Optimization 14 ®
  • 15. Reformulating The Problem Random insertions into a B-tree are slow because: • Disk seeks are very slow. • B-trees incur a disk seek for every insert. Here is another way to think about it: • B-trees only accomplish one insert per disk seek. A simpler problem: • Can we get B-trees to do more useful work per disk seek? Leif Walsh -- Write Optimization 15 ®
  • 17. Insert Batching Recall that sequential insertions are faster than random insertions. • The argument before holds for empty trees. • But even for existing trees, you can bunch of a set of insertions (say, a day’s worth) and: ‣ Sort them ‣ Insert them in sorted order • Inserting batches in sorted order is faster when you end up with multiple insertions in the same leaf. • This happens a lot in practice, so batch-sort-and-insert is standard practice. Leif Walsh -- Write Optimization 17 ®
  • 18. Insert Batching Example Here’s a typical B-tree scenario: • 1 billion 160-byte rows = 160GB • 16KB page size • 16GB main memory available That means: • Each leaf contains 100 rows. • There are 10 million leaves. • At most (16GB / 160GB) = 10% of the leaves fit in RAM. ‣ So most leaf accesses require a disk seek. Leif Walsh -- Write Optimization 18 ®
  • 19. Insert Batching Example Back of the envelope analysis: • Let’s batch 16GB of data (100 million rows). ‣ Then sort them and insert them into the B-tree. • That’s 10% of our total data size, and each leaf has 100 rows, so each leaf has about 10 row modifications headed for it. • Each disk seek accomplishes 10 inserts (instead of just one). • So we get about 10x throughput. But we had to batch a lot of rows to get there. • Since these are stored unindexed on disk, we can’t query them. • If we had 10 billion rows (1.6TB), we would have had to save 1 billion inserts just to get 10x insertion speed. Leif Walsh -- Write Optimization 19 ®
  • 20. Insert Batching Results OLAP is insert batching. • The key is to batch a constant fraction of your DB size. ‣ Otherwise, the math doesn’t work out right. Advantages • Get plenty of throughput from a very simple idea. ‣ 10x in our example, more if you have bigger leaves. Disadvantages • Data latency: data arrives for insertion, but isn’t available to queries until the batch is inserted. ‣ The bigger the DB, the bigger the batches need to be, and the more latency you experience. Leif Walsh -- Write Optimization 20 ®
  • 21. Learning From OLAP’s Disadvantages We got latency because: • Our data didn’t get indexed right away, it just sat on disk. • Without an index, we can’t query that data. We could index the buffer. • But we need to make sure we don’t lose the speed boost. Leif Walsh -- Write Optimization 21 ®
  • 22. Learning From OLAP’s Disadvantages Let’s try it: • One main B-tree on disk. • Another smaller B-tree, as the buffer. ‣ Maximum size is a constant fraction of the main B-tree’s size. • Inserts go first to the small B-tree. • When the small B-tree is big enough, merge it with the larger B-tree. • Queries need to be done on both trees, but at least all the data can be queried immediately. It looks like we solved the latency problem. Leif Walsh -- Write Optimization 22 ®
  • 23. If At First You Don’t Succeed, Recurse We didn’t maintain our speed boost. • At first, the smaller B-tree fits in memory, so inserts are fast. • When your DB grows, the smaller tree must grow too. ‣ Otherwise, you lose the benefit of batching – remember, you need a constant fraction like 10%. • Eventually, even the small B-tree is too big for memory. • Now we can’t insert into the small B-tree fast enough. Try the same trick again: • Stick an insert buffer in front of the small B-tree. • But now you get latency, so index the new buffer. • ... This brings us to our next write optimization. Leif Walsh -- Write Optimization 23 ®
  • 25. LSM Trees Generalizing the OLAP technique: • Maintain a hierarchy of B-trees: B0, B1, B2, ... ‣ Bk is the insert buffer for Bk+1. • The maximum size of Bk+1 is twice that of Bk. ‣ “Twice” is a simple choice but it’s not fixed. • When Bk gets full, merge it down to Bk+1, and empty Bk. • These merges can cascade down multiple levels. This is called a Log-Structured Merge Tree. Leif Walsh -- Write Optimization 25 ®
  • 26. LSM Trees Visualizing the LSM Tree • B-trees are a bit like arrays, the way we use them here. ‣ If we simplify things a tiny bit, all we do is merge B-trees, which is fast. ‣ Merging sorted arrays is fast too (mergesort uses this). 20 21 22 23 • Bk’s maximum size is 2k. • The first few levels* are just in memory. * If memory size is M, that’s log2(M) levels Leif Walsh -- Write Optimization 26 ®
  • 28. LSM Tree Insertion Performance LSM Trees use I/O efficiently. • Each merge is 50% of the receiving tree’s size. • So each disk seek done during a merge accomplishes half as many inserts as fit in a page (that’s a lot). ‣ In our earlier example, that’s 50 inserts per disk seek. • But there are log2(n) - log2(M) levels on disk, so each insert needs to get written that many times. ‣ That would be ~3 times. • Overall, we win because the boost we get from batching our inserts well overwhelms the pain of writing data multiple times. ‣ Our database would get about a 16x throughput boost. LSM Trees have very good insertion performance. Leif Walsh -- Write Optimization 28 ®
  • 29. LSM Tree Query Performance LSM Trees do a full B-tree search once per level. • B-tree searches are pretty fast, but they do incur at least one disk seek. • LSM trees do lots of searches, and each one costs at least one disk seek. Queries in LSM trees are much slower than in B- trees. • Asymptotically, they’re a factor of log(n) slower. Leif Walsh -- Write Optimization 29 ®
  • 30. LSM Tree Results Advantages • Data is available for query immediately. • Insertions are very fast. Disadvantages • Queries take a nasty hit. LSM trees are almost what we need. • They can keep up with large data sets with multiple secondary indexes and high insertion rates. • But the indexes you keep aren’t as effective for queries. We lost some of our read optimization. Leif Walsh -- Write Optimization 30 ®
  • 31. Fractal Tree® Indexes
  • 32. Getting the Best of Both Worlds LSM Trees have one big structure per level. • But that means you have to do a global search in each level. B-trees have many smaller structures in each level. • So on each level, you only do a small amount of work. A Fractal Tree ® Index is the best of both worlds. • Topologically, it looks like a B-tree, so searches are fast. • But it also buffers like an LSM Tree, so inserts are fast. Leif Walsh -- Write Optimization 32 ®
  • 33. Building a Fractal Tree ® Index Start with a B-tree. Put an unindexed buffer (of size B) at each node. • These buffers are small, so they don’t introduce data latency. Insertions go to the root node’s buffer. When a buffer gets full, flush it down the tree. • Move its elements to the buffers on the child nodes. • This may cause some child buffers to flush. Searches look at each buffer going to a leaf. • But they can ignore all the rest of the data at that depth in the tree. Leif Walsh -- Write Optimization 33 ®
  • 34. Fractal Tree® Index Insertion Performance Cost to flush a buffer: O(1). Cost to flush a buffer, per element: O(1/B). • We move B elements when we flush a buffer. # of flushes per element: O(log(N)). • That’s just the height of the tree – when the element gets to a leaf node, it’s done moving. Cost to flush an element all the way down: O(log(N)) * O(1/B) = O(log(N) / B). • (Full cost to insert an element) • By comparison, B-tree insertions are O(logB(N)) = O(log(N) / log(B)). Fractal Tree Indexes have very good insertion performance. • As good as LSM Trees. Leif Walsh -- Write Optimization 34 ®
  • 35. Fractal Tree® Index Query Performance Fractal Tree searches are the same as B-tree searches. • Takes a little more CPU to look at the buffers, but the same # of disk seeks. ‣ There are some choices to make here, about caching and expected workloads, but they don’t affect the asymptotic performance. So Fractal Trees have great query performance. Leif Walsh -- Write Optimization 35 ®
  • 36. Fractal Tree® Index Results Advantages • Insertion performance is great. ‣ We can keep all the indexes we need. • Query performance is great. ‣ Our indexes are as effective as they would be with B-trees. Disadvantages • Introduces more dependence between tree nodes. ‣ Concurrency is harder. • Insert/search imbalance: inserts are a lot cheaper than searches, only as long as inserts don’t require a search first. ‣ Watch out for uniqueness checks. Other benefits • Can afford to increase the block size. ‣ Better compression, no fragmentation. • Can play tricks with “messages” that update multiple rows. ‣ HCAD, HI, HOT (online DDL). Leif Walsh -- Write Optimization 36 ®
  • 37. Thanks! Come see our booth and our lightning talk leif@tokutek.com ®