SlideShare una empresa de Scribd logo
1 de 32
Myths of big partitions
Robert Stupp
Solution Architect @ DataStax, C*-Committer
@snazy
Issues with big partitions before 3.6
• Slow reads
• Compaction failures
• Repair failures
• java.lang.OutOfMemoryError
 fail fast  node down
(Lot of org.apache.cassandra.io.sstable.IndexInfo on heap)
© DataStax, All Rights Reserved. 2
SSTable Components
© DataStax, All Rights Reserved. 3
Data
Primary
Index
Summary
Bloom
Filter
Determine whether an SSTable contains a partition
 bloomFilterFpChance
Partition samples
 minIndexInterval / maxIndexInterval
All partition keys + index samples
 column_index_size_in_kb
All the data
Read from an SSTable
© DataStax, All Rights Reserved. 4
Data
Primary
Index
Summary
Bloom
Filter
1. Check whether partition is in SSTable
2. Find “nearest” partition key
3. Return offset in primary index
4. Find partition
5. Find clustering key
6. Return offset in data file
7. Find, read and return data
Before CASSANDRA-11206
Evaluation of SSTable Components
© DataStax, All Rights Reserved. 6
Data
Primary
Index
Summary
Bloom
Filter
Off-Heap, small  fine
Off-Heap, small-ish  fine
On-Heap,
many small objects, nested structure
 problematic
For CQL since #8099  fine
Primary Index File Layout
© DataStax, All Rights Reserved.
Partition Index SamplesPartition Key Partition Index SamplesPartition Key
es Partition Index SamplesPartition Key Partition Index SPartition Key
Samples Partition Index SamplesPartition Key PartitionPartition Key
Index Samples Partition Index SamplesPartition Key PPartition Key
artition Index Samples Partition Index SamplesPartition Key Partition Key
Partition Index Samples
”from”
Summary
Sampling the Primary Index
© DataStax, All Rights Reserved.
Partition in Data file
Partition Key
Offset in SSTable Data File
column_index_size_in_kb (default: 64kB)
First
Key
Last
Key
First
Key
Last
Key
First
Key
Last
Key
First
Key
Last
Key
First
Key
Last
Key
First
Key
Last
Key
First
Key
Last
Key
How it looks on-heap
© DataStax, All Rights Reserved. 10
IndexedEntry
IndexInfo
firstKey, lastKey, offset, width, deletionInfo
patitionKey*, offset, deletionInfo
* = technically not in IndexedEntry
IndexInfo
firstKey, lastKey, offset, width, deletionInfo
IndexInfo
firstKey, lastKey, offset, width, deletionInfo
…
Primary Index
Structure
© DataStax, All Rights Reserved. 11
IndexedEntry extends RowIndexEntry
DeletionTime
ArrayList
IndexInfo  per 64kB
DeletionTime
BufferClustering
Kind
ByteBuffer[]
ByteBuffer
byte[]
…
BufferClustering
Kind
ByteBuffer[]
ByteBuffer
byte[]
…
# of Java objects:
IndexedEntry
4
IndexInfo (per 64kB)
8 + 4 * clust-key-components
(primitive fields omitted)
Primary Index - some numbers
© DataStax, All Rights Reserved. 12
Approximation on one 16 byte clustering-value:
Partition Size Index Size (heap) # of objects
1MB 3kB > 200 objects
4MB 11kB > 800 objects
64MB 180kB > 13,000 objects
512MB 1.4MB > 106,000 objects
2048MB 5.6MB > 424,000 objects
Disclaimer: numbers are examples and not representative
Reads
• Reads IndexedEntry w/ all IndexInfo
• 2GB partition means: 32,768 IndexInfo,
424,000 objects
• Binary search just needs: 15 IndexInfo (max),
O(log n) ~200 objects
© DataStax, All Rights Reserved. 14 Disclaimer: numbers are examples and not representative
SELECT foo, bar
FROM big_partition_table
WHERE ...
Writes – Flushes & Compactions
IndexedEntry constructed with all IndexInfo
as Java object structure on heap first,
then serialized to disk
© DataStax, All Rights Reserved. 15
106,000
objects
106,000
objects
106,000
objects
106,000
objects
Compacting a 2GB partition
© DataStax, All Rights Reserved. 16
SSTable SSTable SSTable SSTable
SSTable
Key
Cache
Remove
106,000 objects
Remove
106,000 objects
Remove
106,000 objects
Remove
106,000 objects
Add
424,000 objects
Construct
424,000
objects
Disclaimer: numbers are examples and not representative
Reads of big partitions – on heap
• Primary index data deserialized
• Object structure added to key cache
• Other entries evicted from key cache
• Also applies to compaction & repair
© DataStax, All Rights Reserved. 17
Flushes with big partitions – on heap
• Primary index data constructed
• Object structure added to key cache
(for compactions)
• Also applies to compactions
© DataStax, All Rights Reserved. 18
Trivia
How many 2GB partitions fit in the key cache?
© DataStax, All Rights Reserved. 19
2GB partition  5.6MB
100MB
 100/6 = 16
Disclaimer: numbers are examples and not representative
Issues w/ big partitions – TL;DR
• Amount of Java objects
• Additions and evictions to/from key cache
© DataStax, All Rights Reserved. 20
Necessities – TL;DR
• Reduce amount of Java objects
• Reduce GC pressure
• No change in sstable format
i.e. files need to be binary compatible
© DataStax, All Rights Reserved. 22
Approach
• Omit (most) IndexInfo on heap
• Read IndexInfo only when needed
• Serialize primary index via byte buffer
• Objects “never” promoted to Java old gen
(hope so ;) )
© DataStax, All Rights Reserved. 23
Small heap (3GB) test
© DataStax, All Rights Reserved. 24
Before #11206 – duration: 3h, lots of GC, exhausted heap
With #11206 – duration: 1h10, few GC, moderate heap usage
java.lang.
OutOfMemoryError
org.apache.cassandra.io.sstable.LargePartitionsTest
Results
• Promising!
• But:
Performance regression w/ some workloads
© DataStax, All Rights Reserved. 25
Better Approach
• Keep IndexInfo objects for “nicely” sized
partitions on-heap
• Controlled via c.yaml
© DataStax, All Rights Reserved. 26
Doesn’t this mean more disk I/O?
• “Hot” data already in buffer cache
• No change for “cold” partitions
© DataStax, All Rights Reserved. 27
#11206 Benefits
• Reduced heap usage
• Reduced GC pressure
• Improved read and write paths
• Key cache can hold “more” entries
• Moved the bad partition size “barrier”
© DataStax, All Rights Reserved. 28
#11206 Metrics
org.apache.cassandra.metrics:
type=Index,scope=RowIndexEntry
• name=IndexInfoCount
Histogram - # of IndexInfo per IndexedEntry
• name=IndexInfoGets
Histogram - # of ”gets” against single IndexedEntry
• name=IndexedEntrySize
Histogram - serialized size of IndexedEntry
© DataStax, All Rights Reserved. 29
„After #11206, what‘s the
recommended partition size?“
• It still depends – sorry
• IMO we moved the “barrier”
Test with your
data model
and workload
© DataStax, All Rights Reserved. 30
Bad usage of large partitions
• CQL SELECT without clustering key
• i.e. materialize a large partition in memory
• Using the same partition key over a long time
• i.e. access many sstables
© DataStax, All Rights Reserved. 31
#9754
• Changes on-disk primary index format
• Efficient on-disk representation
• Optimized for OS page size
• WIP !
• Fix-Version: 4.x
© DataStax, All Rights Reserved. 33
Thank You!
Q & A
Come to the “experts stand”

Más contenido relacionado

La actualidad más candente

Why you should care about data layout in the file system with Cheng Lian and ...
Why you should care about data layout in the file system with Cheng Lian and ...Why you should care about data layout in the file system with Cheng Lian and ...
Why you should care about data layout in the file system with Cheng Lian and ...Databricks
 
RocksDB compaction
RocksDB compactionRocksDB compaction
RocksDB compactionMIJIN AN
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcachedJurriaan Persyn
 
Improving Apache Spark's Reliability with DataSourceV2
Improving Apache Spark's Reliability with DataSourceV2Improving Apache Spark's Reliability with DataSourceV2
Improving Apache Spark's Reliability with DataSourceV2Databricks
 
Advanced Cassandra Operations via JMX (Nate McCall, The Last Pickle) | C* Sum...
Advanced Cassandra Operations via JMX (Nate McCall, The Last Pickle) | C* Sum...Advanced Cassandra Operations via JMX (Nate McCall, The Last Pickle) | C* Sum...
Advanced Cassandra Operations via JMX (Nate McCall, The Last Pickle) | C* Sum...DataStax
 
Jvm tuning for low latency application & Cassandra
Jvm tuning for low latency application & CassandraJvm tuning for low latency application & Cassandra
Jvm tuning for low latency application & CassandraQuentin Ambard
 
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinC* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinDataStax Academy
 
Getting Started with HBase
Getting Started with HBaseGetting Started with HBase
Getting Started with HBaseCarol McDonald
 
How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)DataStax Academy
 
HBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUpon
HBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUponHBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUpon
HBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUponCloudera, Inc.
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to RedisDvir Volk
 
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang WangApache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang WangDatabricks
 
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion RecordsScylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion RecordsScyllaDB
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to CassandraGokhan Atil
 
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark Summit
 
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Databricks
 
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
 Best Practice of Compression/Decompression Codes in Apache Spark with Sophia... Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...Databricks
 
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfDeep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfAltinity Ltd
 
Developing Scylla Applications: Practical Tips
Developing Scylla Applications: Practical TipsDeveloping Scylla Applications: Practical Tips
Developing Scylla Applications: Practical TipsScyllaDB
 

La actualidad más candente (20)

Why you should care about data layout in the file system with Cheng Lian and ...
Why you should care about data layout in the file system with Cheng Lian and ...Why you should care about data layout in the file system with Cheng Lian and ...
Why you should care about data layout in the file system with Cheng Lian and ...
 
RocksDB compaction
RocksDB compactionRocksDB compaction
RocksDB compaction
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcached
 
Improving Apache Spark's Reliability with DataSourceV2
Improving Apache Spark's Reliability with DataSourceV2Improving Apache Spark's Reliability with DataSourceV2
Improving Apache Spark's Reliability with DataSourceV2
 
Advanced Cassandra Operations via JMX (Nate McCall, The Last Pickle) | C* Sum...
Advanced Cassandra Operations via JMX (Nate McCall, The Last Pickle) | C* Sum...Advanced Cassandra Operations via JMX (Nate McCall, The Last Pickle) | C* Sum...
Advanced Cassandra Operations via JMX (Nate McCall, The Last Pickle) | C* Sum...
 
Jvm tuning for low latency application & Cassandra
Jvm tuning for low latency application & CassandraJvm tuning for low latency application & Cassandra
Jvm tuning for low latency application & Cassandra
 
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinC* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
 
Getting Started with HBase
Getting Started with HBaseGetting Started with HBase
Getting Started with HBase
 
How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)
 
MyRocks Deep Dive
MyRocks Deep DiveMyRocks Deep Dive
MyRocks Deep Dive
 
HBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUpon
HBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUponHBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUpon
HBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUpon
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang WangApache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
 
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion RecordsScylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to Cassandra
 
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
 
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
 
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
 Best Practice of Compression/Decompression Codes in Apache Spark with Sophia... Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
 
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfDeep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
 
Developing Scylla Applications: Practical Tips
Developing Scylla Applications: Practical TipsDeveloping Scylla Applications: Practical Tips
Developing Scylla Applications: Practical Tips
 

Similar a Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016

Managing Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using ElasticsearchManaging Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using ElasticsearchJoe Alex
 
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]Speedment, Inc.
 
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]Malin Weiss
 
Managing Data and Operation Distribution In MongoDB
Managing Data and Operation Distribution In MongoDBManaging Data and Operation Distribution In MongoDB
Managing Data and Operation Distribution In MongoDBJason Terpko
 
Managing data and operation distribution in MongoDB
Managing data and operation distribution in MongoDBManaging data and operation distribution in MongoDB
Managing data and operation distribution in MongoDBAntonios Giannopoulos
 
Large partition in Cassandra
Large partition in CassandraLarge partition in Cassandra
Large partition in CassandraShogo Hoshii
 
Fast Big Data Analytics with Spark on Tachyon
Fast Big Data Analytics with Spark on TachyonFast Big Data Analytics with Spark on Tachyon
Fast Big Data Analytics with Spark on TachyonAlluxio, Inc.
 
SASI: Cassandra on the Full Text Search Ride (DuyHai DOAN, DataStax) | C* Sum...
SASI: Cassandra on the Full Text Search Ride (DuyHai DOAN, DataStax) | C* Sum...SASI: Cassandra on the Full Text Search Ride (DuyHai DOAN, DataStax) | C* Sum...
SASI: Cassandra on the Full Text Search Ride (DuyHai DOAN, DataStax) | C* Sum...DataStax
 
Optimising Geospatial Queries with Dynamic File Pruning
Optimising Geospatial Queries with Dynamic File PruningOptimising Geospatial Queries with Dynamic File Pruning
Optimising Geospatial Queries with Dynamic File PruningDatabricks
 
Tachyon_meetup_5-28-2015-IBM
Tachyon_meetup_5-28-2015-IBMTachyon_meetup_5-28-2015-IBM
Tachyon_meetup_5-28-2015-IBMShaoshan Liu
 
NYJavaSIG - Big Data Microservices w/ Speedment
NYJavaSIG - Big Data Microservices w/ SpeedmentNYJavaSIG - Big Data Microservices w/ Speedment
NYJavaSIG - Big Data Microservices w/ SpeedmentSpeedment, Inc.
 
DataStax | DSE Search 5.0 and Beyond (Nick Panahi & Ariel Weisberg) | Cassand...
DataStax | DSE Search 5.0 and Beyond (Nick Panahi & Ariel Weisberg) | Cassand...DataStax | DSE Search 5.0 and Beyond (Nick Panahi & Ariel Weisberg) | Cassand...
DataStax | DSE Search 5.0 and Beyond (Nick Panahi & Ariel Weisberg) | Cassand...DataStax
 
Building a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache SolrBuilding a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache SolrRahul Jain
 
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/DayDatadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/DayC4Media
 
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...Spark Summit
 
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...Chester Chen
 
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast DataDatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast DataHakka Labs
 
High Performance, Scalable MongoDB in a Bare Metal Cloud
High Performance, Scalable MongoDB in a Bare Metal CloudHigh Performance, Scalable MongoDB in a Bare Metal Cloud
High Performance, Scalable MongoDB in a Bare Metal CloudMongoDB
 

Similar a Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016 (20)

Managing Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using ElasticsearchManaging Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using Elasticsearch
 
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]
 
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516]
 
Managing Data and Operation Distribution In MongoDB
Managing Data and Operation Distribution In MongoDBManaging Data and Operation Distribution In MongoDB
Managing Data and Operation Distribution In MongoDB
 
Galaxy Big Data with MariaDB
Galaxy Big Data with MariaDBGalaxy Big Data with MariaDB
Galaxy Big Data with MariaDB
 
Managing data and operation distribution in MongoDB
Managing data and operation distribution in MongoDBManaging data and operation distribution in MongoDB
Managing data and operation distribution in MongoDB
 
Large partition in Cassandra
Large partition in CassandraLarge partition in Cassandra
Large partition in Cassandra
 
Fast Big Data Analytics with Spark on Tachyon
Fast Big Data Analytics with Spark on TachyonFast Big Data Analytics with Spark on Tachyon
Fast Big Data Analytics with Spark on Tachyon
 
Percona FT / TokuDB
Percona FT / TokuDBPercona FT / TokuDB
Percona FT / TokuDB
 
SASI: Cassandra on the Full Text Search Ride (DuyHai DOAN, DataStax) | C* Sum...
SASI: Cassandra on the Full Text Search Ride (DuyHai DOAN, DataStax) | C* Sum...SASI: Cassandra on the Full Text Search Ride (DuyHai DOAN, DataStax) | C* Sum...
SASI: Cassandra on the Full Text Search Ride (DuyHai DOAN, DataStax) | C* Sum...
 
Optimising Geospatial Queries with Dynamic File Pruning
Optimising Geospatial Queries with Dynamic File PruningOptimising Geospatial Queries with Dynamic File Pruning
Optimising Geospatial Queries with Dynamic File Pruning
 
Tachyon_meetup_5-28-2015-IBM
Tachyon_meetup_5-28-2015-IBMTachyon_meetup_5-28-2015-IBM
Tachyon_meetup_5-28-2015-IBM
 
NYJavaSIG - Big Data Microservices w/ Speedment
NYJavaSIG - Big Data Microservices w/ SpeedmentNYJavaSIG - Big Data Microservices w/ Speedment
NYJavaSIG - Big Data Microservices w/ Speedment
 
DataStax | DSE Search 5.0 and Beyond (Nick Panahi & Ariel Weisberg) | Cassand...
DataStax | DSE Search 5.0 and Beyond (Nick Panahi & Ariel Weisberg) | Cassand...DataStax | DSE Search 5.0 and Beyond (Nick Panahi & Ariel Weisberg) | Cassand...
DataStax | DSE Search 5.0 and Beyond (Nick Panahi & Ariel Weisberg) | Cassand...
 
Building a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache SolrBuilding a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache Solr
 
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/DayDatadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
 
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...
 
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
 
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast DataDatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
 
High Performance, Scalable MongoDB in a Bare Metal Cloud
High Performance, Scalable MongoDB in a Bare Metal CloudHigh Performance, Scalable MongoDB in a Bare Metal Cloud
High Performance, Scalable MongoDB in a Bare Metal Cloud
 

Más de DataStax

Is Your Enterprise Ready to Shine This Holiday Season?
Is Your Enterprise Ready to Shine This Holiday Season?Is Your Enterprise Ready to Shine This Holiday Season?
Is Your Enterprise Ready to Shine This Holiday Season?DataStax
 
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...DataStax
 
Running DataStax Enterprise in VMware Cloud and Hybrid Environments
Running DataStax Enterprise in VMware Cloud and Hybrid EnvironmentsRunning DataStax Enterprise in VMware Cloud and Hybrid Environments
Running DataStax Enterprise in VMware Cloud and Hybrid EnvironmentsDataStax
 
Best Practices for Getting to Production with DataStax Enterprise Graph
Best Practices for Getting to Production with DataStax Enterprise GraphBest Practices for Getting to Production with DataStax Enterprise Graph
Best Practices for Getting to Production with DataStax Enterprise GraphDataStax
 
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step JourneyWebinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step JourneyDataStax
 
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar  |  How to Understand Apache Cassandra™ Performance Through Read/Writ...Webinar  |  How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...DataStax
 
Webinar | Better Together: Apache Cassandra and Apache Kafka
Webinar  |  Better Together: Apache Cassandra and Apache KafkaWebinar  |  Better Together: Apache Cassandra and Apache Kafka
Webinar | Better Together: Apache Cassandra and Apache KafkaDataStax
 
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax EnterpriseTop 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax EnterpriseDataStax
 
Introduction to Apache Cassandra™ + What’s New in 4.0
Introduction to Apache Cassandra™ + What’s New in 4.0Introduction to Apache Cassandra™ + What’s New in 4.0
Introduction to Apache Cassandra™ + What’s New in 4.0DataStax
 
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...DataStax
 
Webinar | Aligning GDPR Requirements with Today's Hybrid Cloud Realities
Webinar  |  Aligning GDPR Requirements with Today's Hybrid Cloud RealitiesWebinar  |  Aligning GDPR Requirements with Today's Hybrid Cloud Realities
Webinar | Aligning GDPR Requirements with Today's Hybrid Cloud RealitiesDataStax
 
Designing a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for DummiesDesigning a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for DummiesDataStax
 
How to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
How to Power Innovation with Geo-Distributed Data Management in Hybrid CloudHow to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
How to Power Innovation with Geo-Distributed Data Management in Hybrid CloudDataStax
 
How to Evaluate Cloud Databases for eCommerce
How to Evaluate Cloud Databases for eCommerceHow to Evaluate Cloud Databases for eCommerce
How to Evaluate Cloud Databases for eCommerceDataStax
 
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...DataStax
 
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...DataStax
 
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...DataStax
 
Datastax - The Architect's guide to customer experience (CX)
Datastax - The Architect's guide to customer experience (CX)Datastax - The Architect's guide to customer experience (CX)
Datastax - The Architect's guide to customer experience (CX)DataStax
 
An Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking ApplicationsAn Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking ApplicationsDataStax
 
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design ThinkingBecoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design ThinkingDataStax
 

Más de DataStax (20)

Is Your Enterprise Ready to Shine This Holiday Season?
Is Your Enterprise Ready to Shine This Holiday Season?Is Your Enterprise Ready to Shine This Holiday Season?
Is Your Enterprise Ready to Shine This Holiday Season?
 
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
 
Running DataStax Enterprise in VMware Cloud and Hybrid Environments
Running DataStax Enterprise in VMware Cloud and Hybrid EnvironmentsRunning DataStax Enterprise in VMware Cloud and Hybrid Environments
Running DataStax Enterprise in VMware Cloud and Hybrid Environments
 
Best Practices for Getting to Production with DataStax Enterprise Graph
Best Practices for Getting to Production with DataStax Enterprise GraphBest Practices for Getting to Production with DataStax Enterprise Graph
Best Practices for Getting to Production with DataStax Enterprise Graph
 
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step JourneyWebinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
 
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar  |  How to Understand Apache Cassandra™ Performance Through Read/Writ...Webinar  |  How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...
 
Webinar | Better Together: Apache Cassandra and Apache Kafka
Webinar  |  Better Together: Apache Cassandra and Apache KafkaWebinar  |  Better Together: Apache Cassandra and Apache Kafka
Webinar | Better Together: Apache Cassandra and Apache Kafka
 
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax EnterpriseTop 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise
 
Introduction to Apache Cassandra™ + What’s New in 4.0
Introduction to Apache Cassandra™ + What’s New in 4.0Introduction to Apache Cassandra™ + What’s New in 4.0
Introduction to Apache Cassandra™ + What’s New in 4.0
 
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
 
Webinar | Aligning GDPR Requirements with Today's Hybrid Cloud Realities
Webinar  |  Aligning GDPR Requirements with Today's Hybrid Cloud RealitiesWebinar  |  Aligning GDPR Requirements with Today's Hybrid Cloud Realities
Webinar | Aligning GDPR Requirements with Today's Hybrid Cloud Realities
 
Designing a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for DummiesDesigning a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for Dummies
 
How to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
How to Power Innovation with Geo-Distributed Data Management in Hybrid CloudHow to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
How to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
 
How to Evaluate Cloud Databases for eCommerce
How to Evaluate Cloud Databases for eCommerceHow to Evaluate Cloud Databases for eCommerce
How to Evaluate Cloud Databases for eCommerce
 
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
 
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
 
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
 
Datastax - The Architect's guide to customer experience (CX)
Datastax - The Architect's guide to customer experience (CX)Datastax - The Architect's guide to customer experience (CX)
Datastax - The Architect's guide to customer experience (CX)
 
An Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking ApplicationsAn Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking Applications
 
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design ThinkingBecoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
 

Último

Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️anilsa9823
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 

Último (20)

Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 

Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016

  • 1. Myths of big partitions Robert Stupp Solution Architect @ DataStax, C*-Committer @snazy
  • 2. Issues with big partitions before 3.6 • Slow reads • Compaction failures • Repair failures • java.lang.OutOfMemoryError  fail fast  node down (Lot of org.apache.cassandra.io.sstable.IndexInfo on heap) © DataStax, All Rights Reserved. 2
  • 3. SSTable Components © DataStax, All Rights Reserved. 3 Data Primary Index Summary Bloom Filter Determine whether an SSTable contains a partition  bloomFilterFpChance Partition samples  minIndexInterval / maxIndexInterval All partition keys + index samples  column_index_size_in_kb All the data
  • 4. Read from an SSTable © DataStax, All Rights Reserved. 4 Data Primary Index Summary Bloom Filter 1. Check whether partition is in SSTable 2. Find “nearest” partition key 3. Return offset in primary index 4. Find partition 5. Find clustering key 6. Return offset in data file 7. Find, read and return data
  • 6. Evaluation of SSTable Components © DataStax, All Rights Reserved. 6 Data Primary Index Summary Bloom Filter Off-Heap, small  fine Off-Heap, small-ish  fine On-Heap, many small objects, nested structure  problematic For CQL since #8099  fine
  • 7. Primary Index File Layout © DataStax, All Rights Reserved. Partition Index SamplesPartition Key Partition Index SamplesPartition Key es Partition Index SamplesPartition Key Partition Index SPartition Key Samples Partition Index SamplesPartition Key PartitionPartition Key Index Samples Partition Index SamplesPartition Key PPartition Key artition Index Samples Partition Index SamplesPartition Key Partition Key Partition Index Samples ”from” Summary
  • 8. Sampling the Primary Index © DataStax, All Rights Reserved. Partition in Data file Partition Key Offset in SSTable Data File column_index_size_in_kb (default: 64kB) First Key Last Key First Key Last Key First Key Last Key First Key Last Key First Key Last Key First Key Last Key First Key Last Key
  • 9. How it looks on-heap © DataStax, All Rights Reserved. 10 IndexedEntry IndexInfo firstKey, lastKey, offset, width, deletionInfo patitionKey*, offset, deletionInfo * = technically not in IndexedEntry IndexInfo firstKey, lastKey, offset, width, deletionInfo IndexInfo firstKey, lastKey, offset, width, deletionInfo …
  • 10. Primary Index Structure © DataStax, All Rights Reserved. 11 IndexedEntry extends RowIndexEntry DeletionTime ArrayList IndexInfo  per 64kB DeletionTime BufferClustering Kind ByteBuffer[] ByteBuffer byte[] … BufferClustering Kind ByteBuffer[] ByteBuffer byte[] … # of Java objects: IndexedEntry 4 IndexInfo (per 64kB) 8 + 4 * clust-key-components (primitive fields omitted)
  • 11. Primary Index - some numbers © DataStax, All Rights Reserved. 12 Approximation on one 16 byte clustering-value: Partition Size Index Size (heap) # of objects 1MB 3kB > 200 objects 4MB 11kB > 800 objects 64MB 180kB > 13,000 objects 512MB 1.4MB > 106,000 objects 2048MB 5.6MB > 424,000 objects Disclaimer: numbers are examples and not representative
  • 12. Reads • Reads IndexedEntry w/ all IndexInfo • 2GB partition means: 32,768 IndexInfo, 424,000 objects • Binary search just needs: 15 IndexInfo (max), O(log n) ~200 objects © DataStax, All Rights Reserved. 14 Disclaimer: numbers are examples and not representative SELECT foo, bar FROM big_partition_table WHERE ...
  • 13. Writes – Flushes & Compactions IndexedEntry constructed with all IndexInfo as Java object structure on heap first, then serialized to disk © DataStax, All Rights Reserved. 15
  • 14. 106,000 objects 106,000 objects 106,000 objects 106,000 objects Compacting a 2GB partition © DataStax, All Rights Reserved. 16 SSTable SSTable SSTable SSTable SSTable Key Cache Remove 106,000 objects Remove 106,000 objects Remove 106,000 objects Remove 106,000 objects Add 424,000 objects Construct 424,000 objects Disclaimer: numbers are examples and not representative
  • 15. Reads of big partitions – on heap • Primary index data deserialized • Object structure added to key cache • Other entries evicted from key cache • Also applies to compaction & repair © DataStax, All Rights Reserved. 17
  • 16. Flushes with big partitions – on heap • Primary index data constructed • Object structure added to key cache (for compactions) • Also applies to compactions © DataStax, All Rights Reserved. 18
  • 17. Trivia How many 2GB partitions fit in the key cache? © DataStax, All Rights Reserved. 19 2GB partition  5.6MB 100MB  100/6 = 16 Disclaimer: numbers are examples and not representative
  • 18. Issues w/ big partitions – TL;DR • Amount of Java objects • Additions and evictions to/from key cache © DataStax, All Rights Reserved. 20
  • 19.
  • 20. Necessities – TL;DR • Reduce amount of Java objects • Reduce GC pressure • No change in sstable format i.e. files need to be binary compatible © DataStax, All Rights Reserved. 22
  • 21. Approach • Omit (most) IndexInfo on heap • Read IndexInfo only when needed • Serialize primary index via byte buffer • Objects “never” promoted to Java old gen (hope so ;) ) © DataStax, All Rights Reserved. 23
  • 22. Small heap (3GB) test © DataStax, All Rights Reserved. 24 Before #11206 – duration: 3h, lots of GC, exhausted heap With #11206 – duration: 1h10, few GC, moderate heap usage java.lang. OutOfMemoryError org.apache.cassandra.io.sstable.LargePartitionsTest
  • 23. Results • Promising! • But: Performance regression w/ some workloads © DataStax, All Rights Reserved. 25
  • 24. Better Approach • Keep IndexInfo objects for “nicely” sized partitions on-heap • Controlled via c.yaml © DataStax, All Rights Reserved. 26
  • 25. Doesn’t this mean more disk I/O? • “Hot” data already in buffer cache • No change for “cold” partitions © DataStax, All Rights Reserved. 27
  • 26. #11206 Benefits • Reduced heap usage • Reduced GC pressure • Improved read and write paths • Key cache can hold “more” entries • Moved the bad partition size “barrier” © DataStax, All Rights Reserved. 28
  • 27. #11206 Metrics org.apache.cassandra.metrics: type=Index,scope=RowIndexEntry • name=IndexInfoCount Histogram - # of IndexInfo per IndexedEntry • name=IndexInfoGets Histogram - # of ”gets” against single IndexedEntry • name=IndexedEntrySize Histogram - serialized size of IndexedEntry © DataStax, All Rights Reserved. 29
  • 28. „After #11206, what‘s the recommended partition size?“ • It still depends – sorry • IMO we moved the “barrier” Test with your data model and workload © DataStax, All Rights Reserved. 30
  • 29. Bad usage of large partitions • CQL SELECT without clustering key • i.e. materialize a large partition in memory • Using the same partition key over a long time • i.e. access many sstables © DataStax, All Rights Reserved. 31
  • 30.
  • 31. #9754 • Changes on-disk primary index format • Efficient on-disk representation • Optimized for OS page size • WIP ! • Fix-Version: 4.x © DataStax, All Rights Reserved. 33
  • 32. Thank You! Q & A Come to the “experts stand”

Notas del editor

  1. POLL: C* 2.x C* 3.0.x C* < 3.6 C* >= 3.6 Agenda Why big partitions hurt before 3.6 What CASSANDRA-11206 changed Things to consider
  2. Big means many MB - couple 100s and more - like GB Going to explain why this is an issue w/ C* < 3.6
  3. Compression-Info – Digest – Statistics – TOC
  4. Compression-Info – Digest – Statistics – TOC
  5. Compression-Info – Digest – Statistics – TOC
  6. How a primary index file looks like Read "comes" from Summary Scan until requested Partition is found
  7. Emphasize Java Objects
  8. Issues Many objects Promoted between Java heap generations Evicted key cache entries usually in old gen Huge GC pressure Affects Reads Compaction Repair Flush (usually not an issue)
  9. IndexedEntry including all IndexInfo read as Java object structure IndexedEntry including all IndexInfo placed in key cache DONE for ALL kinds of reads
  10. Global - all sstables - all tables - all keyspaces
  11. Partition sizes: 16k, 1M, 4M, 8M … 512M, 1G, 2G, 4G, 8G with varying total data sizes
  12. Nicely sized partitions (few MB)
  13. Updated on reads