Planning your queries
for maximum performance
VP R&D, ScyllaDB
Shlomi Livne
Shlomi Livne
Shlomi is VP of R&D at ScyllaDB. Prior to ScyllaDB
he led the research and development team at
Convergin, which was acquired by Oracle.
How Scylla executes
your queries
Cluster View
[Diagram: a client connected to a cluster of eight nodes (1-8); the legend marks Coordinator and Replica nodes.]
Coordinator Tasks
1. Prepare the statement
2. Single partition queries
a. Select replicas (using cache heat info) and send query / digest requests,
each requesting a page of results
b. Compare the digests; if there is a mismatch:
i. Request data from selected replicas
ii. Repair the data on replicas
c. Return result
3. Partition scan queries
a. Split the request up based on the ring
b. Send requests for data using ranges - requesting a page of results
c. Merge results
d. Return result
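The single partition path in step 2 can be sketched as below. This is a minimal illustration, not Scylla's implementation: replicas are modeled as plain dictionaries, the MD5 digest and the (timestamp, value) cell layout are assumptions made for the example, and reconciliation simply keeps the cell with the newest timestamp.

```python
import hashlib


def digest(cell):
    """Hash of a replica's answer; stands in for the digest a replica returns."""
    return hashlib.md5(repr(cell).encode()).hexdigest()


def coordinator_read(replicas, key):
    """Read `key` from replicas modeled as dicts of key -> (timestamp, value)."""
    # One replica gets the data request, the others get digest requests.
    data_replica, *digest_replicas = replicas
    data = data_replica.get(key)
    expected = digest(data)
    if all(digest(r.get(key)) == expected for r in digest_replicas):
        return data  # digests agree: return the result
    # Mismatch: request the data from every selected replica ...
    candidates = [r.get(key) for r in replicas if r.get(key) is not None]
    newest = max(candidates, key=lambda cell: cell[0])
    # ... repair the replicas, then return the reconciled result.
    for r in replicas:
        r[key] = newest
    return newest


# Hypothetical replicas: the first one is stale.
replicas = [{"k": (1, "old")}, {"k": (2, "new")}, {"k": (2, "new")}]
result = coordinator_read(replicas, "k")
```

After the call, all three hypothetical replicas hold the newest cell, which mirrors the "repair the data on replicas" step.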
Replica Tasks
1. Receive a data/digest/range request
2. Split the request up according to shards
3. On each shard:
a. Execute the request merging data from memtables + cache/sstables
b. For data request:
i. prepare a result and return it (compute digest if RF > 1)
c. For digest request:
i. compute digest and return it
d. For partition scan request:
i. return the partition range data (do not prepare a result)
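Step 3a, merging data from memtables and cache/sstables, can be illustrated with the cell values used in the read diagram slides (A=8, B=7, C=3 for P8:R1). This is a sketch under assumptions: the per-cell timestamps and the newest-timestamp-wins rule are illustrative and do not appear on the slides.

```python
def merge_row(sources):
    """Merge one row from several sources, keeping the newest cell per column.

    Each source maps column name -> (timestamp, value).
    """
    merged = {}
    for cells in sources:
        for column, (timestamp, value) in cells.items():
            if column not in merged or timestamp > merged[column][0]:
                merged[column] = (timestamp, value)
    return {column: value for column, (_, value) in merged.items()}


# P8:R1 as drawn in the diagram; timestamps 1..3 are made up for the example.
sstable_1 = {"A": (1, 8)}
sstable_2 = {"B": (2, 7)}
memtable = {"C": (3, 3)}

row = merge_row([sstable_1, sstable_2, memtable])
print(row)  # {'A': 8, 'B': 7, 'C': 3}
```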
Replica Shard Read Diagram
[Diagram sequence: a read of P8:R1 on one replica shard. Each sstable is drawn as Bloom Filter, Summary, Index, Compression, and Data; one sstable holds P8:R1:A=8, another holds P8:R1:B=7, and the Memtable holds P8:R1:C=3.]
▪ Row Cache hit: the Row Cache holds P8:R1:A=8,B=7; merged with the Memtable, the result is P8:R1: A=8,B=7,C=3.
▪ Row Cache miss: the sstables whose Bloom Filter matches P8 are read via Summary, Index, Compression, and Data; the Row Cache is populated with P8:R1:A=8,B=7; merged with the Memtable, the result is P8:R1: A=8,B=7,C=3.
Row Cache
▪ Cache stores complete row data
▪ In addition to storing existing rows, the cache stores information
about the completeness of clustering ranges (continuity), so a read that
falls between cached rows is not treated as a miss.
▪ Cache is populated on:
o Queries
o Memtable flush:
• Data is merged - to keep it up to date with new sstables written.
• Data is inserted - in case there is no data for that partition on disk.
Selecting Sstables
▪ Given a partition key (pk), the current set of sstables is reduced so that
sstable X will be included iff:
o min_partition_key(sstable X) <= pk <= max_partition_key(sstable X)
o bloom_filter(sstable X, pk) = True
▪ Scylla 2.0: SStables will be read in parallel
▪ Scylla 2.1:
o The reduced set of sstables is searched newest to oldest until a result can be
constructed and we can prove that older sstables are not relevant.
o SStables read parallelism will grow starting from a single sstable
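The two inclusion conditions above can be sketched as a filter over the current set of sstables. This is an illustration only: the `SSTable` class is hypothetical, integer keys stand in for partition keys, a plain set stands in for the Bloom filter (so it never yields false positives, unlike a real one), and the key range is treated as inclusive.

```python
from dataclasses import dataclass, field


@dataclass
class SSTable:
    """Hypothetical stand-in for an sstable's key range and Bloom filter."""
    name: str
    min_key: int
    max_key: int
    keys: set = field(default_factory=set)

    def bloom_filter_may_contain(self, pk):
        # A real Bloom filter may answer True for an absent key; a set cannot.
        return pk in self.keys


def select_sstables(sstables, pk):
    """Keep only sstables whose key range covers pk and whose filter matches."""
    return [
        s for s in sstables
        if s.min_key <= pk <= s.max_key and s.bloom_filter_may_contain(pk)
    ]


sstables = [
    SSTable("s1", 1, 10, {1, 8, 10}),
    SSTable("s2", 5, 20, {5, 20}),    # range covers 8, filter rejects it
    SSTable("s3", 9, 30, {9, 30}),    # range does not cover 8
]
selected = [s.name for s in select_sstables(sstables, 8)]
```

Only `s1` survives both checks for pk = 8, so it is the only sstable that would be read.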
7 Rules To
Optimize your Queries
Rule #1 - Use Prepared statements
▪ Without a prepared statement, the coordinator needs to pre-process the query on every execution:
o A lot of repetitive work that could be done only once
o Adds overhead to the execution of each query, which translates directly into
lower throughput and higher latency
▪ Without a prepared statement, the driver is not able to send the request to a
coordinator node that holds the data (an additional hop)
▪ tip: compare scylla_query_processor_statements_prepared to the
number of executed scylla_transport_requests_served
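A minimal sketch of the "prepare once, execute many times" pattern behind Rule #1. The wrapper only assumes a session object exposing `prepare(query)` and `execute(statement, params)`; `PreparedStatementCache` and `FakeSession` are hypothetical names introduced for the example, and `FakeSession` is a stub so the block runs without a cluster.

```python
class PreparedStatementCache:
    """Prepare each distinct query string once, then reuse the statement."""

    def __init__(self, session):
        self._session = session
        self._prepared = {}

    def execute(self, query, params=()):
        statement = self._prepared.get(query)
        if statement is None:
            statement = self._session.prepare(query)  # done once per query
            self._prepared[query] = statement
        return self._session.execute(statement, params)


class FakeSession:
    """Stub session that only counts prepare() calls (not a real driver)."""

    def __init__(self):
        self.prepare_calls = 0

    def prepare(self, query):
        self.prepare_calls += 1
        return ("prepared", query)

    def execute(self, statement, params=()):
        return (statement, tuple(params))


session = FakeSession()
cache = PreparedStatementCache(session)
for pk in range(3):
    cache.execute("SELECT * FROM ks.cf WHERE pk = ?", (pk,))
```

Three executions trigger a single `prepare` call in the stub, which is the ratio the tip above suggests checking on a real cluster.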
Sample: single Scylla server, using c-s
Results                      Unprepared   Prepared
op rate                      13037        18704
partition rate               13037        18704
row rate                     13037        18704
latency mean                 1.5          1.1
latency median               1.3          1
latency 95th percentile      2.9          1.6
latency 99th percentile      6.2          2.5
latency 99.9th percentile    12.2         7.1
latency max                  31.1         16.9
Total partitions             100000       100000
Rule #2 - Use Paging
▪ Paging disabled: the coordinator is forced to prepare a single
result that holds all the data and send it back:
o If the coordinator is not able to return a response (cannot allocate enough
memory for the single result), an error will be returned to the client
o tip: compare scylla_transport_unpaged_queries to scylla_cql_reads to
detect whether many of your read queries are unpaged
Rule #3 - Use correct Page Size
▪ Drivers (java, python, gocql) enable paging by default, with a default
page_size of 5000 rows
▪ CQL requires returning at least one result and allows returning fewer
results than the page size
▪ Scylla utilizes this:
o Scylla caps a page at ~1MB of memory - Scylla will return fewer rows than
requested when rows are large
o Do not use the number of returned results as an indication of whether there
are more results; rely on the driver's "has more pages" indication instead
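The last point of Rule #3 can be sketched as a paging loop that stops on the paging state rather than on a short page. The `fetch_page` callable is hypothetical: it is assumed to take the previous paging state (or None) and return the rows of one page together with the next paging state, or None when no pages remain. The fake below deliberately returns a short page in the middle.

```python
def fetch_all_rows(fetch_page):
    """Drain a paged query, stopping only when no paging state is returned."""
    rows = []
    paging_state = None
    while True:
        page, paging_state = fetch_page(paging_state)
        rows.extend(page)
        if paging_state is None:  # the only reliable end-of-results signal
            return rows


# Fake server: requested page size is 3, but the second page holds one row.
PAGES = [[1, 2, 3], [4], [5, 6]]


def fake_fetch_page(paging_state):
    index = 0 if paging_state is None else paging_state
    next_state = index + 1 if index + 1 < len(PAGES) else None
    return PAGES[index], next_state


all_rows = fetch_all_rows(fake_fetch_page)
```

A loop that stopped at the first page shorter than the requested size would return only four of the six rows here.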
Has more pages
Scylla 2.0: does the default page_size make sense?
Test: duration in milliseconds of fetching a single wide partition of 10^8 bytes,
split into rows, using different page sizes

page size   10^6 rows of 100 bytes   10^5 rows of 1000 bytes   10^4 rows of 10^4 bytes   1000 rows of 10^5 bytes
10          timed out                2104.492031               331.087871                173.932543
50          5679.087615              737.148927                202.113023                168.165375
100         4034.920447              573.046783                186.384383                168.951807
500         2663.383039              415.760383                183.894015                173.015039
1000        2451.570687              395.313151                182.976511                168.427519
5000        2285.895679              400.031743                184.942591                169.345023
10000       2281.701375              399.769599                183.369727                169.738239
50000       2273.312767              396.099583                183.107583                170.000383
C* 3.11.0: does the default page_size make sense?
Test: duration in milliseconds of fetching a single wide partition of 10^8 bytes,
split into rows, using different page sizes

page size   10^6 rows of 100 bytes   10^5 rows of 1000 bytes   10^4 rows of 10^4 bytes   1000 rows of 10^5 bytes
10          timed out                4030.726143               903.872511                364.380159
50          12876.51328              1535.115263               419.430399                300.941311
100         8992.587775              1202.716671               405.274623                316.407807
500         6400.507903              907.542527                354.680831                348.651519
1000        6077.546495              874.512383                360.972287                370.409471
5000        5620.367359              791.674879                422.051839                358.612991
10000       5490.343935              793.772031                389.021695                360.447999
50000       5662.310399              913.833983                383.516671                355.467263
tip: consider changing the page size if your rows are large
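A rough back-of-the-envelope model of how page size and row size interact for Scylla, given the ~1MB page cap from Rule #3. It is a sketch under assumptions: the cap is taken as exactly 1,000,000 bytes, per-row overhead is ignored, and the function names are made up for the example. It estimates round trips only, not the measured durations in the tables.

```python
import math

PAGE_BYTES_CAP = 1_000_000  # assumption: the "~1MB" cap, rounded


def effective_rows_per_page(page_size, row_bytes, cap=PAGE_BYTES_CAP):
    """Rows actually returned per page: the request or the cap, whichever is lower."""
    rows_that_fit = max(1, cap // row_bytes)  # CQL returns at least one row
    return min(page_size, rows_that_fit)


def round_trips(total_rows, page_size, row_bytes):
    """Number of pages needed to fetch total_rows."""
    return math.ceil(total_rows / effective_rows_per_page(page_size, row_bytes))


# 10^6 rows of 100 bytes with page size 10: one round trip per 10 rows.
small_rows_small_pages = round_trips(10**6, 10, 100)
# 1000 rows of 10^5 bytes: the cap, not page_size 5000, decides the page.
large_rows_default_pages = round_trips(1000, 5000, 10**5)
```

Under this model the small-row case with page size 10 needs 100,000 round trips, while for 10^5-byte rows every page size of 10 or more yields the same 100 round trips, which is in line with the flat right-hand column of the Scylla table.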
Rule #4 - Beware of Multi Partition CQL IN queries
▪ Multi-partition CQL IN queries force the coordinator node to split
the query up into single-partition queries and aggregate the results.
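The slide only warns about the cost. One commonly used alternative, which is an assumption here and not taken from the slides, is to issue the single-partition queries from the client in parallel, so that no single coordinator has to split and aggregate. In the sketch below `execute_one` is a hypothetical callable that runs one single-partition query and returns its rows; a dictionary lookup stands in for it.

```python
from concurrent.futures import ThreadPoolExecutor


def query_partitions(execute_one, partition_keys, max_workers=8):
    """Run one single-partition query per key in parallel and flatten the rows."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps the results in the order of partition_keys.
        per_partition = pool.map(execute_one, partition_keys)
        return [row for rows in per_partition for row in rows]


# Fake table: partition key -> rows (stands in for a real session call).
TABLE = {1: ["a"], 2: ["b", "c"], 3: []}
rows = query_partitions(lambda pk: TABLE[pk], [1, 2, 3])
```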
Rule #5 - Beware of Single Partition CQL IN queries
Question: Should I split the CQL IN query?
Sample:
▪ CQL: “Select * from ks.cf where pk = X and ck in (Y1, Y2, … Yn)”
Translated to:
▪ CQL:
o “Select * from ks.cf where pk = X and ck = Y1”
o “Select * from ks.cf where pk = X and ck = Y2”
o …
o “Select * from ks.cf where pk = X and ck = Yn”
Question: Should I split the CQL IN query?
Answer: It depends on how wide your rows are
Comments:
▪ Prior to Scylla 2.0, single-partition CQL IN queries performed very badly
in some wide-partition cases.
▪ All reported results are using Scylla 2.0
Rule #6 - There’s a faster way to do full scans
▪ The blog post efficient-full-table-scans-with-scylla outlined an
algorithm for full scans; at a high level:
o split the range up into small sub ranges
o run “enough” sub ranges in parallel
▪ The follow-up blog post How to scan 475 million partitions 12x faster
using efficient full table scan provided a sample implementation of
this algorithm
▪ Is there an even “faster” way?
▪ Yes, there is:
o Using the token ownership of nodes in the ring, one can select ranges of
tokens. Once a “range” has been processed, the next “range” can be
selected based on the ownership in the ring.
o An even more optimized solution would use the “sharding” information and
aim ranges at specific shards on a machine, so that all cores are executing
requests in parallel.
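The basic algorithm from Rule #6 (split the range into small sub ranges, run "enough" of them in parallel) can be sketched as below. This is not the implementation from the blog posts: it assumes the default Murmur3 partitioner with signed 64-bit tokens, uses inclusive integer bounds, ignores ring and shard ownership, and takes a hypothetical `run_range_query(start, end)` callable in place of a real session. The `SCAN_QUERY` string shows what such a callable might execute.

```python
from concurrent.futures import ThreadPoolExecutor

MIN_TOKEN = -(2 ** 63)
MAX_TOKEN = 2 ** 63 - 1
# What run_range_query might execute against a real session (illustrative).
SCAN_QUERY = "SELECT * FROM ks.cf WHERE token(pk) >= ? AND token(pk) <= ?"


def split_token_ring(num_ranges):
    """Split the full token range into contiguous, inclusive (start, end) pairs."""
    total = MAX_TOKEN - MIN_TOKEN + 1
    step, extra = divmod(total, num_ranges)
    ranges = []
    start = MIN_TOKEN
    for i in range(num_ranges):
        size = step + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def parallel_scan(run_range_query, num_ranges, parallelism):
    """Scan every sub range, keeping at most `parallelism` queries in flight."""
    rows = []
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        for chunk in pool.map(lambda r: run_range_query(*r),
                              split_token_ring(num_ranges)):
            rows.extend(chunk)
    return rows


# Fake data: a few tokens, including both ends of the ring.
stored_tokens = [MIN_TOKEN, -5, 0, 7, MAX_TOKEN]


def fake_range_query(start, end):
    return [t for t in stored_tokens if start <= t <= end]


scanned = parallel_scan(fake_range_query, num_ranges=16, parallelism=4)
```

Choosing `num_ranges` much larger than `parallelism` keeps each query small while the pool keeps a steady number of requests in flight; the ownership-aware and shard-aware variants described above would change how the next range is picked, not this overall shape.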
Rule #7 - Use the tools
▪ Probabilistic tracing
▪ Slow query tracing
▪ Wireshark
▪ CQL Trace
▪ Enable client-side tracing
THANK YOU
shlomi@scylladb.com
@ShlomiLivne
Any questions?

Más contenido relacionado

La actualidad más candente

Scylla Summit 2017: Scylla for Mass Simultaneous Sensor Data Processing of ME...
Scylla Summit 2017: Scylla for Mass Simultaneous Sensor Data Processing of ME...Scylla Summit 2017: Scylla for Mass Simultaneous Sensor Data Processing of ME...
Scylla Summit 2017: Scylla for Mass Simultaneous Sensor Data Processing of ME...ScyllaDB
 
Scylla Summit 2017: Snapfish's Journey Towards Scylla
Scylla Summit 2017: Snapfish's Journey Towards ScyllaScylla Summit 2017: Snapfish's Journey Towards Scylla
Scylla Summit 2017: Snapfish's Journey Towards ScyllaScyllaDB
 
Scylla Summit 2017: How to Use Gocql to Execute Queries and What the Driver D...
Scylla Summit 2017: How to Use Gocql to Execute Queries and What the Driver D...Scylla Summit 2017: How to Use Gocql to Execute Queries and What the Driver D...
Scylla Summit 2017: How to Use Gocql to Execute Queries and What the Driver D...ScyllaDB
 
Scylla Summit 2017 Keynote: NextGen NoSQL with CEO Dor Laor
Scylla Summit 2017 Keynote: NextGen NoSQL with CEO Dor LaorScylla Summit 2017 Keynote: NextGen NoSQL with CEO Dor Laor
Scylla Summit 2017 Keynote: NextGen NoSQL with CEO Dor LaorScyllaDB
 
Scylla Summit 2017: A Deep Dive on Heat Weighted Load Balancing
Scylla Summit 2017: A Deep Dive on Heat Weighted Load BalancingScylla Summit 2017: A Deep Dive on Heat Weighted Load Balancing
Scylla Summit 2017: A Deep Dive on Heat Weighted Load BalancingScyllaDB
 
Scylla Summit 2017: Distributed Materialized Views
Scylla Summit 2017: Distributed Materialized ViewsScylla Summit 2017: Distributed Materialized Views
Scylla Summit 2017: Distributed Materialized ViewsScyllaDB
 
Scylla Summit 2017: Running a Soft Real-time Service at One Million QPS
Scylla Summit 2017: Running a Soft Real-time Service at One Million QPSScylla Summit 2017: Running a Soft Real-time Service at One Million QPS
Scylla Summit 2017: Running a Soft Real-time Service at One Million QPSScyllaDB
 
Scylla Summit 2017: Scylla on Kubernetes
Scylla Summit 2017: Scylla on KubernetesScylla Summit 2017: Scylla on Kubernetes
Scylla Summit 2017: Scylla on KubernetesScyllaDB
 
Scylla Summit 2017: Scylla on Samsung NVMe Z-SSDs
Scylla Summit 2017: Scylla on Samsung NVMe Z-SSDsScylla Summit 2017: Scylla on Samsung NVMe Z-SSDs
Scylla Summit 2017: Scylla on Samsung NVMe Z-SSDsScyllaDB
 
If You Care About Performance, Use User Defined Types
If You Care About Performance, Use User Defined TypesIf You Care About Performance, Use User Defined Types
If You Care About Performance, Use User Defined TypesScyllaDB
 
Scylla Summit 2017: How to Ruin Your Workload's Performance by Choosing the W...
Scylla Summit 2017: How to Ruin Your Workload's Performance by Choosing the W...Scylla Summit 2017: How to Ruin Your Workload's Performance by Choosing the W...
Scylla Summit 2017: How to Ruin Your Workload's Performance by Choosing the W...ScyllaDB
 
Scylla Summit 2017: Keynote, Looking back, looking ahead
Scylla Summit 2017: Keynote, Looking back, looking aheadScylla Summit 2017: Keynote, Looking back, looking ahead
Scylla Summit 2017: Keynote, Looking back, looking aheadScyllaDB
 
Scylla Summit 2017: How Baidu Runs Scylla on a Petabyte-Level Big Data Platform
Scylla Summit 2017: How Baidu Runs Scylla on a Petabyte-Level Big Data PlatformScylla Summit 2017: How Baidu Runs Scylla on a Petabyte-Level Big Data Platform
Scylla Summit 2017: How Baidu Runs Scylla on a Petabyte-Level Big Data PlatformScyllaDB
 
Scylla Summit 2017: How to Optimize and Reduce Inter-DC Network Traffic and S...
Scylla Summit 2017: How to Optimize and Reduce Inter-DC Network Traffic and S...Scylla Summit 2017: How to Optimize and Reduce Inter-DC Network Traffic and S...
Scylla Summit 2017: How to Optimize and Reduce Inter-DC Network Traffic and S...ScyllaDB
 
Scylla Summit 2017: Scylla's Open Source Monitoring Solution
Scylla Summit 2017: Scylla's Open Source Monitoring SolutionScylla Summit 2017: Scylla's Open Source Monitoring Solution
Scylla Summit 2017: Scylla's Open Source Monitoring SolutionScyllaDB
 
Scylla Summit 2017: Streaming ETL in Kafka for Everyone with KSQL
Scylla Summit 2017: Streaming ETL in Kafka for Everyone with KSQLScylla Summit 2017: Streaming ETL in Kafka for Everyone with KSQL
Scylla Summit 2017: Streaming ETL in Kafka for Everyone with KSQLScyllaDB
 
Scylla Summit 2017: A Toolbox for Understanding Scylla in the Field
Scylla Summit 2017: A Toolbox for Understanding Scylla in the FieldScylla Summit 2017: A Toolbox for Understanding Scylla in the Field
Scylla Summit 2017: A Toolbox for Understanding Scylla in the FieldScyllaDB
 
Scylla Summit 2017: Performance Evaluation of Scylla as a Database Backend fo...
Scylla Summit 2017: Performance Evaluation of Scylla as a Database Backend fo...Scylla Summit 2017: Performance Evaluation of Scylla as a Database Backend fo...
Scylla Summit 2017: Performance Evaluation of Scylla as a Database Backend fo...ScyllaDB
 
Scylla Summit 2017: Saving Thousands by Running Scylla on EC2 Spot Instances
Scylla Summit 2017: Saving Thousands by Running Scylla on EC2 Spot InstancesScylla Summit 2017: Saving Thousands by Running Scylla on EC2 Spot Instances
Scylla Summit 2017: Saving Thousands by Running Scylla on EC2 Spot InstancesScyllaDB
 
Scylla Summit 2017: Welcome and Keynote - Nextgen NoSQL
Scylla Summit 2017: Welcome and Keynote - Nextgen NoSQLScylla Summit 2017: Welcome and Keynote - Nextgen NoSQL
Scylla Summit 2017: Welcome and Keynote - Nextgen NoSQLScyllaDB
 

La actualidad más candente (20)

Scylla Summit 2017: Scylla for Mass Simultaneous Sensor Data Processing of ME...
Scylla Summit 2017: Scylla for Mass Simultaneous Sensor Data Processing of ME...Scylla Summit 2017: Scylla for Mass Simultaneous Sensor Data Processing of ME...
Scylla Summit 2017: Scylla for Mass Simultaneous Sensor Data Processing of ME...
 
Scylla Summit 2017: Snapfish's Journey Towards Scylla
Scylla Summit 2017: Snapfish's Journey Towards ScyllaScylla Summit 2017: Snapfish's Journey Towards Scylla
Scylla Summit 2017: Snapfish's Journey Towards Scylla
 
Scylla Summit 2017: How to Use Gocql to Execute Queries and What the Driver D...
Scylla Summit 2017: How to Use Gocql to Execute Queries and What the Driver D...Scylla Summit 2017: How to Use Gocql to Execute Queries and What the Driver D...
Scylla Summit 2017: How to Use Gocql to Execute Queries and What the Driver D...
 
Scylla Summit 2017 Keynote: NextGen NoSQL with CEO Dor Laor
Scylla Summit 2017 Keynote: NextGen NoSQL with CEO Dor LaorScylla Summit 2017 Keynote: NextGen NoSQL with CEO Dor Laor
Scylla Summit 2017 Keynote: NextGen NoSQL with CEO Dor Laor
 
Scylla Summit 2017: A Deep Dive on Heat Weighted Load Balancing
Scylla Summit 2017: A Deep Dive on Heat Weighted Load BalancingScylla Summit 2017: A Deep Dive on Heat Weighted Load Balancing
Scylla Summit 2017: A Deep Dive on Heat Weighted Load Balancing
 
Scylla Summit 2017: Distributed Materialized Views
Scylla Summit 2017: Distributed Materialized ViewsScylla Summit 2017: Distributed Materialized Views
Scylla Summit 2017: Distributed Materialized Views
 
Scylla Summit 2017: Running a Soft Real-time Service at One Million QPS
Scylla Summit 2017: Running a Soft Real-time Service at One Million QPSScylla Summit 2017: Running a Soft Real-time Service at One Million QPS
Scylla Summit 2017: Running a Soft Real-time Service at One Million QPS
 
Scylla Summit 2017: Scylla on Kubernetes
Scylla Summit 2017: Scylla on KubernetesScylla Summit 2017: Scylla on Kubernetes
Scylla Summit 2017: Scylla on Kubernetes
 
Scylla Summit 2017: Scylla on Samsung NVMe Z-SSDs
Scylla Summit 2017: Scylla on Samsung NVMe Z-SSDsScylla Summit 2017: Scylla on Samsung NVMe Z-SSDs
Scylla Summit 2017: Scylla on Samsung NVMe Z-SSDs
 
If You Care About Performance, Use User Defined Types
If You Care About Performance, Use User Defined TypesIf You Care About Performance, Use User Defined Types
If You Care About Performance, Use User Defined Types
 
Scylla Summit 2017: How to Ruin Your Workload's Performance by Choosing the W...
Scylla Summit 2017: How to Ruin Your Workload's Performance by Choosing the W...Scylla Summit 2017: How to Ruin Your Workload's Performance by Choosing the W...
Scylla Summit 2017: How to Ruin Your Workload's Performance by Choosing the W...
 
Scylla Summit 2017: Keynote, Looking back, looking ahead
Scylla Summit 2017: Keynote, Looking back, looking aheadScylla Summit 2017: Keynote, Looking back, looking ahead
Scylla Summit 2017: Keynote, Looking back, looking ahead
 
Scylla Summit 2017: How Baidu Runs Scylla on a Petabyte-Level Big Data Platform
Scylla Summit 2017: How Baidu Runs Scylla on a Petabyte-Level Big Data PlatformScylla Summit 2017: How Baidu Runs Scylla on a Petabyte-Level Big Data Platform
Scylla Summit 2017: How Baidu Runs Scylla on a Petabyte-Level Big Data Platform
 
Scylla Summit 2017: How to Optimize and Reduce Inter-DC Network Traffic and S...
Scylla Summit 2017: How to Optimize and Reduce Inter-DC Network Traffic and S...Scylla Summit 2017: How to Optimize and Reduce Inter-DC Network Traffic and S...
Scylla Summit 2017: How to Optimize and Reduce Inter-DC Network Traffic and S...
 
Scylla Summit 2017: Scylla's Open Source Monitoring Solution
Scylla Summit 2017: Scylla's Open Source Monitoring SolutionScylla Summit 2017: Scylla's Open Source Monitoring Solution
Scylla Summit 2017: Scylla's Open Source Monitoring Solution
 
Scylla Summit 2017: Streaming ETL in Kafka for Everyone with KSQL
Scylla Summit 2017: Streaming ETL in Kafka for Everyone with KSQLScylla Summit 2017: Streaming ETL in Kafka for Everyone with KSQL
Scylla Summit 2017: Streaming ETL in Kafka for Everyone with KSQL
 
Scylla Summit 2017: A Toolbox for Understanding Scylla in the Field
Scylla Summit 2017: A Toolbox for Understanding Scylla in the FieldScylla Summit 2017: A Toolbox for Understanding Scylla in the Field
Scylla Summit 2017: A Toolbox for Understanding Scylla in the Field
 
Scylla Summit 2017: Performance Evaluation of Scylla as a Database Backend fo...
Scylla Summit 2017: Performance Evaluation of Scylla as a Database Backend fo...Scylla Summit 2017: Performance Evaluation of Scylla as a Database Backend fo...
Scylla Summit 2017: Performance Evaluation of Scylla as a Database Backend fo...
 
Scylla Summit 2017: Saving Thousands by Running Scylla on EC2 Spot Instances
Scylla Summit 2017: Saving Thousands by Running Scylla on EC2 Spot InstancesScylla Summit 2017: Saving Thousands by Running Scylla on EC2 Spot Instances
Scylla Summit 2017: Saving Thousands by Running Scylla on EC2 Spot Instances
 
Scylla Summit 2017: Welcome and Keynote - Nextgen NoSQL
Scylla Summit 2017: Welcome and Keynote - Nextgen NoSQLScylla Summit 2017: Welcome and Keynote - Nextgen NoSQL
Scylla Summit 2017: Welcome and Keynote - Nextgen NoSQL
 

Similar a Scylla Summit 2017: Planning Your Queries for Maximum Performance

MySQL Optimizer: What's New in 8.0
MySQL Optimizer: What's New in 8.0MySQL Optimizer: What's New in 8.0
MySQL Optimizer: What's New in 8.0Manyi Lu
 
How to use Impala query plan and profile to fix performance issues
How to use Impala query plan and profile to fix performance issuesHow to use Impala query plan and profile to fix performance issues
How to use Impala query plan and profile to fix performance issuesCloudera, Inc.
 
Best Practices for Migrating Legacy Data Warehouses into Amazon Redshift
Best Practices for Migrating Legacy Data Warehouses into Amazon RedshiftBest Practices for Migrating Legacy Data Warehouses into Amazon Redshift
Best Practices for Migrating Legacy Data Warehouses into Amazon RedshiftAmazon Web Services
 
CS 542 -- Query Execution
CS 542 -- Query ExecutionCS 542 -- Query Execution
CS 542 -- Query ExecutionJ Singh
 
Accurately and Reliably Extracting Data from the Web:
Accurately and Reliably Extracting Data from the Web: Accurately and Reliably Extracting Data from the Web:
Accurately and Reliably Extracting Data from the Web: butest
 
C:\nppdf32 log\debuglog
C:\nppdf32 log\debuglogC:\nppdf32 log\debuglog
C:\nppdf32 log\debuglogpadblo
 
Oracle tips and tricks
Oracle tips and tricksOracle tips and tricks
Oracle tips and tricksYanli Liu
 
DB_lecturs8 27 11.pptx
DB_lecturs8 27 11.pptxDB_lecturs8 27 11.pptx
DB_lecturs8 27 11.pptxNermeenKamel7
 
MySQL Optimizer Overview
MySQL Optimizer OverviewMySQL Optimizer Overview
MySQL Optimizer OverviewOlav Sandstå
 
OrientDB vs Neo4j - Comparison of query/speed/functionality
OrientDB vs Neo4j - Comparison of query/speed/functionalityOrientDB vs Neo4j - Comparison of query/speed/functionality
OrientDB vs Neo4j - Comparison of query/speed/functionalityCurtis Mosters
 
SQL Optimization With Trace Data And Dbms Xplan V6
SQL Optimization With Trace Data And Dbms Xplan V6SQL Optimization With Trace Data And Dbms Xplan V6
SQL Optimization With Trace Data And Dbms Xplan V6Mahesh Vallampati
 
ReactDC Intro to NextJS 9
ReactDC Intro to NextJS 9ReactDC Intro to NextJS 9
ReactDC Intro to NextJS 9Allison Kunz
 
educational course/tutorialoutlet.com
educational course/tutorialoutlet.comeducational course/tutorialoutlet.com
educational course/tutorialoutlet.comjorge0043
 
Ctes percona live_2017
Ctes percona live_2017Ctes percona live_2017
Ctes percona live_2017Guilhem Bichot
 
Sql scripting sorcerypaper
Sql scripting sorcerypaperSql scripting sorcerypaper
Sql scripting sorcerypaperoracle documents
 
Witsml data processing with kafka and spark streaming
Witsml data processing with kafka and spark streamingWitsml data processing with kafka and spark streaming
Witsml data processing with kafka and spark streamingMark Kerzner
 
lab14444444444444444444444444444444444444444
lab14444444444444444444444444444444444444444lab14444444444444444444444444444444444444444
lab14444444444444444444444444444444444444444227567
 
Extending Apache Spark SQL Data Source APIs with Join Push Down with Ioana De...
Extending Apache Spark SQL Data Source APIs with Join Push Down with Ioana De...Extending Apache Spark SQL Data Source APIs with Join Push Down with Ioana De...
Extending Apache Spark SQL Data Source APIs with Join Push Down with Ioana De...Databricks
 
Sparklyr: Big Data enabler for R users
Sparklyr: Big Data enabler for R usersSparklyr: Big Data enabler for R users
Sparklyr: Big Data enabler for R usersICTeam S.p.A.
 

Similar a Scylla Summit 2017: Planning Your Queries for Maximum Performance (20)

MySQL Optimizer: What's New in 8.0
MySQL Optimizer: What's New in 8.0MySQL Optimizer: What's New in 8.0
MySQL Optimizer: What's New in 8.0
 
How to use Impala query plan and profile to fix performance issues
How to use Impala query plan and profile to fix performance issuesHow to use Impala query plan and profile to fix performance issues
How to use Impala query plan and profile to fix performance issues
 
Best Practices for Migrating Legacy Data Warehouses into Amazon Redshift
Best Practices for Migrating Legacy Data Warehouses into Amazon RedshiftBest Practices for Migrating Legacy Data Warehouses into Amazon Redshift
Best Practices for Migrating Legacy Data Warehouses into Amazon Redshift
 
CS 542 -- Query Execution
CS 542 -- Query ExecutionCS 542 -- Query Execution
CS 542 -- Query Execution
 
Accurately and Reliably Extracting Data from the Web:
Accurately and Reliably Extracting Data from the Web: Accurately and Reliably Extracting Data from the Web:
Accurately and Reliably Extracting Data from the Web:
 
C:\nppdf32 log\debuglog
C:\nppdf32 log\debuglogC:\nppdf32 log\debuglog
C:\nppdf32 log\debuglog
 
Oracle tips and tricks
Oracle tips and tricksOracle tips and tricks
Oracle tips and tricks
 
Les12[1]Creating Views
Les12[1]Creating ViewsLes12[1]Creating Views
Les12[1]Creating Views
 
DB_lecturs8 27 11.pptx
DB_lecturs8 27 11.pptxDB_lecturs8 27 11.pptx
DB_lecturs8 27 11.pptx
 
MySQL Optimizer Overview
MySQL Optimizer OverviewMySQL Optimizer Overview
MySQL Optimizer Overview
 
OrientDB vs Neo4j - Comparison of query/speed/functionality
OrientDB vs Neo4j - Comparison of query/speed/functionalityOrientDB vs Neo4j - Comparison of query/speed/functionality
OrientDB vs Neo4j - Comparison of query/speed/functionality
 
SQL Optimization With Trace Data And Dbms Xplan V6
SQL Optimization With Trace Data And Dbms Xplan V6SQL Optimization With Trace Data And Dbms Xplan V6
SQL Optimization With Trace Data And Dbms Xplan V6
 
ReactDC Intro to NextJS 9
ReactDC Intro to NextJS 9ReactDC Intro to NextJS 9
ReactDC Intro to NextJS 9
 
educational course/tutorialoutlet.com
educational course/tutorialoutlet.comeducational course/tutorialoutlet.com
educational course/tutorialoutlet.com
 
Ctes percona live_2017
Ctes percona live_2017Ctes percona live_2017
Ctes percona live_2017
 
Sql scripting sorcerypaper
Sql scripting sorcerypaperSql scripting sorcerypaper
Sql scripting sorcerypaper
 
Witsml data processing with kafka and spark streaming
Witsml data processing with kafka and spark streamingWitsml data processing with kafka and spark streaming
Witsml data processing with kafka and spark streaming
 
lab14444444444444444444444444444444444444444
lab14444444444444444444444444444444444444444lab14444444444444444444444444444444444444444
lab14444444444444444444444444444444444444444
 
Extending Apache Spark SQL Data Source APIs with Join Push Down with Ioana De...
Extending Apache Spark SQL Data Source APIs with Join Push Down with Ioana De...Extending Apache Spark SQL Data Source APIs with Join Push Down with Ioana De...
Extending Apache Spark SQL Data Source APIs with Join Push Down with Ioana De...
 
Sparklyr: Big Data enabler for R users
Sparklyr: Big Data enabler for R usersSparklyr: Big Data enabler for R users
Sparklyr: Big Data enabler for R users
 

Scylla Summit 2017: Planning Your Queries for Maximum Performance

  • 1. Planning your queries for maximum performance. Shlomi Livne, VP R&D, ScyllaDB
  • 2. Shlomi Livne: Shlomi is VP of R&D at ScyllaDB. Prior to ScyllaDB he led the research and development team at Convergin, which was acquired by Oracle.
  • 3. How Scylla executes your queries
  • 4. Cluster View (figure): a client and a cluster of eight nodes (1 to 8); labels: Coordinator, Replica.
  • 5. Coordinator Tasks
      1. Prepare the statement
      2. Single partition queries
         a. Select replicas (using cache heat info) and send query/digest requests, requesting a page of results
         b. Compare the digests; if there is a mismatch:
            i. Request data from selected replicas
            ii. Repair the data on replicas
         c. Return result
      3. Partition scan queries
         a. Split the request up based on the ring
         b. Send requests for data using ranges, requesting a page of results
         c. Merge results
         d. Return result
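The single-partition path in the coordinator tasks above can be sketched as a small simulation. This is only an illustration under stated assumptions: the function names are hypothetical, MD5 stands in for whatever digest Scylla actually computes, and reconciliation is modeled as keeping the newest write per column.

```python
import hashlib


def digest(cells):
    """Hash one replica's view of a row; cells maps column -> (value, write_ts)."""
    h = hashlib.md5()  # assumption: any stable hash works for this illustration
    for col in sorted(cells):
        value, ts = cells[col]
        h.update(f"{col}={value}@{ts};".encode())
    return h.hexdigest()


def reconcile(views):
    """Merge replica views, keeping the newest write for each column."""
    merged = {}
    for cells in views:
        for col, (value, ts) in cells.items():
            if col not in merged or ts > merged[col][1]:
                merged[col] = (value, ts)
    return merged


def coordinator_read(replicas):
    """Toy read: the first replica returns data, the others return digests.

    Returns (row, repaired), where repaired says whether a mismatch was found.
    """
    data = replicas[0]
    expected = digest(data)
    if all(digest(r) == expected for r in replicas[1:]):
        return dict(data), False
    # Digest mismatch: request full data from the replicas and repair them.
    merged = reconcile(replicas)
    for r in replicas:
        r.clear()
        r.update(merged)
    return merged, True
```

When all digests agree, only one replica ships the full row, which is the point of sending digest requests in the first place.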
  • 6. Replica Tasks
      1. Receive a data/digest/range request
      2. Split the request up according to shards
      3. On each shard:
         a. Execute the request, merging data from memtables + cache/sstables
         b. For a data request:
            i. Prepare a result and return it (compute digest if RF > 1)
         c. For a digest request:
            i. Compute digest and return it
         d. For a partition scan request:
            i. Return the partition range data (do not prepare a result)
  • 7. Replica Shard Read Diagram (figure). Components: Read Req, Memtable, Row Cache, four sstables (each with Bloom Filter, Summary, Index, Compression, Data), Result.
  • 8. Replica Shard Read Diagram (figure). Read: P8:R1. Memtable: P8:R1:C=3. Row Cache: P8:R1:A=8,B=7. Sstable data: P8:R1:A=8 in one sstable, P8:R1:B=7 in another.
  • 9. Replica Shard Read Diagram (figure). Same state as slide 8, with Result: P8:R1 A=8,B=7,C=3.
  • 10. Replica Shard Read Diagram (figure). Read: P8:R1 with an empty Row Cache. Memtable: P8:R1:C=3. Sstable data: P8:R1:A=8 and P8:R1:B=7.
  • 11-14. Replica Shard Read Diagram (figure, animation frames). Same state as slide 10; the frames step through the sstable components (Bloom Filter, Summary, Index, Compression, Data).
  • 15. Replica Shard Read Diagram (figure). Row Cache now holds P8:R1:A=8,B=7, read from the sstables.
  • 16-17. Replica Shard Read Diagram (figure). Row Cache: P8:R1:A=8,B=7. Memtable: P8:R1:C=3. Result: P8:R1 A=8,B=7,C=3.
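The read shown in the diagram slides can be sketched as follows. This is a toy model, not Scylla's implementation: rows are keyed by the string "P8:R1", every cell carries a made-up write timestamp, and the names `merge_cells` and `read_row` are hypothetical.

```python
def merge_cells(*sources):
    """Merge cell maps (column -> (value, write_ts)), newest write wins."""
    merged = {}
    for cells in sources:
        for col, (value, ts) in cells.items():
            if col not in merged or ts > merged[col][1]:
                merged[col] = (value, ts)
    return merged


def read_row(key, memtable, row_cache, sstables):
    """Serve a row from memtable + row cache, filling the cache on a miss."""
    if key not in row_cache:
        # Cache miss: read every sstable that holds the row, then populate the cache.
        row_cache[key] = merge_cells(*(s[key] for s in sstables if key in s))
    merged = merge_cells(row_cache[key], memtable.get(key, {}))
    return {col: value for col, (value, _ts) in merged.items()}


# State from the slides; the timestamps 1, 2, 3 are assumptions.
sstables = [{"P8:R1": {"A": (8, 1)}}, {"P8:R1": {"B": (7, 2)}}, {}, {}]
memtable = {"P8:R1": {"C": (3, 3)}}
row_cache = {}

result = read_row("P8:R1", memtable, row_cache, sstables)
```

After the first call the cache holds the on-disk cells for P8:R1, so a second read of the same row no longer touches the sstables.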
  • 18. Row Cache
      ▪ Cache stores complete row data
      ▪ In addition to storing existing rows, the cache stores information about the completeness of clustering ranges (continuity), so a read that falls between cached rows is not treated as a miss.
      ▪ Cache is populated on:
        o Queries
        o Memtable flush:
          • Data is merged, to keep the cache up to date with newly written sstables.
          • Data is inserted, in case there is no data for that partition on disk.
  • 19. Selecting Sstables
      ▪ Given a partition key (pk), the current set of sstables is reduced so that sstable X is included iff:
        o min_partition_key(sstable X) <= pk <= max_partition_key(sstable X)
        o bloom_filter(sstable X, pk) = True
      ▪ Scylla 2.0: sstables are read in parallel
      ▪ Scylla 2.1:
        o The reduced set of sstables is searched newest to oldest until a result can be constructed and we can prove that older sstables are not relevant.
        o Sstable read parallelism grows, starting from a single sstable
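The selection rule above can be sketched with a toy Bloom filter. Everything here is illustrative: the class names are hypothetical, keys are compared as plain integers rather than by token, the key-range bounds are treated as inclusive, and the filter size and hash count are arbitrary.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1024, hashes=3):
        self.size = size_bits
        self.hashes = hashes
        self.bits = 0

    def _positions(self, key):
        for i in range(self.hashes):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, key):
        for p in self._positions(key):
            self.bits |= 1 << p

    def might_contain(self, key):
        return all((self.bits >> p) & 1 for p in self._positions(key))


class SSTable:
    def __init__(self, name, keys):
        self.name = name
        self.min_key, self.max_key = min(keys), max(keys)
        self.bloom = BloomFilter()
        for k in keys:
            self.bloom.add(k)


def candidate_sstables(sstables, pk):
    """Keep only sstables whose key range covers pk and whose filter says 'maybe'."""
    return [
        s for s in sstables
        if s.min_key <= pk <= s.max_key and s.bloom.might_contain(pk)
    ]
```

The range check is cheap and exact; the Bloom filter then removes most of the remaining sstables that cover the key range but do not hold the key.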
  • 20. 7 Rules To Optimize your Queries
  • 21. Rule #1 - Use Prepared statements
      ▪ Without prepared statements, the coordinator needs to pre-process the query on every execution:
        o A lot of repetitive work that could be done only once
        o Adds overhead to the execution of a query, which directly translates to throughput and latency
      ▪ Without prepared statements, the driver is not able to send the request to a coordinator node that holds the data (an additional hop)
      ▪ tip: compare scylla_query_processor_statements_prepared to the number of executed scylla_transport_requests_served
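The "repetitive work that could be done only once" point can be illustrated with a toy coordinator that counts how often it parses a statement. The class and method names are hypothetical and nothing here is driver or Scylla API; parsing is reduced to splitting the query text.

```python
class ToyCoordinator:
    """Counts parse operations for unprepared vs. prepared execution."""

    def __init__(self):
        self.parse_count = 0
        self._prepared = {}

    def _parse(self, cql):
        self.parse_count += 1  # stands in for the real pre-processing cost
        return cql.split()

    def execute(self, cql, values):
        """Unprepared path: the statement is parsed on every call."""
        return self._parse(cql), values

    def prepare(self, cql):
        """Parse once and hand back an id the client can reuse."""
        stmt_id = len(self._prepared)
        self._prepared[stmt_id] = self._parse(cql)
        return stmt_id

    def execute_prepared(self, stmt_id, values):
        """Prepared path: look up the already-parsed statement."""
        return self._prepared[stmt_id], values


QUERY = "SELECT * FROM ks.cf WHERE pk = ?"

unprepared = ToyCoordinator()
for i in range(1000):
    unprepared.execute(QUERY, (i,))

prepared = ToyCoordinator()
stmt = prepared.prepare(QUERY)
for i in range(1000):
    prepared.execute_prepared(stmt, (i,))
```

In this model the unprepared coordinator parses once per request, while the prepared one parses once in total.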
  • 22. Sample: single Scylla server, using c-s (cassandra-stress)
      Results                     Unprepared   Prepared
      op rate                     13037        18704
      partition rate              13037        18704
      row rate                    13037        18704
      latency mean                1.5          1.1
      latency median              1.3          1
      latency 95th percentile     2.9          1.6
      latency 99th percentile     6.2          2.5
      latency 99.9th percentile   12.2         7.1
      latency max                 31.1         16.9
      Total partitions            100000       100000
  • 23. Rule #2 - Use Paging
      ▪ Paging disabled: the coordinator is forced to prepare a single result that holds all the data and send it back:
        o If the coordinator is not able to return a response (allocate enough memory for the single result), an error is returned to the client
        o tip: compare scylla_transport_unpaged_queries to scylla_cql_reads to detect whether many of your read queries are unpaged
  • 24. Rule #3 - Use correct Page Size
      ▪ Drivers enable paging by default, with a default page_size of 5000 rows (java, python, gocql)
      ▪ CQL requires returning at least one result and allows returning fewer results than the page size
      ▪ Scylla utilizes this:
        o Scylla caps a page at ~1MB of memory, so it will return fewer rows than requested when rows are large
        o Do not use the number of returned results as an indication that there are no more results
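The last point above is easy to get wrong, so here is a toy paging loop. The server is simulated: `fetch_page` and `fetch_all` are hypothetical helpers, the paging state is just a row index, and the ~1MB cap is modeled as exactly 1,000,000 bytes. The client stops only when the server reports no further paging state, never because a page came back short.

```python
def fetch_page(rows, paging_state, page_size, max_page_bytes=1_000_000):
    """Toy server: return (page, next_state); next_state is None when done."""
    i = paging_state or 0
    page, used = [], 0
    while i < len(rows) and len(page) < page_size:
        size = len(rows[i])
        # Always return at least one row, then stop once the byte cap is hit.
        if page and used + size > max_page_bytes:
            break
        page.append(rows[i])
        used += size
        i += 1
    return page, (i if i < len(rows) else None)


def fetch_all(rows, page_size):
    """Client loop driven by the paging state, not by the page length."""
    out, state, pages = [], None, 0
    while True:
        page, state = fetch_page(rows, state, page_size)
        out.extend(page)
        pages += 1
        if state is None:
            return out, pages


# Five 400 kB rows with the default page size of 5000 rows.
big_rows = [b"x" * 400_000] * 5
first_page, next_state = fetch_page(big_rows, None, 5000)
all_rows, page_count = fetch_all(big_rows, 5000)
```

Here the first page holds only two rows even though 5000 were requested, and more pages follow. A client that treated "fewer rows than page_size" as the end of the result set would silently drop three rows.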
  • 25. (Figure) Has more pages
  • 26. Scylla 2.0: does the default page_size make sense?
      Test: duration in milliseconds of fetching a single wide partition of 10^8 bytes, split into rows, using different page sizes
      page size   10^6 rows      10^5 rows       10^4 rows       1000 rows
                  of 100 bytes   of 1000 bytes   of 10^4 bytes   of 10^5 bytes
      10          timed out      2104.492031     331.087871      173.932543
      50          5679.087615    737.148927      202.113023      168.165375
      100         4034.920447    573.046783      186.384383      168.951807
      500         2663.383039    415.760383      183.894015      173.015039
      1000        2451.570687    395.313151      182.976511      168.427519
      5000        2285.895679    400.031743      184.942591      169.345023
      10000       2281.701375    399.769599      183.369727      169.738239
      50000       2273.312767    396.099583      183.107583      170.000383
  • 27. C* 3.11.0: does the default page_size make sense?
      Test: duration in milliseconds of fetching a single wide partition of 10^8 bytes, split into rows, using different page sizes
      page size   10^6 rows      10^5 rows       10^4 rows       1000 rows
                  of 100 bytes   of 1000 bytes   of 10^4 bytes   of 10^5 bytes
      10          timed out      4030.726143     903.872511      364.380159
      50          12876.51328    1535.115263     419.430399      300.941311
      100         8992.587775    1202.716671     405.274623      316.407807
      500         6400.507903    907.542527      354.680831      348.651519
      1000        6077.546495    874.512383      360.972287      370.409471
      5000        5620.367359    791.674879      422.051839      358.612991
      10000       5490.343935    793.772031      389.021695      360.447999
      50000       5662.310399    913.833983      383.516671      355.467263
      tip: consider changing the page size if your rows are large
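One rough way to read the Scylla 2.0 numbers is to count round trips. The sketch below is a back-of-the-envelope model, not a description of the benchmark: it assumes the ~1MB page cap from Rule #3 is exactly 10^6 bytes, ignores per-row overhead, and the function name `pages_needed` is hypothetical.

```python
import math


def pages_needed(total_rows, row_bytes, page_size, cap_bytes=1_000_000):
    """Estimate round trips when each page is limited by rows and by bytes."""
    rows_per_page = max(1, min(page_size, cap_bytes // row_bytes))
    return math.ceil(total_rows / rows_per_page)


# 10^6 rows of 100 bytes: the byte cap only starts to bind at ~10000 rows per page.
small_rows = {ps: pages_needed(10**6, 100, ps) for ps in (10, 5000, 10000, 50000)}

# 1000 rows of 10^5 bytes: the byte cap already limits a page to 10 rows.
large_rows = {ps: pages_needed(1000, 10**5, ps) for ps in (10, 5000, 50000)}
```

Under this model, small rows need 100000 pages at page size 10 but only 200 at 5000, while large rows need 100 pages at every page size of 10 or more. That is consistent with the Scylla durations for 10^5-byte rows staying roughly flat across page sizes.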
  • 28. Rule #4 - Beware of Multi Partition CQL IN queries
      ▪ Multi-partition CQL IN queries force the coordinator node to split the query into single-partition queries and aggregate the results.
  • 29. Rule #5 - Beware of Single Partition CQL IN queries
      Question: Should I split the CQL IN query?
      Sample:
      ▪ CQL: "SELECT * FROM ks.cf WHERE pk = X AND ck IN (Y1, Y2, … Yn)"
      Translated to:
      ▪ CQL:
        o "SELECT * FROM ks.cf WHERE pk = X AND ck = Y1"
        o "SELECT * FROM ks.cf WHERE pk = X AND ck = Y2"
        o …
        o "SELECT * FROM ks.cf WHERE pk = X AND ck = Yn"
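The translation in the sample above can be written down as a small helper that produces both forms of the query, so the two can be benchmarked against each other. The helper name is hypothetical; the statements use `?` bind markers so that either form can be prepared, and no driver is involved.

```python
IN_QUERY = "SELECT * FROM ks.cf WHERE pk = ? AND ck IN ?"
SINGLE_QUERY = "SELECT * FROM ks.cf WHERE pk = ? AND ck = ?"


def split_in_query(pk, cks):
    """Return the single IN statement and its equivalent per-key statements.

    Each statement is a (cql, bound_values) pair.
    """
    in_statement = (IN_QUERY, (pk, list(cks)))
    single_statements = [(SINGLE_QUERY, (pk, ck)) for ck in cks]
    return in_statement, single_statements


in_stmt, singles = split_in_query("X", ["Y1", "Y2", "Y3"])
```

Whether the one IN query or the n single-row queries is faster is exactly the question the slide raises, so both should be measured on the actual data.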
  • 34. Question: Should I split the CQL IN query?
      Answer: It depends on how wide your rows are
      Comments:
      ▪ Prior to Scylla 2.0, single partition CQL IN queries performed very badly in some wide partition cases.
      ▪ All reported results are using Scylla 2.0
  • 35. Rule #6 - There's a faster way to do full scans
      ▪ The blog post efficient-full-table-scans-with-scylla outlined an algorithm to do full scans; at a high level:
        o split the range up into small sub ranges
        o run "enough" sub ranges in parallel
      ▪ The follow-up blog post, How to scan 475 million partitions 12x faster using efficient full table scan, provided a sample implementation of this
      ▪ Is there an even "faster" way?
  • 36. Yes, there is:
      o Using the token ownership of nodes in the ring, one can select ranges of tokens. Once a "range" has been processed, the next "range" can be selected based on the ownership in the ring.
      o An even more optimized solution would use the "sharding" information and aim ranges at shards on a machine, so that all cores are executing requests in parallel.
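The basic algorithm from Rule #6 (split the token range into sub ranges and run enough of them in parallel) can be sketched as below. This is a minimal sketch under assumptions: tokens are taken to span -2^63 to 2^63-1 as with the Murmur3 partitioner, `fetch_range` is a caller-supplied function that would run the scan query for one sub range, and the ownership-aware and shard-aware refinements are not modeled.

```python
from concurrent.futures import ThreadPoolExecutor

MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1

# Query a real fetch_range would execute for each (start, end] sub range.
SCAN_CQL = "SELECT * FROM ks.cf WHERE token(pk) > ? AND token(pk) <= ?"


def split_token_ring(n):
    """Return n contiguous (start, end] sub ranges covering the token ring."""
    span = MAX_TOKEN - MIN_TOKEN
    bounds = [MIN_TOKEN + span * i // n for i in range(n)] + [MAX_TOKEN]
    # Note: the first range excludes MIN_TOKEN itself; a real scan should
    # check how its partitioner treats that boundary value.
    return list(zip(bounds[:-1], bounds[1:]))


def parallel_scan(fetch_range, n_ranges, parallelism):
    """Run fetch_range(start, end) over all sub ranges, `parallelism` at a time."""
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        results = pool.map(lambda r: fetch_range(*r), split_token_ring(n_ranges))
        return [row for rows in results for row in rows]
```

Using many more sub ranges than workers keeps every worker busy even when some sub ranges hold more data than others.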
  • 37. Rule #7 - Use the tools
      ▪ Probabilistic tracing
      ▪ Slow query tracing
      ▪ Wireshark
      ▪ CQL Trace
      ▪ Enable client-side tracing
  • 38. THANK YOU. shlomi@scylladb.com, @ShlomiLivne. Any questions?