Webinar | How to Understand Apache Cassandra™ Performance Through Read/Write Metrics: A Beginner's Guide

1. © DataStax, All Rights Reserved. Confidential
Understand Apache
Cassandra Performance
Through Metrics:
A Beginner’s Guide
2. MAY 21 - 23, 2019
Gaylord National Resort & Convention Center Maryland
Why?
3.
Agenda
● Basic Concepts in Cassandra Architecture
● How Do You Begin To Understand
Performance of a Real-time Database
● What Tools are Available
● What are the Most Important Metrics
4.
Cassandra Concepts
5.
Masterless / Peer-to-Peer Architecture
● All nodes are the same, owning a piece of data
● Availability
− No special “master”, “leader”, etc
− No fragility; no single-point-of-failure
− No “failover”
● Scalability
− All nodes host data, but also serve queries
− More data? More nodes.
− More queries? More nodes.
(Diagram: a client connected to a ring of identical nodes)
6.
Coordinator, Replica and Client
● No single point of failure
● All data replicated
− Replication automatically handled
− All replicas are equal
● Any client can connect to any node
and read/write the data they need
● Any node can be:
− Coordinator
− Storage/Replica node
(Diagram: a client sending a request to a coordinator node, which forwards it to replica nodes)
7.
Key Concepts in
Real-time Database Performance
8.
Throughput and Latency
● Throughput: rate of operations
● Latency: the time one operation takes
● Sustainable Throughput
− “achieving throughput while safely maintaining
SLA” – Gil Tene
− Don’t measure latency at saturation
● System Resources
− Utilization
− Saturation
− Error
− Availability
9.
How to Measure Latency
● Single latency: capture the time one operation takes
● But what if you serve millions of operations per second?
● If a million operations arrive in one hour, how do you summarize how those million operations went?
● How do you plot the latency numbers effectively?
10.
Let’s look at a small example:
● Assume we recorded 12 latency values:
− 11ms, 19ms, 13ms, 12ms, 85ms, 43ms, 720ms, 17ms, 22ms, 25ms, 31ms, 2ms
● If we keep these raw values as a plain list,
− It takes a lot of space: 12 x 8 bytes = 96 bytes
− It doesn't scale: with 1 million raw latency values, storage and transfer become very costly
− Finding the max value in the raw list requires scanning every entry
13.
Let’s look at a small example:
● Assume we recorded 12 latency values:
− 11ms, 19ms, 13ms, 12ms, 85ms, 43ms, 720ms, 17ms, 22ms, 25ms, 31ms, 2ms
● Average:
− avg (11, 19, 13, 12, 85, 43, 720, 17, 22, 25, 31, 2) = 83.3ms
− Downside: the average alone gives no idea about
− the best latency
− the worst latency
− or the distribution of these values
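The average above can be checked in a few lines of Python; note how a single outlier (720ms) dominates it while the best and worst values stay invisible:

```python
latencies_ms = [11, 19, 13, 12, 85, 43, 720, 17, 22, 25, 31, 2]

avg = sum(latencies_ms) / len(latencies_ms)   # 1000 / 12
print(round(avg, 1))                          # 83.3 ms
print(min(latencies_ms), max(latencies_ms))   # 2 720: neither is visible in the average
```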
14.
Histogram
● A histogram is an accurate representation of
the distribution of numerical data.
● To construct a histogram, the first step is to
"bucket" the range of values
− i.e. divide the entire range of values into a
series of intervals
− and then count how many values fall into
each interval
− The buckets are usually specified as
consecutive, non-overlapping intervals of a
variable
(Example histogram image: Wikimedia Commons, CC BY 2.5, https://commons.wikimedia.org/w/index.php?curid=3483039)
15.
Go back to the previous example:
● Assume we recorded 12 latency values:
− 11ms, 19ms, 13ms, 12ms, 85ms, 43ms, 720ms, 17ms, 22ms, 25ms, 31ms, 2ms
● We sort them first:
− 2ms, 11ms, 12ms, 13ms, 17ms, 19ms, 22ms, 25ms, 31ms, 43ms, 85ms, 720ms
− Then we can put them into the following buckets:
Bucket (ms):  1-10  10-100  100-1000
Count:           1      10         1
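The bucketing step can be sketched in a few lines of Python, using half-open [lo, hi) intervals:

```python
latencies_ms = [11, 19, 13, 12, 85, 43, 720, 17, 22, 25, 31, 2]
bounds = [(1, 10), (10, 100), (100, 1000)]   # half-open [lo, hi) buckets

counts = {b: 0 for b in bounds}
for v in latencies_ms:
    for lo, hi in bounds:
        if lo <= v < hi:
            counts[(lo, hi)] += 1
            break

print(counts)   # {(1, 10): 1, (10, 100): 10, (100, 1000): 1}
```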
16.
Go back to the previous example:
● This will save a lot of space and is a lot more scalable
● We’re indeed losing some accuracy:
− Max: 1000ms (actual: 720ms)
− Min: 10ms (actual: 2ms)
− Avg: (10 x 1 + 100 x 10 + 1000 x 1) / 12 = 167ms (actual: 83.3ms)
− We can also estimate percentiles, for example:
− 90th percentile: among the 12 latency values, at least 90% fall in the 10-100 bucket or below
− so P90 = 100ms
Bucket (ms):  1-10  10-100  100-1000
Count:           1      10         1
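A quantile estimate like the P90 above can be read off the buckets by walking the cumulative counts and reporting the upper bound of the bucket where the target rank falls. This is a sketch of the idea, not Cassandra's actual estimator:

```python
def percentile_from_buckets(counts, p):
    """counts maps (lo, hi) bucket bounds to frequencies; returns the
    upper bound of the bucket containing the p-th percentile value."""
    total = sum(counts.values())
    rank = p / 100 * total          # how many values must lie at or below the answer
    seen = 0
    for (lo, hi), c in sorted(counts.items()):
        seen += c
        if seen >= rank:
            return hi
    return hi                       # fell through: last bucket's upper bound

counts = {(1, 10): 1, (10, 100): 10, (100, 1000): 1}
print(percentile_from_buckets(counts, 90))   # 100: P90 = 100 ms
```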
17.
EstimatedHistogram
● The bucket boundaries start at 1 and grow by a factor of 1.2 each step, rounded to the nearest integer
1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 17, 20, 24, 29,
…
12108970, 14530764, 17436917, 20924300, 25109160, 30130992, 36157190
● With microsecond units, this gives a time resolution from 1 microsecond up to about 36 seconds
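The boundary series can be reproduced by multiplying by 1.2, rounding to the nearest integer, and bumping by 1 when rounding makes no progress. The rule is inferred from the series shown above (e.g. 8 jumps to 10 because round(9.6) = 10), so treat it as a sketch rather than Cassandra's exact code:

```python
def estimated_histogram_offsets(n):
    """First n bucket boundaries: start at 1, grow ~20% per step,
    rounding to the nearest integer (minimum increment of 1)."""
    offsets = [1]
    while len(offsets) < n:
        nxt = round(offsets[-1] * 1.2)
        if nxt == offsets[-1]:
            nxt += 1            # never stall on small values (1 -> 2, 2 -> 3, ...)
        offsets.append(nxt)
    return offsets

print(estimated_histogram_offsets(15))
# [1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 17, 20, 24, 29]
```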
18.
How Histogram Shows Up in Latency Metrics
● Quantile Estimation:
− the latency below which a given
percentage of requests complete
− P50
− P75
− P95
− P98
− P99
− P999
● Buckets of count/frequency
20.
Aggregation on Histogram
● Do NOT aggregate (e.g. average) quantile
numbers across nodes
● Averaging Max values can be very misleading
● Averaging P90 values also doesn't
mean anything
● However, if you expose the raw histogram
buckets, merging the counts is
straightforward
node0:   1-10: 1   10-100: 10   100-1000: 1
node1:   1-10: 4   10-100: 7    100-1000: 1
node2:   1-10: 2   10-100: 9    100-1000: 1
cluster: 1-10: 7   10-100: 26   100-1000: 3
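Merging raw buckets across nodes is just element-wise addition of the counts, which is exactly why exposing the buckets (rather than precomputed percentiles) enables meaningful cluster-wide aggregation:

```python
from collections import Counter

node0 = {(1, 10): 1, (10, 100): 10, (100, 1000): 1}
node1 = {(1, 10): 4, (10, 100): 7,  (100, 1000): 1}
node2 = {(1, 10): 2, (10, 100): 9,  (100, 1000): 1}

cluster = Counter()
for node_buckets in (node0, node1, node2):
    cluster.update(node_buckets)    # element-wise sum of bucket counts

print(dict(cluster))   # {(1, 10): 7, (10, 100): 26, (100, 1000): 3}
```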
21.
Available Metrics Tools
22.
JMX
Java Management Extensions
● JMX is an API built into Java for managing and monitoring applications
● DataStax Enterprise uses JMX to interact with external applications and
tools
● nodetool leverages JMX to communicate with the database
● Third-party clients can also interact with DSE with JMX
23.
JMX
Accessing JMX
● JMX connects remotely to the IP address of the
node
● Uses the configured JMX port for the JVM
− Default port 7199
− Subsequent RMI connection will also use the same
port
● Also supports user authentication and SSL
encryption
24.
JMX
Accessing JMX
● Third-party tools for accessing JMX:
− GUI: JConsole, VisualVM
− Command-line: jmxterm, jmxsh, nodetool sjk mx (included with DSE)
● Exposed directly via non-JMX protocols:
− Jolokia – exposes via JSON over HTTP
− Dropwizard Metrics Library (built-in) - exposes via HTTP, SLF4J, Graphite, …
25.
JMX
MBeans
● Managed Java object that represents a device, application, or resource
● Exposes an interface that contains the following:
− Set of readable and/or writeable attributes
− Set of invokable operations
● Derive DSE metrics and information
from reading MBean attributes
26.
MBean
Accessing a Managed Bean (MBean)
● The MBean name is structured as follows:
− domain – usually a package name, e.g. org.apache.cassandra.metrics or com.datastax.bdp
− key property list – list of key-value pairs
− Keys generally have a type and a name
● The full name would be domain:[key1]=[value1],[key2]=[value2],...
− Domain and key property list are separated by a colon
− Key-value pairs separated by commas
● MBeans may have a set of readable attributes
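Assembling a full name from the domain and key-value pairs can be sketched like this; the keyspace "ks1" and table "users" are made up for illustration:

```python
def mbean_name(domain, **key_properties):
    """Build a JMX ObjectName string: domain:key1=value1,key2=value2,..."""
    props = ",".join(f"{k}={v}" for k, v in key_properties.items())
    return f"{domain}:{props}"

# Hypothetical table metric for keyspace "ks1", table "users":
print(mbean_name("org.apache.cassandra.metrics",
                 type="Table", keyspace="ks1", scope="users",
                 name="ReadLatency"))
# org.apache.cassandra.metrics:type=Table,keyspace=ks1,scope=users,name=ReadLatency
```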
27.
MBean
Example
org.apache.cassandra.metrics:type=Client,name=connectedNativeClients
28.
MBean Metric Types: Gauge and Counter
● Gauge provides an instantaneous reading of the metric value
− It has one attribute called value
● Counter holds a cumulative count; compare successive readings to see how it changes
− It has one attribute called count
− Where applicable, count values are reset when the node starts or restarts
org.apache.cassandra.metrics:type=Table,keyspace=<keyspace>,scope=<Table>,name=PendingCompactions
org.apache.cassandra.metrics:type=Table,keyspace=<keyspace>,scope=<Table>,name=PendingFlushes
org.apache.cassandra.metrics:type=Table,keyspace=<keyspace>,scope=<Table>,name=BytesFlushed
29.
MBean Metric Type: Histogram
● Histogram includes attributes for min, max, mean, and various value percentiles
− Uses forward decay to make recent values more significant
− Values from the past minute are twice as significant as all earlier values
30.
MBean Metric Type: Histogram
Histogram example
(Histogram)
org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=MutationSizeHistogram
31.
MBean Metric Type: Meter
● Contains a count and measures mean throughput based on the rate unit
● Includes exponentially weighted moving-average throughputs
− One / five / fifteen minute rates
● The mean throughput covers the node's whole lifetime and is not affected by the moving averages
● Values reset when the node starts or restarts
32.
MBean Metric Type: Meter
Meter example
● 20 compactions completed since server
restart
● On average, one compaction completes
every 152 seconds, based on the mean rate
(since server restart)
● In the past fifteen minutes, compactions
were completing at an average rate of one
per 7 seconds
org.apache.cassandra.metrics:type=Compaction,name=TotalCompactionsCompleted
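The readings above can be cross-checked with a little arithmetic. The rates here are illustrative values back-derived from the slide, expressed in events per second as a meter reports them:

```python
count = 20             # compactions completed since node start (Count)
mean_rate = 1 / 152    # lifetime rate in events/sec (MeanRate)
m15_rate = 1 / 7       # recent weighted rate in events/sec (FifteenMinuteRate)

uptime_s = count / mean_rate     # 20 * 152 = 3040 s: the node has been up ~51 min
speedup = m15_rate / mean_rate   # ~21.7x: compactions are much more frequent lately

print(round(uptime_s), round(speedup, 1))
```

Comparing the fifteen-minute rate against the lifetime mean like this is a quick way to spot a recent burst of activity that the mean alone would hide.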
33.
MBean Metric Types: Timer and Latency
● Timer measures the rate at which a particular piece of code is called, and also includes the time-cost
histogram
− Attributes include a meter (the number of events in the past 1 / 5 / 15 minutes) and a histogram
● Latency is a special type that combines a timer, used for tracking latency in microseconds, with a
counter
− A separate TotalLatency MBean accumulates the total latency across all events
− Calculates “correct” histograms
● Values reset when the node starts or restarts
34.
MBean Metric Types: Timer and Latency
Timer and Latency examples
(Latency)
org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency
(Counter)
org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=TotalLatency
35.
DDAC or OSS C* Metrics Tools
● nodetool
● JMX tools
− JConsole, VisualVM, sjkplus, jmxterm, jmxsh
● DropWizard Metrics Library Metrics Reporter https://github.com/addthis/metrics-reporter-config
● Graphite_Exporter https://github.com/prometheus/graphite_exporter
● Prometheus https://prometheus.io/docs/introduction/overview/
● Grafana https://grafana.com/docs/guides/getting_started/
● cassandra_exporter https://github.com/criteo/cassandra_exporter
● cassandra-monitoring https://github.com/soccerties/cassandra-monitoring
● Prometheus jmx_exporter https://github.com/prometheus/jmx_exporter
● Prometheus node_exporter https://github.com/prometheus/node_exporter
36.
DSE Metrics Tools
● DSE Metrics Collector
● DSE Metrics Collector Dashboard https://github.com/datastax/dse-metric-reporter-dashboards
● Prometheus
● Grafana
● Graphite_Exporter
● nodetool
● JMX tools
− JConsole, VisualVM, sjkplus, jmxterm, jmxsh
● OpsCenter
● DropWizard Metrics Library Metrics Reporter
● cassandra_exporter
● cassandra-monitoring
● Prometheus jmx_exporter and node_exporter
37.
DSE Metrics Collector (DSE)
● Part of DSE Server Foundation
● Collects DSE and OS Metrics
● Easily integrated with enterprise monitoring stack
● Introduced in DSE 6.7 (enabled by default), but backported to DSE 6.0.5+ and
DSE 5.1.14+ as well (disabled by default)
● Based on collectd (with local temporary storage), which can export/expose metrics to
different monitoring systems: Prometheus, Graphite, …
● collectd runs as a sub-process spawned by the DSE JVM, with its life cycle managed by
DSE
38.
DSE Metrics Collector Architecture
(Architecture diagram: on each node of the DataStax Enterprise cluster, the DataStax Metrics Collector runs collectd, which gathers DSE and OS metrics and hands them to an exporter plugin; a Prometheus monitoring server scrapes them and feeds Grafana dashboards in the customer landscape)
39.
DataStax Enterprise Metrics Dashboard (DSE)
● Freely available from DataStax github repo as an example
https://github.com/datastax/dse-metric-reporter-dashboards
https://docs.datastax.com/en/dse/6.7/dse-dev/datastax_enterprise/tools/metricsCollector/mcExportMetricsDocker.html
● Built using docker-compose
● Push-button setup of a dashboard environment that can be used as your template
43.
What Metrics to Monitor
44.
MBeans
Table Metrics
● The first MBean pattern below aggregates metrics across all tables on the node
● The second pattern is used for table-specific metrics
● Similar to metrics provided by nodetool tablestats
org.apache.cassandra.metrics:type=Table
org.apache.cassandra.metrics:type=Table,keyspace=<keyspace>,scope=<Table>,name=<MetricName>
45.
MBeans
Keyspace Metrics
● Same metric MBeans as the table metrics, aggregated to the keyspace
● Similar to metrics provided by nodetool tablestats
org.apache.cassandra.metrics:type=Keyspace,scope=<Keyspace>,name=<MetricName>
46.
MBeans
ThreadPool Metrics
● Type divides the thread pools into internal, request, and transport
● Same set of MBeans for each thread pool
− Active Tasks
− Pending Tasks
− Completed Tasks
− Total Blocked Tasks
− Currently Blocked Tasks
− Max Pool Size
● Similar to metrics provided by nodetool tpstats
org.apache.cassandra.metrics:type=ThreadPools,path=<Path>,scope=<ThreadPoolName>,name=<MetricName>
47.
MBeans
Client Request Metrics
● Metrics that encapsulate work taking place at the coordinator level
● Request types:
− CASRead
− CASWrite
− RangeSlice
− Read
− Write
− ViewWrite
● Similar to metrics provided by nodetool proxyhistograms
org.apache.cassandra.metrics:type=ClientRequest,scope=<RequestType>,name=<MetricName>
48.
MBeans
Compaction Metrics
● Metrics specific to compaction work
● Attributes
− BytesCompacted
− PendingTasks
− CompletedTasks
− TotalCompactionsCompleted
− PendingTasksByTableName
● Similar to metrics provided by nodetool compactionstats
org.apache.cassandra.metrics:type=Compaction,name=<MetricName>
49.
MBeans
Other database metrics
CQL Metrics org.apache.cassandra.metrics:type=CQL,name=<MetricName>
DroppedMessage Metrics org.apache.cassandra.metrics:type=DroppedMessage,scope=<Type>,name=<MetricName>
Streaming Metrics org.apache.cassandra.metrics:type=Streaming,scope=<PeerIP>,name=<MetricName>
CommitLog Metrics org.apache.cassandra.metrics:type=CommitLog,name=<MetricName>
Storage Metrics org.apache.cassandra.metrics:type=Storage,name=<MetricName>
Hinted Handoff Metrics org.apache.cassandra.metrics:type=HintedHandoffManager,name=<MetricName>
Hints Service Metrics org.apache.cassandra.metrics:type=HintsService,name=<MetricName>
SSTable Index Metrics org.apache.cassandra.metrics:type=Index,scope=RowIndexEntry,name=<MetricName>
BufferPool Metrics org.apache.cassandra.metrics:type=BufferPool,name=<MetricName>
Client Metrics org.apache.cassandra.metrics:type=Client,name=<MetricName>
Batch Metrics org.apache.cassandra.metrics:type=Batch,name=<MetricName>
50.
MBeans
JVM Metrics
BufferPool java.nio:type=BufferPool,name=<direct|mapped>
FileDescriptorRatio java.lang:type=OperatingSystem,name=<OpenFileDescriptorCount|MaxFileDescriptorCount>
GarbageCollector java.lang:type=GarbageCollector,name=<gc_type>
Memory java.lang:type=Memory
MemoryPool java.lang:type=MemoryPool,name=<memory_pool>
http://cassandra.apache.org/doc/latest/operating/metrics.html
51.
Most Important Performance Metrics
52.
Most important metrics to monitor
Metric description | Threshold
Read and write latencies (client scope, table scope) | P99 > 200ms for more than 1 minute
Dropped mutations | value greater than 0
Pending compactions | more than 30 for more than 15 min
Aborted compactions | value greater than 0
Total timeouts, and timeouts per host (can be a sign of network problems, etc.) | value greater than 0
Maximal partition size | partitions bigger than 100 MB are a sign of data model problems
53.
Most important metrics to monitor
Metric description | Threshold
Number of SSTables on host & per table | > 500 per individual table (non-LCS)
Blocked allocations of memtable pool | value greater than 0
Total hints on specific node | value greater than 0
Hints replay (failed, succeeded, timed out) | value greater than 0 for failed and timed out
Blocked tasks for compaction executor, memtable flush writer | value greater than 0
Cross-data-center latency | too high values (> 100ms)
Number of segments waiting on commit | high count during last minute, high 99th percentile of time waiting…
54.
Most important metrics to monitor
Metric description | Threshold
Data about Java's garbage collection | Max GC Elapsed (ms) greater than 500ms
Pending flushes | more than or near the value of memtable_flush_writers
55.
Resources
56.
Learning Resources
● Official document of Cassandra’s metrics:
http://cassandra.apache.org/doc/latest/operating/metrics.html
● DSE Metrics Collector Documentation: https://docs.datastax.com/en/dse/6.7/dse-dev/datastax_enterprise/tools/metricsCollector/mcIntroduction.html
● DSE Metrics Dashboard GitHub repo: https://github.com/datastax/dse-metric-reporter-dashboards
● Prometheus relabeling configuration in DSE Metrics Dashboard: https://tinyurl.com/y4u3y2zf
● Gil Tene’s Latency Tip of The Day: http://latencytipoftheday.blogspot.com/
● Nitsan Wakart’s blog: http://psy-lob-saw.blogspot.com/2016/07/fixing-co-in-cstress.html
57.
https://tinyurl.com/y62r4uw4
58.
Thank you
59.
Q & A