KSQL Performance Tuning for Fun and Profit (Nick Dearden, Confluent) Kafka Summit SF 2019

1
KSQL Performance Tuning
For Fun and Profit
Nick Dearden

2
KSQL Performance Tuning Planning
For Fun and Profit
Nick Dearden

7
Anatomy of a KSQL Query
Tuning goals
Performance Factors
What to Monitor
Rules of Thumb

8
Apps, CPUs, Topologies and Threads, Oh My!
● Every KSQL continuous query results in a Kafka Streams Application
● An Application has a Topology…
● …which may have sub-topologies…
● …which are executed on StreamThreads

10
Topologies, Tasks, & Partitions
• Topologies are divided into sub-topologies at read-write boundaries
- Read-process-write loop
• Within a sub-topology, tasks created for the max input partition count
- If multiple input topics, they are being co-processed, e.g. joins
- Internal topics, such as *-rekey ones, are counted too
• Each task is assigned to at most one StreamThread
- A StreamThread results in at least 3 JVM threads being created
- A StreamThread has its own Consumer and Producer instance
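
The task-count rule above can be sketched in a few lines of Python. This is an illustrative model only, not Kafka Streams code: the function names, the topic names and the partition counts are all hypothetical, and the round-robin assignment is a simplification of whatever the real task assignor does.

```python
from typing import Dict, List


def tasks_per_subtopology(input_partitions: Dict[str, int]) -> int:
    """Number of tasks for one sub-topology: the max partition count
    across all of its input topics (internal *-rekey topics included)."""
    if not input_partitions:
        raise ValueError("a sub-topology needs at least one input topic")
    return max(input_partitions.values())


def assign_tasks(num_tasks: int, num_threads: int) -> List[List[int]]:
    """Spread task ids over StreamThreads round-robin, so that every task
    lands on exactly one thread (assumed simplification, see lead-in)."""
    if num_threads < 1:
        raise ValueError("need at least one StreamThread")
    threads: List[List[int]] = [[] for _ in range(num_threads)]
    for task_id in range(num_tasks):
        threads[task_id % num_threads].append(task_id)
    return threads


if __name__ == "__main__":
    # Hypothetical sub-topology reading two co-processed topics.
    partitions = {"topic_a": 6, "topic_b-rekey": 4}
    n = tasks_per_subtopology(partitions)
    print(n, assign_tasks(n, 4))
```

Under this model, adding StreamThreads beyond the task count leaves the extra threads idle, which is why the input partition count caps the useful parallelism.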

11
Topologies, Tasks, & Partitions
Divide a topology into read-process-write sub-topologies
Thanks to Andy Bryant for the diagram!

12
Can I just explain it?
● ksql> show queries;
● ksql> explain CSAS_R2_0;

13
ksql> create stream r2 as select stars, user_id, channel from ratings;
ksql> show queries;
 Query ID  | Kafka Topic | Query String
--------------------------------------------------------------------------------
 CSAS_R2_0 | R2          | CREATE STREAM R2 WITH (KAFKA_TOPIC='R2', PARTITIONS=1, REPLICAS=1) AS SELECT
                           RATINGS.STARS "STARS",
                           RATINGS.USER_ID "USER_ID",
                           RATINGS.CHANNEL "CHANNEL"
                           FROM RATINGS;
--------------------------------------------------------------------------------
For detailed information on a Query run: EXPLAIN <Query ID>;

14
ksql> explain CSAS_R2_0;
Execution plan
--------------
 > [ SINK ] | Schema: [ROWKEY STRING KEY, STARS INTEGER, USER_ID INTEGER, CHANNEL STRING]
     > [ PROJECT ] | Schema: [ROWKEY STRING KEY, STARS INTEGER, USER_ID INTEGER, CHANNEL STRING]
         > [ SOURCE ] | Schema: [RATINGS.ROWKEY STRING KEY, RATINGS.ROWTIME BIGINT, RATINGS.ROWKEY STRING, RATINGS.RATING_ID BIGINT, RATINGS.USER_ID INTEGER, RATINGS.STARS INTEGER, RATINGS.ROUTE_ID INTEGER, RATINGS.RATING_TIME BIGINT, RATINGS.CHANNEL STRING, RATINGS.MESSAGE STRING]

15
Multiple Queries ->
Multiple Topologies

16
Performance Goals
● Latency?
● Throughput?
● Elasticity?

18
Breaking Rules?
● Consumer / producer configs
● Message format (Avro uses less CPU than JSON)
● Compression

19
Network
1 Gb/s ~= 100-110 MB/s
Message Size
100-byte messages : 1,000,000 / sec
1 KB messages : 100,000 / sec
10 KB messages : 10,000 / sec
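
The arithmetic behind these figures can be reproduced with a short sketch. It assumes the lower bound of the slide's usable bandwidth (100 MB/s on a 1 Gb/s link) and decimal units (1 KB = 1,000 bytes); the function name and its default are illustrative, not part of any KSQL tooling.

```python
def max_messages_per_sec(message_bytes: int, usable_mb_per_sec: int = 100) -> int:
    """Upper bound on messages/sec that a link can carry.

    usable_mb_per_sec is the usable payload bandwidth in decimal MB/s;
    100 corresponds to the low end of the 100-110 MB/s quoted for 1 Gb/s.
    """
    if message_bytes <= 0:
        raise ValueError("message_bytes must be positive")
    return usable_mb_per_sec * 1_000_000 // message_bytes


if __name__ == "__main__":
    for size in (100, 1_000, 10_000):
        print(f"{size:>6} bytes -> {max_messages_per_sec(size):>9,} msg/s")
```

This is a network ceiling only. It ignores protocol overhead beyond the assumed usable bandwidth, and it says nothing about whether the CPU can keep up with that message rate.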

22
State Stores (RocksDB)
Tables - consider key-space cardinality and message-size
Joins - join type, join windows
Aggregates - window sizes, group cardinality

23
Fault-Tolerance, powered by Kafka
A key challenge of distributed stream processing is fault-tolerant state.
State is automatically migrated in case of server failure.
[Diagram: two KSQL / Kafka Streams App servers, A and B, and a Changelog Topic in Kafka]
● Server A: “I do stateful stream processing, like tables, joins, aggregations.”
● “Streaming backup” of A’s local state to the Changelog Topic
● “Streaming restore” of A’s local state to B
● Server B: “I restore the state and continue processing where server A stopped.”

24
Some Measurements
● KSQL Servers – i3.xlarge
○ 4 vCPUs
○ 30.5 GB memory
○ “up to 10Gbit network” (experimentally measured at ~1.2 Gb/s full-duplex baseline)
○ 200GB EBS SSD
● JVM Settings
○ Heap size 16GB (~50% of RAM, to leave space for state-stores)

25
Test Highlights
• Simple project query (“speed-of-light”)
• CREATE STREAM foo AS SELECT * FROM bar;

# Queries | msg/s | MB/s  | msg size (bytes) | CPU % | MB Mem Max
2         | 193k  | 59.14 | 320              | 99.19 | 18,949
10        | 189k  | 57.67 | 320              | 99.74 | 20,101
20        | 175k  | 53.43 | 320              | 96.61 | 23,377
50        | 168k  | 51.37 | 320              | 96.61 | 28,291

• 4 cores can’t saturate a 1Gb network link in this test (but larger messages get close)

26
Test Highlights
• Simple project query (“speed-of-light”)
• CREATE STREAM foo AS SELECT * FROM bar;

# Queries | # Servers | msg/s  | msg/s/host | MB/s | CPU %
2         | 1         | 193k   | 193k       | 59   | 99
2         | 3         | 585k   | 195k       | 179  | 96
2         | 10        | 1,855k | 185k       | 567  | 96

Message throughput scales with server count
(same query, same data, msg-size=300bytes)
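
One way to read these numbers is as a scaling efficiency: total throughput divided by what perfectly linear scaling of the single-server result would give. The sketch below uses the figures from this slide; the function names are hypothetical helpers, not part of KSQL.

```python
def per_host_throughput(total_msgs_per_sec: float, servers: int) -> float:
    """Average messages/sec handled by each server."""
    if servers < 1:
        raise ValueError("servers must be at least 1")
    return total_msgs_per_sec / servers


def scaling_efficiency(total_msgs_per_sec: float, servers: int,
                       single_host_msgs_per_sec: float) -> float:
    """Fraction of ideal linear scaling achieved (1.0 means perfectly linear)."""
    ideal = per_host_throughput(single_host_msgs_per_sec, 1) * servers
    return total_msgs_per_sec / ideal


if __name__ == "__main__":
    baseline = 193_000  # 1 server, from the slide
    for servers, total in ((3, 585_000), (10, 1_855_000)):
        eff = scaling_efficiency(total, servers, baseline)
        print(f"{servers:>2} servers: {per_host_throughput(total, servers):,.0f} "
              f"msg/s/host, efficiency {eff:.2f}")
```

With the slide's figures, both the 3-server and 10-server runs stay within a few percent of linear scaling.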

27
CREATE STREAM vip_actions AS
SELECT userid, page, action, zipcode
FROM clickstream c
LEFT JOIN users u ON c.userid = u.user_id
WHERE u.level = 'Platinum';

28
Test Highlights
• Stream-Table join
• Stream-table join runs at ~50% throughput of project query

# Queries | msg/s | MB/s | msg size (bytes) | CPU % | MB Mem Max
2         | 88k   | 26   | 314              | 99.8  | 18,022
10        | 80k   | 24   | 314              | 99.8  | 19,931

29
Further Results
• A non-windowed aggregate on the same data ran at ~47k msgs/sec
• A windowed aggregate ran at ~24k msgs/sec (varies with window params)
• Re-partitioning can cut these results further

30
Miscellaneous Factors
• UDFs / UDAFs
• Scaling horizontally vs vertically
• Planning for elasticity

31
Take-Aways (1)
• Establish c, the baseline (“speed-of-light”) throughput of a simple select query
• Project and filter queries are cheap and fast
• Joins are slower, aggregates more so
• If select throughput (c) is 100%, then
  - Joins run at about 50% of c
  - Aggregates run at about 25% of c
  - Windowed aggregates run at ~10-15% of c
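
These rules of thumb can be turned into a rough planning estimate. The sketch below is only a restatement of the percentages above as integer arithmetic; the dictionary keys and function name are made up for illustration, and real throughput will vary with data, window parameters and re-partitioning.

```python
from typing import Dict, Tuple

# (low %, high %) of baseline select throughput c, per the rules of thumb.
RELATIVE_THROUGHPUT_PCT: Dict[str, Tuple[int, int]] = {
    "project_filter": (100, 100),
    "join": (50, 50),
    "aggregate": (25, 25),
    "windowed_aggregate": (10, 15),
}


def estimate_throughput(baseline_msgs_per_sec: int, query_type: str) -> Tuple[int, int]:
    """Return a (low, high) msg/s estimate for a query type, given baseline c."""
    low_pct, high_pct = RELATIVE_THROUGHPUT_PCT[query_type]
    return (baseline_msgs_per_sec * low_pct // 100,
            baseline_msgs_per_sec * high_pct // 100)


if __name__ == "__main__":
    c = 193_000  # measured single-server baseline from the test slides
    for query_type in RELATIVE_THROUGHPUT_PCT:
        print(query_type, estimate_throughput(c, query_type))
```

Against the measurements in this deck (c of about 193k msg/s), the estimates land close to the observed 88k for the stream-table join, 47k for the non-windowed aggregate and 24k for the windowed aggregate.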

32
Take-Aways (2)
• (de)serialization is the most expensive part of any query
• Use Avro message format
• Start with 4 CPU cores for “serious” message volumes
• Use SSD for any state stores (speed > size)
