This session covers Apache Kafka, its components, and how it works, along with best practices for building a fault-tolerant system with high availability and consistency by tuning Kafka brokers and producers.
2. Agenda
● Brief Introduction to Kafka - its need?
● Key Terminologies - Zookeeper, Broker, Topic, Partitions,
Offsets, and Replication.
● IT Team and Kafka Cluster Analogy.
● Summarize the Core components and responsibilities.
● Kafka Tuning: Availability & Consistency.
● Delivery Semantics.
● Producer Tuning - Configs. (BONUS)
● DEMO
3. Brief Introduction to Kafka - its need?
● Kafka is a horizontally scalable, fault tolerant, and fast messaging system. It’s a pub-
sub model in which various producers and consumers can write and read. It
decouples source and target systems.
Key characteristics:
● Scales to hundreds of nodes.
● Can handle millions of messages per second.
● Real-time processing (latency of about 10 ms).
4. Key Terminologies
● ZooKeeper is a centralized service for managing distributed systems. It acts as an
ensemble layer (ties things together) and ensures high availability of the Kafka
cluster.
● ZooKeeper stores metadata and the current state of the Kafka cluster. For example,
details like topic names, the number of partitions, replication, the leader of each
partition, and the In-Sync Replicas are stored in ZooKeeper. (In newer Kafka versions,
starting with the new consumer introduced in 0.9, consumer offsets are stored in an
internal Kafka topic and not in ZooKeeper.)
● Broker is a single Kafka node that is managed by ZooKeeper. A set of brokers forms a
Kafka cluster. Topics that are created in Kafka are distributed across brokers based
on the partition count, replication, and other factors.
Note: When a broker node fails, the cluster is automatically rebalanced based on the
state stored in ZooKeeper. If a leader partition is lost, one of the in-sync
follower partitions (ISR) is elected as the new leader.
● Topic is a specific stream of data. It is very similar to a table in a NoSQL database.
Like tables in a NoSQL database, a topic is split into partitions, which allow it
to be distributed across various nodes. Like primary keys in tables, topics have
offsets per partition. You can uniquely identify a message using its topic,
partition, and offset.
5. Key Terminologies
● Partitions enable topics to be distributed across the cluster. Partitions are the unit of
parallelism for horizontal scalability. One topic can have more than one partition,
scaling across nodes.
Messages are assigned to partitions based on the partition key. If a message has no
key, the producer chooses the partition itself (round-robin or sticky, depending on
the client version). It’s important to choose a key that spreads load evenly
to avoid hotspots.
● Offsets - Each message in a partition is assigned an incremental id called an offset.
Offsets are unique per partition and messages are ordered only within a
partition. Messages written to partitions are immutable.
Note: Messages are not ordered between multiple partitions.
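To make the key-to-partition mapping and per-partition offsets concrete, here is a minimal in-memory sketch. It is not Kafka client code: the `Topic` class and `choose_partition` helper are hypothetical names, and CRC32 is only a stand-in hash (the Java client's default partitioner uses murmur2).

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition; the same key always lands on the same partition.

    CRC32 is a stand-in for illustration only.
    """
    return zlib.crc32(key) % num_partitions


class Topic:
    """Hypothetical in-memory topic: one append-only list per partition."""

    def __init__(self, num_partitions: int):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key: bytes, value: str):
        partition = choose_partition(key, len(self.partitions))
        log = self.partitions[partition]
        offset = len(log)  # offsets are incremental and unique per partition
        log.append((key, value))
        return partition, offset


topic = Topic(num_partitions=3)
first = topic.append(b"user-1", "login")
second = topic.append(b"user-1", "logout")
# Same key -> same partition, so these two events keep their relative order.
print(first, second)
```

A message is then addressed by (topic, partition, offset); events with different keys may land on different partitions and have no ordering guarantee relative to each other.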
6. Key Terminologies
● Replication means keeping a copy of a partition on another broker. When a partition
of a topic is available on multiple brokers, one of the copies is elected as the
leader and the remaining replicas of the partition are followers.
● Replication enables Kafka to be fault tolerant: when a broker is down, a replica of
the partition on another broker is elected as leader and starts serving the producers
and consumer groups. Replicas that are in sync with the leader are flagged
as ISR (In-Sync Replicas).
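The failover behaviour described above can be sketched as a small state transition. This is a simplified model, not how the Kafka controller is implemented: `PartitionState` and `handle_broker_failure` are hypothetical names, and picking the first remaining ISR member as leader is an assumption made for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PartitionState:
    replicas: List[int]     # broker ids that host a copy of the partition
    isr: List[int]          # replicas currently in sync with the leader
    leader: Optional[int]   # broker id of the leader, None if offline


def handle_broker_failure(state: PartitionState, failed_broker: int) -> PartitionState:
    """Drop the failed broker from the ISR and elect a new leader if needed."""
    isr = [b for b in state.isr if b != failed_broker]
    leader = state.leader
    if leader == failed_broker:
        # Only an in-sync replica may take over (clean election).
        leader = isr[0] if isr else None
    return PartitionState(state.replicas, isr, leader)


state = PartitionState(replicas=[1, 2, 3], isr=[1, 2, 3], leader=1)
state = handle_broker_failure(state, failed_broker=1)
print(state.leader, state.isr)
```

With three replicas, the partition stays available after the leader's broker fails because a follower from the ISR takes over.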
8. Summarize the Core components and responsibilities.
● ZooKeeper manages Kafka brokers and their metadata.
● Brokers are horizontally scalable Kafka nodes that contain topics and their replicas.
● Topics are message streams with one or more partitions.
● Partitions contain messages with unique offsets per partition.
● Replication enables Kafka to be fault tolerant using follower partitions (ISRs).
9. Kafka Tuning: Availability & Consistency
● Cluster Size (N): the number of nodes/brokers in the Kafka cluster. Use 2x+1 nodes,
i.e. an odd number of at least 3.
● Partitions: a topic is divided into partitions (1 by default). Use M times N
partitions, where M is any integer >= 1, to achieve more parallelism and spread the
data over the cluster (only if ordering across the whole topic is not a concern).
● Replication Factor (RF): the number of copies (including the original/leader)
of each partition in the cluster. Each replica of a partition lives on a separate
node/broker, so RF can never exceed N; it should be at least 3.
We recommend an RF of 3 with a 3- or 5-node cluster. This provides both
availability and consistency.
● Minimum In-Sync Replicas (min.insync.replicas): the minimum number of replicas
(including the leader) that must be in sync for the producer to write successfully to
the partition. This inversely affects availability for the producer: the lower the
minimum ISR, the higher the availability and the lower the consistency, and vice versa.
The minimum ISR should always be lower than the RF. We recommend a minimum ISR of 2 for
topics with an RF of 3.
Note: Setting the minimum ISR to 1 is almost equivalent to having no replication in a system.
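The sizing rules on this slide can be expressed as a small checklist. The sketch below only encodes the recommendations stated above; `check_topic_settings` and `tolerated_failures` are hypothetical helpers, not part of any Kafka tool.

```python
from typing import List


def check_topic_settings(brokers: int, partitions: int,
                         replication_factor: int, min_isr: int) -> List[str]:
    """Return the slide's recommendations that the given settings violate."""
    problems = []
    if brokers < 3 or brokers % 2 == 0:
        problems.append("cluster size should be an odd number of at least 3")
    if partitions % brokers != 0:
        problems.append("partition count should be a multiple of the broker count")
    if replication_factor > brokers:
        problems.append("replication factor cannot exceed the broker count")
    if replication_factor < 3:
        problems.append("replication factor should be at least 3")
    if min_isr >= replication_factor:
        problems.append("minimum ISR should be lower than the replication factor")
    if min_isr < 2:
        problems.append("minimum ISR of 1 is almost like having no replication")
    return problems


def tolerated_failures(replication_factor: int, min_isr: int) -> int:
    """Replicas that can be lost while the minimum ISR is still satisfied."""
    return replication_factor - min_isr


print(check_topic_settings(brokers=3, partitions=6, replication_factor=3, min_isr=2))
print(tolerated_failures(replication_factor=3, min_isr=2))
```

For the recommended RF of 3 with a minimum ISR of 2, one replica can be down and producers can still write.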
10. Kafka Tuning: Availability & Consistency
● Acknowledgment (acks): the number of replicas a message must be written to before it is
acknowledged to the producer.
a. acks=0: the producer does not wait for any acknowledgment, so a message can be
lost without the producer noticing.
b. acks=1: the message must be written at least to the leader replica.
c. acks=all: the message must be written to all in-sync replicas, which
helps consistency but reduces availability.
Note: Setting acks to 0 or 1 can lead to data loss and inconsistent partitions. If the
leader fails, the next in-sync replica might not have the most recent messages, which
causes inconsistency in the order of events across replicas.
● Unclean Leader Election: if all in-sync replicas fail, an out-of-sync replica is elected
as leader. Setting this to TRUE is not recommended, as it sacrifices the
consistency of the system. Use it only if you need maximum availability
regardless of consistency.
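How acks interacts with the in-sync replica count can be modelled in a few lines. This is a simplified sketch under stated assumptions: `write_result` is a hypothetical function, and it only captures the rule that the minimum ISR is enforced for acks=all writes.

```python
from typing import Tuple


def write_result(acks: str, isr_size: int, min_insync_replicas: int) -> Tuple[bool, int]:
    """Return (accepted, copies_confirmed_at_ack_time) for one produce request.

    acks is "0", "1" or "all"; isr_size counts in-sync replicas including the leader.
    """
    if isr_size == 0:
        return (False, 0)  # no leader available, nothing can be written
    if acks == "0":
        return (True, 0)   # producer does not wait, nothing is confirmed
    if acks == "1":
        return (True, 1)   # only the leader's write is confirmed
    if isr_size < min_insync_replicas:
        return (False, 0)  # acks=all: too few in-sync replicas, write is rejected
    return (True, isr_size)


for acks in ("0", "1", "all"):
    print(acks, write_result(acks, isr_size=1, min_insync_replicas=2))
```

With only one in-sync replica left and a minimum ISR of 2, acks=all writes are refused (consistency over availability), while acks=0 and acks=1 writes still go through with weaker guarantees.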
11. Delivery Semantics
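● Acks = 0: At most once delivery semantics. The producer uses a “send and forget”
approach. High risk of data loss.
● Acks = 1: At least once delivery semantics (with retries). Moderate risk of data loss;
duplicates are possible.
● Acks = ALL: At least once delivery with the strongest durability (acknowledged by all
min.insync.replicas). No loss of acknowledged data. Exactly once delivery additionally
requires the idempotent or transactional producer (enable.idempotence=true).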
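A toy simulation can show where loss and duplicates come from. Everything here is hypothetical: `deliver` models a single partition log, failures are injected by sequence number, and the `dedupe` flag is an assumed stand-in for broker-side de-duplication of retried sends, not actual Kafka behaviour.

```python
from typing import FrozenSet, List


def deliver(messages: List[str], retry: bool,
            lost_send_on: FrozenSet[int] = frozenset(),
            lost_ack_on: FrozenSet[int] = frozenset(),
            dedupe: bool = False) -> List[str]:
    """Return the partition log after sending `messages` with injected failures.

    lost_send_on: sequence numbers whose first send never reaches the broker.
    lost_ack_on:  sequence numbers whose first ack never reaches the producer.
    """
    log: List[str] = []
    seen = set()
    for seq, msg in enumerate(messages):
        first_attempt = True
        while True:
            send_lost = first_attempt and seq in lost_send_on
            if not send_lost and not (dedupe and seq in seen):
                log.append(msg)  # broker writes the message
                seen.add(seq)
            acked = not send_lost and not (first_attempt and seq in lost_ack_on)
            if acked or not retry:
                break
            first_attempt = False  # producer retries the same message
    return log


msgs = ["a", "b", "c"]
print(deliver(msgs, retry=False, lost_send_on=frozenset({1})))              # at most once
print(deliver(msgs, retry=True, lost_ack_on=frozenset({1})))                # at least once
print(deliver(msgs, retry=True, lost_ack_on=frozenset({1}), dedupe=True))   # no duplicates
```

Without retries a lost send drops a message; with retries a lost acknowledgment causes the same message to be written twice unless retried sends are de-duplicated.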
12. Producer Tuning - Configs
● batch.size – maximum batch size (in bytes) per partition. The producer sends the batch
to the partition leader as soon as it is full, even if linger.ms has not elapsed.
● linger.ms – time to wait before sending the current batch. Once this time has elapsed,
the producer sends the batch to the broker even if it is not full.
● max.in.flight.requests.per.connection – the maximum number of unacknowledged requests
the producer sends on a single connection. Default is 5. Set this to 1 to avoid
out-of-order messages due to retries.
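As a rough illustration, the settings above could be collected like this. The keys are the producer configuration names from the slides; the values and the broker address are illustrative assumptions, and how the dictionary is handed to a producer depends on the client library. `should_send` is a hypothetical helper that mirrors the batch.size / linger.ms rule.

```python
# Illustrative values only; tune them for your own workload.
producer_config = {
    "bootstrap.servers": "localhost:9092",  # assumed address for a local broker
    "acks": "all",                          # wait for all in-sync replicas
    "batch.size": 16384,                    # bytes per partition batch
    "linger.ms": 10,                        # wait up to 10 ms to fill a batch
    "max.in.flight.requests.per.connection": 1,  # keep ordering across retries
}


def should_send(batch_bytes: int, waited_ms: int, config: dict) -> bool:
    """A batch is sent when it is full OR when linger.ms has elapsed."""
    return (batch_bytes >= config["batch.size"]
            or waited_ms >= config["linger.ms"])


print(should_send(batch_bytes=16384, waited_ms=0, config=producer_config))
print(should_send(batch_bytes=100, waited_ms=3, config=producer_config))
```

A larger batch.size and linger.ms favour throughput, while smaller values favour latency.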