What is Apache Kafka?
• It is an 🙌 open-source 🙌 event streaming platform.
• Simply put, it is a way of moving data between systems.
• (i.e. between applications, and servers)
3 key capabilities for event streaming:
• To publish (write) and subscribe to (read) streams of events, including
continuous import/export of data from other systems.
• To store streams of events durably and reliably for as long as a dev needs.
• To process streams of events as they occur or retrospectively.
What can I use event streaming for?
• To process payments and financial transactions in real-time
• stock exchanges
• To track and monitor cars, trucks, fleets, and shipments in real-time
• automotive industry
• To continuously capture and analyze sensor data from IoT devices/equipment
• wind parks
Event streaming – cont’d…
• To collect and immediately react to customer interactions and orders
• hotel and travel industry
• mobile applications
• To connect, store, and make available data produced by different divisions of
• To serve as the foundation for data platforms, event-driven architectures,
Where is Apache Kafka used?
• Netflix is using Kafka to apply
recommendations in real-time while
you're watching TV shows.
• Uber uses Kafka to gather user, taxi
and trip data in real-time to compute
and forecast demand and pricing in
• LinkedIn uses Kafka to prevent spam in
their platform, collect user
interactions and make better
T is for Topics and Twilight Sparkle
• A particular stream of data.
• A topic is identified by its name.
• Topics are split into partitions.
• Each partition is ordered.
• Each message within a partition gets an
incremental id, called offset.
• Central protagonist of the show
• Most intellectual member of the Mane Six
• Her cutie mark represents her talent
for magic and her love
for books and knowledge.
0 1 2 3 4
0 1 2 3
0 1 2 3 4 5 6
Topic Replication is the process to offer fail-over
capability for a topic.
• If one broker (holds topics and partitions) is
down, the other broker can serve the data.
• This replication factor defines the number of
copies of a topic in a Kafka cluster (made up of
multiple Kafka brokers).
• Kafka stores messages in topics that
are partitioned and replicated across
multiple brokers in a cluster.
B is for Brokers and Babs Seed
• Holds topics and partitions.
• Each broker is identified with its ID (integer).
• Each broker contains certain topic partitions.
• After connecting to any broker, you're
connected to an entire cluster.
• Apple Bloom's cousin from Manehattan.
• Former member of the Cutie Mark
P is for Partitions and Pinkie Pie
• An ordered, immutable record sequence.
• Once data is written to a partition, it can't be
• Data is assigned randomly to a partition, unless a key
• At any one time, only ONE broker can be a leader for
a given partition.
• That leader can receive and serve data for a
• The other brokers will just be passive
replicas and synchronize the data.
• Each partition is going to have one leader, and
multiple ISR (in-sync replica).
• Baker at Sugarcube Corner.
• Toothless pet alligator, Gummy.
• Represents the element of laughter.
P is also for Producers and Pound Cake
• How do we get data in Kafka?
• Producers write data to topics (which are made up of
• Producers automatically know to which broker and
partition to write to.
• Producers Message Keys: producers can choose to
send a key with a message. (i.e. string, number, .etc)
• If a key is sent, all messages for that key will
always go to the same partition.
• A key is sent if you need message ordering for a
specific field (i.e. truck_id).
• Parents are surprised to find out that
Pound Cake is a male Pegasus, and his twin
foal sister (Pumpkin Cake) is a female
unicorn… even though their parents are
both Earth ponies.
C is for Consumers and Princess Celestia
• How do we read data in Kafka?
• They read data from a topic (identified by name).
• Consumers know which broker to read from.
• In case of broker failures, consumers know how to recover.
• Data is read in order within each partition.
• Consumer Groups: consumers read data in consumer
• Each consumer within a group will read directly from
• If you have more consumers than partitions, some will
• Most magical pony.
• Responsible for raising the sun to create
light in Equestria.
• Over 1,000 years old!
Z is for Zookeeper and Zecora
• Keeps track of status of the Kafka cluster nodes.
• Also keeps track of Kafka topics and partitions.
• Currently, Apache Kafka® uses Apache ZooKeeper™ to
store its metadata (i.e. location of partitions, the
configuration of topics).
⚠️ Apache Kafka is removing the Apache ZooKeeper
• In 2019, they outlined a plan to break this
dependency and bring metadata management back
into Kafka itself.
• Female zebra shaman and herbalist.
• Always speaks in rhyme.
Parece que tiene un bloqueador de anuncios ejecutándose. Poniendo SlideShare en la lista blanca de su bloqueador de anuncios, está apoyando a nuestra comunidad de creadores de contenidos.
¿Odia los anuncios?
Hemos actualizado nuestra política de privacidad.
Hemos actualizado su política de privacidad para cumplir con las cambiantes normativas de privacidad internacionales y para ofrecerle información sobre las limitadas formas en las que utilizamos sus datos.
Puede leer los detalles a continuación. Al aceptar, usted acepta la política de privacidad actualizada.