Eventing and streaming open a world of compelling new possibilities to our software and platform designs. They can reduce time to decision and action while lowering total platform cost. But they are not a panacea. Understanding the edges and limits of these architectures can help you avoid painful missteps. This talk will focus on event-driven and streaming architectures and how Apache Kafka can help you implement these. It will also discuss key tradeoffs you will face along the way from partitioning schemes to the impact of availability vs. consistency (CAP Theorem). Finally, we’ll discuss some challenges of scale for patterns like Event Sourcing and how you can use other tools and even features of Kafka to work around them. This talk assumes a basic understanding of Kafka and distributed computing but will include brief refresher sections.
Hard Truths About Streaming and Eventing (Dan Rosanova, Microsoft) Kafka Summit NYC 2019
1. Hard Truths About
Eventing and Streaming
Dan Rosanova
Group Principal Program Manager
Microsoft Azure Messaging
2. A brief history of messaging
• Old school messaging (System/360 QTAM and TCAM)
• IBM MQ
• RabbitMQ
• Service Bus
• ActiveMQ™
• ZeroMQ
• Apache Kafka®
• NATS
6. The queue is the arbiter of truth – which
simplifies many other aspects
• Each reader can just say 'give me the next'
• Messages are acked / completed individually
• The queue is a buffer to improve scale and
performance
20. There's something else that resembles this
• A cassette tape records a stream – recording
moves forward only
• You can play the tape over and over again
• A cassette tape actually has left and right
channels
• When you press record, they both record – but
the data on each channel is different
• In Kafka these channels are called partitions
21. A bit more on the partition concept
• Partition is essentially append only
• Reads are performed using a client-side cursor
• Reads are nondestructive
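The append-only log plus client-side cursor can be sketched in a few lines. This is a toy model with invented names, not the Kafka client API; it only illustrates why nondestructive reads and per-consumer offsets go together:

```python
class Partition:
    """A partition as an append-only log of records."""

    def __init__(self):
        self._log = []  # append-only: records are never mutated or removed

    def append(self, record):
        self._log.append(record)
        return len(self._log) - 1  # offset of the new record

    def read(self, offset):
        return self._log[offset]   # nondestructive: the record stays in the log


class Consumer:
    """Each consumer keeps its own offset; the log stores no per-message state."""

    def __init__(self, partition):
        self.partition = partition
        self.offset = 0  # the client-side cursor

    def poll(self):
        if self.offset >= len(self.partition._log):
            return None
        record = self.partition.read(self.offset)
        self.offset += 1
        return record


p = Partition()
for r in ["a", "b", "c"]:
    p.append(r)

c1, c2 = Consumer(p), Consumer(p)
print(c1.poll(), c1.poll())  # a b
print(c2.poll())             # a  -- c2's cursor is independent of c1's
```

Because reads only advance a cursor, any number of consumers can replay the same records without coordinating with each other.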
22. In a stream the partition is the Unit of Work
Streams are processed differently from batch data – normal functions cannot
operate on streams as a whole, since they hold potentially unlimited data;
formally, streams are codata (potentially unlimited), not data (which is finite).
27. Low cost
• There are no expensive indexes to maintain
• Because each partition is independent there is
no cross broker coordination necessary (other
than optional replication)
• Client-side cursor avoids the overhead of
traditional message brokers
• Data replication and ACK level is a choice of the
sender
31. Fan out and routing
• Partitioned streams (like Kafka) don’t offer
server-side filtering
• Every reader must read all the data
• As more readers want the data a network
imbalance develops
• Parse.ly Kafkapocalypse
(Diagram: each of several consumers pulls the full 10 MBps stream, so broker
egress grows to N MBps as readers are added.)
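The back-of-the-envelope math behind the diagram: without server-side filtering, every consumer reads everything, so broker egress grows linearly with the reader count. Numbers here are illustrative:

```python
# Ingress: what producers write to the broker.
produce_rate_mbps = 10

# Each reader must pull the full stream.
readers = 4

egress_mbps = produce_rate_mbps * readers
print(egress_mbps)  # 40 -- four times the ingress leaves the broker
```

This linear growth is what bit Parse.ly in the "Kafkapocalypse": adding consumers multiplied network load until the cluster saturated.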
32. Streams are not queues
• The Unit of Work is not an individual message
• This means processing individual messages
gets complicated
• Cursor management becomes a big challenge
• There is no inherent dead letter capability
• People start adding these ‘features’ in and end
up recreating a queue
33. CAP Theorem
In theoretical computer science the CAP theorem states that
it is impossible for a distributed computer system to
simultaneously provide all three of the following guarantees:
Consistency, Availability, Partition tolerance
34. What does CAP mean for streams?
Consistency: Data should
produce the same results
when read multiple times –
i.e. it should be stable and
durable
Availability: The place data is
written to should always be
available to write to
Partition tolerance: the
ability to continue
functioning when one part of
the system becomes
separated from another
35. Or put another way
when a network partition happens, which over
time is inevitable, then you must make a
choice...
36. This is your last chance.
After this, there is no turning back...
Consistency
37. You must decide which of these two is most
significant
Consistency Availability
38. Partitioning schemes
Not all keys are created
equal
You need to be careful to
avoid hot keys
It’s not always something
you can avoid
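A hot key is easy to demonstrate with key-based partition assignment. Kafka's default partitioner hashes the key with murmur2; CRC32 below is just a stand-in deterministic hash for illustration, and all names are invented:

```python
import zlib
from collections import Counter


def partition_for(key: bytes, num_partitions: int) -> int:
    """Assign a record to a partition by hashing its key (CRC32 as a stand-in)."""
    return zlib.crc32(key) % num_partitions


# A skewed key distribution: one "hot" key dominates the traffic.
events = [b"user-hot"] * 90 + [b"user-%d" % i for i in range(10)]

load = Counter(partition_for(k, 4) for k in events)
print(load)  # one partition carries at least 90% of the records
```

However evenly the hash spreads distinct keys, it cannot spread the traffic of a single key: all 90 `user-hot` records land on the same partition.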
40. Adding partitions
You’ve identified a hot
partition
You add more partitions
to handle the scale
The result is a data split
(Diagram: records 1–4 sit on Partition 1; after the new partition is added,
records 5–7 land on Partition 2, so related data is now split across partitions.)
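Why the split happens: partition assignment is hash-mod-N, so changing N remaps a large fraction of keys. A sketch (CRC32 again stands in for Kafka's murmur2; the effect is the same):

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    return zlib.crc32(key) % num_partitions


keys = [b"order-%d" % i for i in range(1000)]

# Count how many keys land on a different partition after going from 2 to 3.
moved = sum(1 for k in keys if partition_for(k, 2) != partition_for(k, 3))
print(f"{moved} of {len(keys)} keys change partition when going 2 -> 3")
```

Roughly two thirds of keys move in this 2-to-3 case, which is why ordering and locality guarantees for a key only hold within one partition-count era of the topic.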
43. Strategies for dealing with failures in
messaging and streaming
Stop • Drop • Retry • Deadletter
44. Stop
• Simply stop reading – or writing the stream
• Wait until someone elsewhere has fixed the
problem and then resume
• Appropriate for some scenarios, but not all
• Probably a good idea to include a notification
45. Drop
• If the messages aren’t that important, just drop
them
• Up to a certain point they may not matter
• This is a good strategy for non-mission critical
streams
• But not so good for scenarios requiring strong
consistency guarantees
• Definitely a good idea to include a notification
46. Retry
• Try again and see if it works
• Perhaps the error is transient
• Be aware of impact on downstream systems -
idempotence
47. Deadletter
• Put the data somewhere off your hot path so
that you can go back and handle it later
• Does not interrupt your flow
• Works for poisoned messages
48. Combining strategies
• Often no one strategy will exactly match your
needs
• You can combine these to achieve the policy
that is right for you
• E.G. Retry three times, then deadletter
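The "retry three times, then deadletter" policy fits in a few lines. A minimal sketch with invented names (in a real system the dead-letter store would be another topic or queue, not a list):

```python
def process_with_policy(record, handler, retries=3, dead_letters=None):
    """Retry the handler a few times, then park the record off the hot path."""
    for attempt in range(retries):
        try:
            return handler(record)
        except Exception as exc:
            last_error = exc
    # Poisoned message: dead-letter it and keep the cursor moving.
    if dead_letters is not None:
        dead_letters.append((record, str(last_error)))
    return None


dead_letters = []


def flaky_handler(record):
    if record == "bad":
        raise ValueError("cannot parse")
    return record.upper()


for r in ["ok", "bad", "fine"]:
    process_with_policy(r, flaky_handler, dead_letters=dead_letters)

print(dead_letters)  # [('bad', 'cannot parse')]
```

Note that the stream keeps flowing past the poisoned record, which is the whole point: the failure is isolated without stopping or dropping everything behind it.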
50. What are event driven architectures
• Events are notifications that something
happened
• This is different from traditional messages,
which are the thing itself (the command)
• Event Driven Architectures are reactive in
nature
• State is derived from an event log or stream
51. Event Sourcing
• Add head
• Add body
• Add left arm
• Add right arm
• Add left leg
• Add right leg
53. Capabilities we've gained from Event Sourcing
• Complete rebuild
• Temporal query
• Event replay
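All three capabilities fall out of one idea: state is a fold over the event log, so a complete rebuild and a temporal query are the same replay with a different slice of the log. A sketch using the slide's events (names illustrative):

```python
events = ["add head", "add body", "add left arm",
          "add right arm", "add left leg", "add right leg"]


def rebuild(log):
    """Derive state by replaying the log from the start (a fold over events)."""
    state = []
    for e in log:
        state.append(e.removeprefix("add "))  # apply each event in order
    return state


print(rebuild(events))      # complete rebuild: the full figure
print(rebuild(events[:2]))  # temporal query: state as of the second event
```

Event replay is the same `rebuild` pointed at a copy of the log, possibly feeding a different (or fixed) version of the apply logic.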
54. What cool things can you do now?
• Add head
• Add body
• Add left arm
• Add right arm
• Add left leg
• Add right leg
56. Obvious shortcomings of Event Sourcing and
how to overcome them
• Time to process the log: checkpointing on a
regular basis
• How to query the state: building a
materialized view
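Both fixes can be sketched together: maintain a materialized view as events arrive, and periodically checkpoint it so a restart replays from the snapshot instead of offset 0. The dict-based "view" stands in for a real store such as RocksDB; all names are invented:

```python
events = [("acct-1", +100), ("acct-2", +50), ("acct-1", -30), ("acct-1", +5)]

view = {}          # materialized view: current balance per account
checkpoint = None  # (offset, snapshot) written periodically

for offset, (account, delta) in enumerate(events):
    view[account] = view.get(account, 0) + delta  # keep the view current
    if offset % 2 == 1:                           # checkpoint every 2 events
        checkpoint = (offset, dict(view))

print(view)        # {'acct-1': 75, 'acct-2': 50} -- queryable without replay
print(checkpoint)  # resume replay from here instead of the log's start
```

Queries hit the view directly; recovery loads the latest snapshot and replays only the events after its offset.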
57. Event Sourcing leads to divergent models for
read and write
This is often addressed with Command Query Responsibility Segregation (CQRS)
Despite these benefits, you should be very cautious about using CQRS. Many
information systems fit well with the notion of an information base that is updated
in the same way that it's read, adding CQRS to such a system can add significant
complexity. I've certainly seen cases where it's made a significant drag on
productivity, adding an unwarranted amount of risk to the project, even in the
hands of a capable team.
-Martin Fowler
58. KStreams can help you do Event Sourcing
• Basically a way to do Event Sourcing without
being an architectural astronaut
• Provides a materialized view (uses RocksDB
internally to hold the table)
• Each application can now have its own view of
the stream
60. A specification for describing event
data in common formats to provide
interoperability across services,
platforms and systems.
61. Why Cloud Events?
• Consistency: the lack of a common way of
describing events means developers must
constantly re-learn how to receive events
• Accessibility: this also limits the potential for
libraries, tooling and infrastructure to aid the
delivery of event data across environments
• Portability: the portability and productivity we
can achieve from event data is hindered overall
62. Sample Cloud Event
• These are the rules for
the envelope
• The data section is
opaque
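The slide's sample image isn't reproduced here, but a hand-written CloudEvents 1.0 envelope looks like the following. The four required attributes come from the spec; all field values (and the payload) are made up for illustration:

```python
import json

event = {
    "specversion": "1.0",                     # required
    "type": "com.example.order.created",      # required
    "source": "/orders/service-a",            # required
    "id": "A234-1234-1234",                   # required
    "time": "2019-04-02T12:00:00Z",           # optional
    "datacontenttype": "application/json",    # optional
    "data": {"orderId": 42, "total": 19.99},  # opaque payload
}

# The envelope rules apply only to the attributes above; intermediaries
# never need to understand the data section.
required = {"specversion", "type", "source", "id"}
assert required <= event.keys()

print(json.dumps(event, indent=2))
```

Routers and brokers can filter and dispatch on the envelope attributes alone, which is what makes the opaque `data` section workable.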
63. Combining Events and Streams
•Events can be fed into a stream
•Stream processors can produce their own events
(Diagram: events feed a stream; a processor f(x) produces new events.)
64. Key differences between events and streams
Events as the records, streams as the
communication mechanism
65. Key differences between events and streams
• Dispatch and how you can do this in Kafka
• Push and other ways to accomplish it
(Diagram: streams are pull-based and fan in; push-based dispatch fans out.)
66. In closing
• Pick the right tool for the job
• You may need multiple tools
• Be realistic about your expectations
• Experiment and learn – continuously
• Share your learnings in contributions, blogs, etc.
• Be an active member of the Apache Kafka
community!