10. 10
Customers Expect Rich
Digital Experiences
● Real-Time combined with historical data
● Only Event Streaming Platforms can do this
When will my driver
get here?
11. 11
Why Combine Real-Time With Historical Context?
Event-Driven App (Location Tracking): Where is my driver?
● Only real-time events
● Messaging queues and Event Streaming Platforms can do this
VS.
Contextual Event-Driven App (ETA): When will my driver get here? (2 min)
● Real-time combined with stored data
● Only Event Streaming Platforms can do this
12. 12
Contextual, Event-Driven Apps
in the Enterprise
“We look at events as running our business. Business people within our
organization want to be able to react to events—and oftentimes it's a
combination of events.”
—Chris D’Agostino, VP of Streaming Data
01 Real-Time Fraud Notifications
02 Real-Time “Second Look”
03 Automated Transaction Analysis
13. 13
Take Away #1
Event Streaming Platforms let
you build Contextual Event
Driven Applications combining
real time and historical data.
14. 14
An Event Streaming Platform
gives you three key functionalities
Publish & Subscribe
to Events
Store
Events
Process & Analyze
Events
25. 25
Delivery Guarantees
● Producer Guarantees
○ Acks = 0
○ Acks = 1
○ Acks = all
● Consumer Guarantees
○ At least once
○ At most once
○ Exactly once
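The difference between the at-most-once and at-least-once consumer guarantees comes down to whether the offset is committed before or after a record is processed. The sketch below is a plain-Python simulation, not Kafka client code: `run_consumer`, `commit_first`, and `crash_at` are hypothetical names, and the single crash-and-restart model is an assumption made for illustration. Exactly-once needs idempotent or transactional processing on top of this and is not shown.

```python
def run_consumer(log, commit_first, crash_at):
    """Simulate a consumer that crashes once, then restarts.

    commit_first=True  -> commit the offset, then process (at most once)
    commit_first=False -> process, then commit the offset (at least once)
    crash_at is the offset at which the consumer dies between the two steps.
    """
    processed = []
    committed = 0        # offset the consumer resumes from after a restart
    crashed = False
    offset = committed
    while offset < len(log):
        if commit_first:
            committed = offset + 1
            if offset == crash_at and not crashed:
                crashed = True
                offset = committed      # restart: the record is skipped
                continue
            processed.append(log[offset])
        else:
            processed.append(log[offset])
            if offset == crash_at and not crashed:
                crashed = True
                offset = committed      # restart: the record is re-read
                continue
            committed = offset + 1
        offset += 1
    return processed


at_most_once = run_consumer(["a", "b", "c"], commit_first=True, crash_at=1)
at_least_once = run_consumer(["a", "b", "c"], commit_first=False, crash_at=1)
print(at_most_once)   # record "b" is lost
print(at_least_once)  # record "b" is processed twice
```

In this model, committing first can lose a record and committing last can duplicate one, which is the trade-off the two guarantees describe.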
27. 27
An Event Streaming Platform
gives you three key functionalities
Publish & Subscribe
to Events
Store
Events
Process & Analyze
Events
28. Stream Processing by Analogy
Kafka Cluster
Connect API Stream Processing Connect API
$ cat < in.txt | grep "ksql" | tr a-z A-Z > out.txt
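The shell pipeline above can be mirrored with Python generators, where each stage consumes records one at a time and passes them on, much as a stream processor does. This is only an analogy sketch: `grep`, `upper`, and `pipeline` are hypothetical helper names, and the in-memory list stands in for the input file.

```python
def grep(lines, needle):
    """Pass through only the lines containing `needle` (like grep)."""
    for line in lines:
        if needle in line:
            yield line


def upper(lines):
    """Uppercase every line (like tr a-z A-Z)."""
    for line in lines:
        yield line.upper()


def pipeline(lines):
    """Chain the stages; nothing runs until the result is iterated."""
    return upper(grep(lines, "ksql"))


events = ["ksql is sql for kafka", "unrelated line", "try ksql today"]
result = list(pipeline(events))
print(result)
```

Because the stages are lazy, the same pipeline would work on an unbounded source, which is the point of the analogy.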
29. 29
Event Transformation with Stream Processing
Kafka Streams
Apache Kafka® library to write real-time applications and microservices in Java and Scala
Confluent KSQL
The streaming SQL engine for Apache Kafka®
You write only SQL. No Java, Python, or other boilerplate to wrap around it!
CREATE STREAM fraudulent_payments AS
SELECT * FROM payments
WHERE fraudProbability > 0.8;
31. 31
Topics vs. Streams and Tables
Storage Layer (Brokers)
Topic: 00100 11101 11000 00011 00100 00110
Processing Layer (KSQL, KStreams)
Stream (topic plus schema (serdes)): alice Paris | bob Sydney | alice Rome
Table (stream plus aggregation): alice Rome | bob Sydney
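The step from stream to table can be sketched in a few lines of Python, using the alice/bob records from the slide. This is a conceptual illustration rather than Kafka Streams or KSQL code; `to_table` and `count_by_key` are hypothetical helpers.

```python
from collections import Counter


def to_table(stream):
    """Fold a stream of (key, value) events into the latest value per key."""
    table = {}
    for key, value in stream:
        table[key] = value      # a later event for a key overwrites the earlier one
    return table


def count_by_key(stream):
    """A different aggregation over the same stream: events seen per key."""
    return Counter(key for key, _ in stream)


stream = [("alice", "Paris"), ("bob", "Sydney"), ("alice", "Rome")]
table = to_table(stream)
counts = count_by_key(stream)
print(table)
print(dict(counts))
```

The stream keeps all three events, while each table keeps one row per key, derived from those events by an aggregation.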
32. 32
Streams record history
“The ledger of Vish’s sales.”
Tables represent state
“Vish’s sales totals.” “California sales totals.”
33. 33
1. e4 e5
2. Nf3 Nc6
3. Bc4 Bc5
4. d3 Nf6
5. Nbd2
Streams record history
“The sequence of moves.”
Tables represent state
“The state of the board.”
35. 35
● Processing is partitioned
● Unit of parallelism is stream-task
Streams
topic with schema
Tables
underlying topic (usually) compacted
● Materialized view, cannot be mutated
● Implemented on top of a state-store (mutable)
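A rough sketch of what "processing is partitioned" means: records are routed to a partition by key, and each partition's records are handled by one task, so all events for a key are seen in order by the same task. The names `partition_for` and `assign` are hypothetical, and CRC32 is only a stand-in hash (Kafka's default partitioner uses murmur2 for keyed records).

```python
import zlib


def partition_for(key, num_partitions):
    """Map a key to a partition. CRC32 is a stand-in, not Kafka's murmur2."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign(records, num_partitions):
    """Group (key, value) records by partition, one list per stream task."""
    tasks = {p: [] for p in range(num_partitions)}
    for key, value in records:
        tasks[partition_for(key, num_partitions)].append((key, value))
    return tasks


records = [("alice", "Paris"), ("bob", "Sydney"), ("alice", "Rome")]
tasks = assign(records, num_partitions=3)
for partition, batch in tasks.items():
    print(partition, batch)
```

Whichever partition "alice" hashes to, both of her events land there and keep their relative order, which is what lets a task maintain per-key state locally.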
36. 36
Take Away #3
2 tools to process data: Kafka
Streams and KSQL
2 concepts in both: Streams
and Tables.
38. 38
KSQL for Real-Time Monitoring
● Log data monitoring
● Tracking and alerting
● Syslog data
● Sensor / IoT data
● Application metrics
CREATE STREAM syslog_invalid_users AS
SELECT host, message
FROM syslog
WHERE message LIKE '%Invalid user%';
http://cnfl.io/syslogs-filtering / http://cnfl.io/syslog-alerting
39. 39
KSQL for Anomaly Detection
● Identify patterns or anomalies in real-time data, surfaced in milliseconds
CREATE TABLE possible_fraud AS
SELECT card_number, COUNT(*)
FROM authorization_attempts
WINDOW TUMBLING (SIZE 5 SECONDS)
GROUP BY card_number
HAVING COUNT(*) > 3;
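To make the tumbling-window logic concrete, here is a minimal Python sketch of the same idea: bucket each attempt into a fixed 5-second window, count per card and window, and keep counts above 3. It works on a finished list, whereas KSQL updates the result continuously. The function name, the `(timestamp, card_number)` tuple shape, and the sample data are assumptions for illustration.

```python
from collections import Counter


def possible_fraud(attempts, window_size=5, threshold=3):
    """attempts: iterable of (timestamp_seconds, card_number) pairs.

    Returns {(card_number, window_start): count} for counts above threshold.
    """
    counts = Counter()
    for ts, card in attempts:
        # Tumbling windows do not overlap: each event falls in exactly one.
        window_start = (ts // window_size) * window_size
        counts[(card, window_start)] += 1
    return {key: n for key, n in counts.items() if n > threshold}


attempts = [
    (0, "4111"), (1, "4111"), (2, "4111"), (4, "4111"),  # 4 attempts in [0, 5)
    (3, "5500"),
    (6, "4111"), (7, "5500"),                            # next window, [5, 10)
]
flagged = possible_fraud(attempts)
print(flagged)
```

Only card "4111" in the window starting at 0 exceeds the threshold; its fifth attempt falls into the next window and starts a fresh count.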
40. 40
KSQL for Streaming ETL
● Joining, filtering, and
aggregating streams
of event data
CREATE STREAM vip_actions AS
SELECT c.user_id AS user_id, page, action
FROM clickstream c
LEFT JOIN users u
ON c.user_id = u.user_id
WHERE u.level = 'Platinum';
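The join above enriches each click with the user's current row from a table and then filters on it. A minimal Python sketch of that stream-table join follows, with a dict standing in for the `users` table. The function name and the sample records are hypothetical.

```python
def vip_actions(clickstream, users):
    """Enrich each click with its user row, keeping only Platinum users."""
    for click in clickstream:
        user = users.get(click["user_id"])   # None when no match (left join)
        # The filter on level also drops clicks that had no matching user.
        if user is not None and user["level"] == "Platinum":
            yield {
                "user_id": click["user_id"],
                "page": click["page"],
                "action": click["action"],
            }


users = {1: {"level": "Platinum"}, 2: {"level": "Gold"}}
clickstream = [
    {"user_id": 1, "page": "/home", "action": "view"},
    {"user_id": 2, "page": "/cart", "action": "add"},
    {"user_id": 3, "page": "/home", "action": "view"},   # unknown user
]
vip = list(vip_actions(clickstream, users))
print(vip)
```

Although the join is a LEFT JOIN, the filter on the user's level means unmatched clicks never reach the output.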
41. 41
KSQL is a stream processing technology
As such it is not yet a great fit for:
Ad-hoc queries
● No indexes yet in KSQL
● Kafka is often configured to retain data for only a limited time
BI reports (Tableau etc.)
● No indexes yet in KSQL
● No JDBC
● Most BI tools don’t understand
continuous, streaming results
42. 42
PUSH (to APP)
Jay’s credit score is 670
Jay’s credit score is 710
Jay’s credit score is 695
PULL (from APP)
What is Jay’s credit score now? 695
43. 43
PUSH
SELECT user, credit_score
FROM credit_history
WHERE ROWKEY = 'jay'
EMIT CHANGES;
PULL
SELECT user, credit_score
FROM credit_history
WHERE ROWKEY = 'jay';
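The push/pull distinction can be sketched with a small in-memory materialized table: a push subscriber is called on every change, while a pull lookup returns the current value once. This is a conceptual Python sketch, not the ksqlDB API. The `MaterializedTable` class and its methods are hypothetical.

```python
class MaterializedTable:
    """Keeps the latest value per key and notifies subscribers of changes."""

    def __init__(self):
        self._state = {}
        self._subscribers = {}

    def subscribe(self, key, callback):
        """Push: register a callback invoked on every future change to key."""
        self._subscribers.setdefault(key, []).append(callback)

    def apply(self, key, value):
        """Apply one event: update the state, then push it to subscribers."""
        self._state[key] = value
        for callback in self._subscribers.get(key, []):
            callback(value)

    def get(self, key):
        """Pull: point-in-time lookup of the current value."""
        return self._state.get(key)


scores = MaterializedTable()
pushed = []
scores.subscribe("jay", pushed.append)
for score in (670, 710, 695):
    scores.apply("jay", score)

current = scores.get("jay")
print(pushed)    # every change, in order
print(current)   # only the latest value
```

The push side sees the full sequence of credit scores, and the pull side sees only the latest one, matching the two queries above.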
44. 44
ksqlDB adds two key features to augment KSQL
PULL QUERIES
● Point-in-time lookup of information
● Comparable to a SELECT
statement in a relational database
EMBEDDED CONNECTORS
● Move event data to and from
external data systems
● Available for all supported
connectors
[Diagram: an APP issues a PULL query to ksqlDB ("How much does Jay’s ride cost?", answered with $25); ksqlDB exchanges data with external systems through CONNECTORs]
45. 46
So, what use cases is ksqlDB a good fit for?
It does not replace traditional databases:
● What is a database?
● Materialize events into an opinionated structure (table) so you get the power of SQL
● When we query, we are querying the state produced by the processor executing the
commit log; in effect, we have recreated materialized views.
46. 47
So, what use cases is ksqlDB a good fit for?
ksqlDB is primarily useful for three broad categories of applications:
● Building and serving materialized views that power apps
● Creating real-time streaming apps that react to event streams and trigger side effects
● Creating real-time streaming pipelines that continuously transform event streams
47. 48
Summary Takeaways
● Event Streaming Platforms let you build Contextual Event Driven
Applications combining real time and historical data.
● Kafka lets you publish/subscribe to events and also store them
● Process data with Kafka Streams or KSQL using Streams and Tables
● ksqlDB makes it easy to build and serve materialized views that power apps
48. 49
Thank You!
Reach out if you have any questions:
● Vish Srinivasan - vish@confluent.io
Community Slack: https://launchpass.com/confluentcommunity
Learn Kafka - https://kafka-tutorials.confluent.io/
ksqlDB - https://ksqldb.io/