SlideShare una empresa de Scribd logo
1 de 48
1
Stream Processing
fundamentals and
Introduction to ksqlDB
Vish Srinivasan
Systems Engineer
2
Agenda
● Motivation for Event Streaming ~5
● Kafka 101 ~10
● Stream Processing ~20
1. Kafka Streams
2. KSQL
● ksqlDB - An introduction ~10
● Q&A ~10
33
Event Streaming Motivation
44
The Central Challenge
Connecting applications with data
ETL
What happened
in the world
Messaging
What is happening in
the world
5
Motivation
6
7
STATEEVENT >
I changed my job from
Snaplogic to Confluent
in April 2019
I work at Confluent.
8
JOB CHANGE RECOMMENDATION ENGINE
SEARCH INDEX
EMAIL SERVICE
9
IS MORE
SOFTWARE
THE USER OF
THE SOFTWARE
1010
Customers Expect Rich
Digital Experiences
● Real-Time combined with historical data
● Only Event Streaming Platforms can do this
When will my driver
get here?
1111
Event-Driven App
(Location Tracking)
Only Real-Time Events
Messaging Queues and
Event Streaming
Platforms can do this
Contextual
Event-Driven App
(ETA)
Real-Time combined
with stored data
Only Event Streaming
Platforms can do this
Where is my driver? When will my driver
get here?
Where is my driver? When will my driver
get here?
2
min
Why Combine Real-time
With Historical Context?
VS.
12
Contextual, Event-Driven Apps
in the Enterprise
“We look at events as running our business. Business people within our
organization want to be able to react to events—and oftentimes it's a
combination of events.”
—Chris D’Agostino, VP of Streaming Data
01
Real-Time
Fraud Notifications
03
Automated
Transaction Analysis
02
Real-Time
“Second Look”
13
Take Away #1
Event Streaming Platforms let
you build Contextual Event
Driven Applications combining
real time and historical data.
14
An Event Streaming Platform
gives you three key functionalities
Publish & Subscribe
to Events
Store
Events
Process & Analyze
Events
1515
… But first, Kafka Basics
16
Kafka is a Foundation for Event Streams
0 1 2 3 4 5 6 7 8LOG
READS
WRITES
DESTINATION
SYSTEM A
DESTINATION
SYSTEM B
17
BROKER 1 BROKER 2 BROKER 3 BROKER 4
TOPIC 1-PART
1
Storage: Distributed and Replicated
TOPIC 2-PART
2
TOPIC 2-PART
1
TOPIC 1-PART
2
TOPIC 1-PART
1
TOPIC 2-PART
2
TOPIC 2-PART
1
TOPIC 1-PART
2
TOPIC 1-PART
1
TOPIC 2-PART
2
TOPIC 2-PART
1
TOPIC 1-PART
2
2 topics, 2 partitions each, 3 replicas each
PRODUCER
CONSUMER
18
Producing to Kafka
19
Messages will be produced
in a round robin fashion
Written to leader of a
partitions
Producing to Kafka
Time
1
2
3
4
5
20
A
B
C
D
hash(key) %
numPartitions = N
Producing to Kafka with a Key
Time
21
Consuming from Kafka
22
C
Consuming with Single Client
23
C
C
C
C
Consuming with Consumer Groups
Logical Name
Load balanced across all consumers in the group
24
C
CCC
CG1
CC
CG2
Consuming with Consumer Groups
25
Delivery Guarantees
● Producer Guarantees
○ Acks = 0
○ Acks = 1
○ Acks = all
● Consumer Guarantees
○ At least once
○ At most once
○ Exactly once
26
Take Away #2
Kafka lets you
publish/subscribe to events as
well as store events.
27
An Event Streaming Platform
gives you three key functionalities
Publish & Subscribe
to Events
Store
Events
Process & Analyze
Events
Stream Processing by Analogy
Kafka Cluster
Connect API Stream Processing Connect API
$ cat < in.txt | grep “ksql” | tr a-z A-Z > out.txt
29
Event Transformation with Stream Processing
streams
The streaming SQL engine for Apache Kafka®
CREATE STREAM fraudulent_payments AS
SELECT * FROM payments
WHERE fraudProbability > 0.8;
Apache Kafka® library to write
real-time applications and
microservices in Java and Scala
Confluent KSQL
You write only SQL. No Java, Python, or
other boilerplate to wrap around it!
30
subscribe(), poll(), send(),
flush(), beginTransaction(), …
KStream, KTable, filter(), map(),
flatMap(), join(), aggregate(),
transform(), …
CREATE STREAM, CREATE TABLE,
SELECT, JOIN, GROUP BY, SUM, …
Shoulders of Streaming Giants
KSQL UDFs
31
Processing Layer
(KSQL, KStreams)
31
00100 11101 11000 00011 00100 00110Topic
alice Paris bob Sydney alice RomeStream
plus schema (serdes)
alice Rome
bob Sydney
Table
plus aggregation
Storage Layer
(Brokers)
Topics vs. Streams and Tables
32
“The ledger of Vish’s sales.” “Vish’s sales totals.”
“California sales totals.”
Streams
record history
Tables
represent state
33
1. e4 e5
2. Nf3 Nc6
3. Bc4 Bc5
4. d3 Nf6
5. Nbd2
“The sequence of moves.” “The state of the board.”
Streams
record history
Tables
represent state
34
Another analogy: Behavioral psychology
35
● Processing is partitioned
● Unit of parallelism is stream-task
Streams
topic with schema
Tables
underlying topic (usually) compacted
● Materialized view, cannot be mutated
● Implemented on top of a state-store (mutable)
36
Take Away #3
2 tools to process data: Kafka
Streams and KSQL
2 concepts in both: Streams
and Tables.
3737
… Now, ksqlDB
38
KSQL for Real-Time Monitoring
● Log data monitoring
● Tracking and alerting
● Syslog data
● Sensor / IoT data
● Application metrics
CREATE STREAM syslog_invalid_users AS
SELECT host, message
FROM syslog
WHERE message LIKE '%Invalid user%';
http://cnfl.io/syslogs-filtering / http://cnfl.io/syslog-alerting
39
KSQL for Anomaly Detection
● Identify patterns or
anomalies in real-
time data, surfaced
in milliseconds
CREATE TABLE possible_fraud AS
SELECT card_number, COUNT(*)
FROM authorization_attempts
WINDOW TUMBLING (SIZE 5 SECONDS)
GROUP BY card_number
HAVING COUNT(*) > 3;
40
KSQL for Streaming ETL
● Joining, filtering, and
aggregating streams
of event data
CREATE STREAM vip_actions AS
SELECT user_id, page, action
FROM clickstream c
LEFT JOIN users u
ON c.user_id = u.user_id
WHERE u.level = 'Platinum';
41
KSQL is a stream processing technology
As such it is not yet a great fit for:
Ad-hoc queries
● No indexes yet in KSQL
● Kafka often configured to retain
data for only a limited span of
time
BI reports (Tableau etc.)
● No indexes yet in KSQL
● No JDBC
● Most BI tools don’t understand
continuous, streaming results
42
PUSH PULL
APP
Jay’s credit score is
670
Jay’s credit score is
710
Jay’s credit score is
695
What is Jay’s credit score now?
695
APP
43
PUSH PULL
SELECT user, credit_score
FROM credit_history
WHERE ROWKEY = ‘jay’
EMIT CHANGES;
SELECT user, credit_score
FROM credit_history
WHERE ROWKEY = ‘jay’;
44
ksqlDB adds two key features to augment KSQL
PULL QUERIES
● Point-in-time lookup of information
● Comparable to a SELECT
statement in a relational database
EMBEDDED CONNECTORS
● Move event data to and from
external data systems
● Available for all supported
connectors
21
APPPULL
$25
How much does Jay’s ride
cost?
CONNECTOR
CONNECTOR
ksqlDB
CONNECTOR
46
So, What use cases is ksqlDB a good fit for?
It does not replace traditional databases:
● What is a database?
● Materialize events into an opinionated structure (table) so you get power of SQL
● When we query, We are querying the state produced by the processor executing the
commit log - we just recreated materialized views.
47
So, What use cases is ksqlDB a good fit for?
ksqlDB is primarily useful for three broad categories of applications:
● Building and serving materialized views that power apps
● Creating real-time streaming apps that react to event streams and trigger side effects
● Creating real-time streaming pipelines that continuously transform event streams
48
Summary Takeaways
● Event Streaming Platforms let you build Contextual Event Driven
Applications combining real time and historical data.
● Kafka lets you publish/subscribe to events and also store them
● Process data with Kafka Streams or KSQL using Streams and Tables
● ksqlDB makes it easy to build and serve materialized views that power apps
49
Thank You!
Reach out if you have any questions:
● Vish Srinivasan - vish@confluent.io
Community Slack: https://launchpass.com/confluentcommunity
Learn Kafka - https://kafka-tutorials.confluent.io/
ksqlDB - https://ksqldb.io/

Más contenido relacionado

La actualidad más candente

APACHE KAFKA / Kafka Connect / Kafka Streams
APACHE KAFKA / Kafka Connect / Kafka StreamsAPACHE KAFKA / Kafka Connect / Kafka Streams
APACHE KAFKA / Kafka Connect / Kafka StreamsKetan Gote
 
Confluent Workshop Series: ksqlDB로 스트리밍 앱 빌드
Confluent Workshop Series: ksqlDB로 스트리밍 앱 빌드Confluent Workshop Series: ksqlDB로 스트리밍 앱 빌드
Confluent Workshop Series: ksqlDB로 스트리밍 앱 빌드confluent
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka StreamsGuozhang Wang
 
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQL
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQLBuilding a Real-time Streaming ETL Framework Using ksqlDB and NoSQL
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQLScyllaDB
 
Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?confluent
 
Kafka streams windowing behind the curtain
Kafka streams windowing behind the curtain Kafka streams windowing behind the curtain
Kafka streams windowing behind the curtain confluent
 
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...confluent
 
Best Practices for Streaming IoT Data with MQTT and Apache Kafka
Best Practices for Streaming IoT Data with MQTT and Apache KafkaBest Practices for Streaming IoT Data with MQTT and Apache Kafka
Best Practices for Streaming IoT Data with MQTT and Apache KafkaKai Wähner
 
Intelligent Auto-scaling of Kafka Consumers with Workload Prediction | Ming S...
Intelligent Auto-scaling of Kafka Consumers with Workload Prediction | Ming S...Intelligent Auto-scaling of Kafka Consumers with Workload Prediction | Ming S...
Intelligent Auto-scaling of Kafka Consumers with Workload Prediction | Ming S...HostedbyConfluent
 
Benefits of Stream Processing and Apache Kafka Use Cases
Benefits of Stream Processing and Apache Kafka Use CasesBenefits of Stream Processing and Apache Kafka Use Cases
Benefits of Stream Processing and Apache Kafka Use Casesconfluent
 
Citi Tech Talk Disaster Recovery Solutions Deep Dive
Citi Tech Talk  Disaster Recovery Solutions Deep DiveCiti Tech Talk  Disaster Recovery Solutions Deep Dive
Citi Tech Talk Disaster Recovery Solutions Deep Diveconfluent
 
Thoughts on kafka capacity planning
Thoughts on kafka capacity planningThoughts on kafka capacity planning
Thoughts on kafka capacity planningJamieAlquiza
 
Tuning kafka pipelines
Tuning kafka pipelinesTuning kafka pipelines
Tuning kafka pipelinesSumant Tambe
 
Microservices Architecture - Cloud Native Apps
Microservices Architecture - Cloud Native AppsMicroservices Architecture - Cloud Native Apps
Microservices Architecture - Cloud Native AppsAraf Karsh Hamid
 
Improving Kafka at-least-once performance at Uber
Improving Kafka at-least-once performance at UberImproving Kafka at-least-once performance at Uber
Improving Kafka at-least-once performance at UberYing Zheng
 
A Deep Dive into Kafka Controller
A Deep Dive into Kafka ControllerA Deep Dive into Kafka Controller
A Deep Dive into Kafka Controllerconfluent
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache KafkaJeff Holoman
 
Common issues with Apache Kafka® Producer
Common issues with Apache Kafka® ProducerCommon issues with Apache Kafka® Producer
Common issues with Apache Kafka® Producerconfluent
 
Blazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBlazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBrendan Gregg
 

La actualidad más candente (20)

APACHE KAFKA / Kafka Connect / Kafka Streams
APACHE KAFKA / Kafka Connect / Kafka StreamsAPACHE KAFKA / Kafka Connect / Kafka Streams
APACHE KAFKA / Kafka Connect / Kafka Streams
 
Confluent Workshop Series: ksqlDB로 스트리밍 앱 빌드
Confluent Workshop Series: ksqlDB로 스트리밍 앱 빌드Confluent Workshop Series: ksqlDB로 스트리밍 앱 빌드
Confluent Workshop Series: ksqlDB로 스트리밍 앱 빌드
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
 
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQL
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQLBuilding a Real-time Streaming ETL Framework Using ksqlDB and NoSQL
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQL
 
Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?
 
Kafka streams windowing behind the curtain
Kafka streams windowing behind the curtain Kafka streams windowing behind the curtain
Kafka streams windowing behind the curtain
 
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
 
Best Practices for Streaming IoT Data with MQTT and Apache Kafka
Best Practices for Streaming IoT Data with MQTT and Apache KafkaBest Practices for Streaming IoT Data with MQTT and Apache Kafka
Best Practices for Streaming IoT Data with MQTT and Apache Kafka
 
Intelligent Auto-scaling of Kafka Consumers with Workload Prediction | Ming S...
Intelligent Auto-scaling of Kafka Consumers with Workload Prediction | Ming S...Intelligent Auto-scaling of Kafka Consumers with Workload Prediction | Ming S...
Intelligent Auto-scaling of Kafka Consumers with Workload Prediction | Ming S...
 
Benefits of Stream Processing and Apache Kafka Use Cases
Benefits of Stream Processing and Apache Kafka Use CasesBenefits of Stream Processing and Apache Kafka Use Cases
Benefits of Stream Processing and Apache Kafka Use Cases
 
Citi Tech Talk Disaster Recovery Solutions Deep Dive
Citi Tech Talk  Disaster Recovery Solutions Deep DiveCiti Tech Talk  Disaster Recovery Solutions Deep Dive
Citi Tech Talk Disaster Recovery Solutions Deep Dive
 
Unified Stream and Batch Processing with Apache Flink
Unified Stream and Batch Processing with Apache FlinkUnified Stream and Batch Processing with Apache Flink
Unified Stream and Batch Processing with Apache Flink
 
Thoughts on kafka capacity planning
Thoughts on kafka capacity planningThoughts on kafka capacity planning
Thoughts on kafka capacity planning
 
Tuning kafka pipelines
Tuning kafka pipelinesTuning kafka pipelines
Tuning kafka pipelines
 
Microservices Architecture - Cloud Native Apps
Microservices Architecture - Cloud Native AppsMicroservices Architecture - Cloud Native Apps
Microservices Architecture - Cloud Native Apps
 
Improving Kafka at-least-once performance at Uber
Improving Kafka at-least-once performance at UberImproving Kafka at-least-once performance at Uber
Improving Kafka at-least-once performance at Uber
 
A Deep Dive into Kafka Controller
A Deep Dive into Kafka ControllerA Deep Dive into Kafka Controller
A Deep Dive into Kafka Controller
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
Common issues with Apache Kafka® Producer
Common issues with Apache Kafka® ProducerCommon issues with Apache Kafka® Producer
Common issues with Apache Kafka® Producer
 
Blazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBlazing Performance with Flame Graphs
Blazing Performance with Flame Graphs
 

Similar a Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)

Concepts and Patterns for Streaming Services with Kafka
Concepts and Patterns for Streaming Services with KafkaConcepts and Patterns for Streaming Services with Kafka
Concepts and Patterns for Streaming Services with KafkaQAware GmbH
 
APAC ksqlDB Workshop
APAC ksqlDB WorkshopAPAC ksqlDB Workshop
APAC ksqlDB Workshopconfluent
 
Introduction to KSQL: Streaming SQL for Apache Kafka®
Introduction to KSQL: Streaming SQL for Apache Kafka®Introduction to KSQL: Streaming SQL for Apache Kafka®
Introduction to KSQL: Streaming SQL for Apache Kafka®confluent
 
All Streams Ahead! ksqlDB Workshop ANZ
All Streams Ahead! ksqlDB Workshop ANZAll Streams Ahead! ksqlDB Workshop ANZ
All Streams Ahead! ksqlDB Workshop ANZconfluent
 
Build a Bridge to Cloud with Apache Kafka® for Data Analytics Cloud Services
Build a Bridge to Cloud with Apache Kafka® for Data Analytics Cloud ServicesBuild a Bridge to Cloud with Apache Kafka® for Data Analytics Cloud Services
Build a Bridge to Cloud with Apache Kafka® for Data Analytics Cloud Servicesconfluent
 
ksqlDB: Building Consciousness on Real Time Events
ksqlDB: Building Consciousness on Real Time EventsksqlDB: Building Consciousness on Real Time Events
ksqlDB: Building Consciousness on Real Time Eventsconfluent
 
BigDataSpain 2016: Introduction to Apache Apex
BigDataSpain 2016: Introduction to Apache ApexBigDataSpain 2016: Introduction to Apache Apex
BigDataSpain 2016: Introduction to Apache ApexThomas Weise
 
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kevin Lynch
 
Evolution of Real-time User Engagement Event Consumption at Pinterest
Evolution of Real-time User Engagement Event Consumption at PinterestEvolution of Real-time User Engagement Event Consumption at Pinterest
Evolution of Real-time User Engagement Event Consumption at PinterestHostedbyConfluent
 
Story of migrating event pipeline from batch to streaming
Story of migrating event pipeline from batch to streamingStory of migrating event pipeline from batch to streaming
Story of migrating event pipeline from batch to streaminglohitvijayarenu
 
Day in the life event-driven workshop
Day in the life  event-driven workshopDay in the life  event-driven workshop
Day in the life event-driven workshopChristina Lin
 
Kafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
Kafka Streams vs. KSQL for Stream Processing on top of Apache KafkaKafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
Kafka Streams vs. KSQL for Stream Processing on top of Apache KafkaKai Wähner
 
Encode Club workshop slides
Encode Club workshop slidesEncode Club workshop slides
Encode Club workshop slidesVanessa Lošić
 
Bridging the Gap: Connecting AWS and Kafka
Bridging the Gap: Connecting AWS and KafkaBridging the Gap: Connecting AWS and Kafka
Bridging the Gap: Connecting AWS and KafkaPengfei (Jason) Li
 
Apache Pulsar Overview
Apache Pulsar OverviewApache Pulsar Overview
Apache Pulsar OverviewStreamlio
 
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark StreamingIntro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark StreamingApache Apex
 
Now You See Me, Now You Compute: Building Event-Driven Architectures with Apa...
Now You See Me, Now You Compute: Building Event-Driven Architectures with Apa...Now You See Me, Now You Compute: Building Event-Driven Architectures with Apa...
Now You See Me, Now You Compute: Building Event-Driven Architectures with Apa...Michael Noll
 
Kubernetes @ Squarespace: Kubernetes in the Datacenter
Kubernetes @ Squarespace: Kubernetes in the DatacenterKubernetes @ Squarespace: Kubernetes in the Datacenter
Kubernetes @ Squarespace: Kubernetes in the DatacenterKevin Lynch
 
Citi Tech Talk: Monitoring and Performance
Citi Tech Talk: Monitoring and PerformanceCiti Tech Talk: Monitoring and Performance
Citi Tech Talk: Monitoring and Performanceconfluent
 
Architectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark StreamingArchitectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark StreamingApache Apex
 

Similar a Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent) (20)

Concepts and Patterns for Streaming Services with Kafka
Concepts and Patterns for Streaming Services with KafkaConcepts and Patterns for Streaming Services with Kafka
Concepts and Patterns for Streaming Services with Kafka
 
APAC ksqlDB Workshop
APAC ksqlDB WorkshopAPAC ksqlDB Workshop
APAC ksqlDB Workshop
 
Introduction to KSQL: Streaming SQL for Apache Kafka®
Introduction to KSQL: Streaming SQL for Apache Kafka®Introduction to KSQL: Streaming SQL for Apache Kafka®
Introduction to KSQL: Streaming SQL for Apache Kafka®
 
All Streams Ahead! ksqlDB Workshop ANZ
All Streams Ahead! ksqlDB Workshop ANZAll Streams Ahead! ksqlDB Workshop ANZ
All Streams Ahead! ksqlDB Workshop ANZ
 
Build a Bridge to Cloud with Apache Kafka® for Data Analytics Cloud Services
Build a Bridge to Cloud with Apache Kafka® for Data Analytics Cloud ServicesBuild a Bridge to Cloud with Apache Kafka® for Data Analytics Cloud Services
Build a Bridge to Cloud with Apache Kafka® for Data Analytics Cloud Services
 
ksqlDB: Building Consciousness on Real Time Events
ksqlDB: Building Consciousness on Real Time EventsksqlDB: Building Consciousness on Real Time Events
ksqlDB: Building Consciousness on Real Time Events
 
BigDataSpain 2016: Introduction to Apache Apex
BigDataSpain 2016: Introduction to Apache ApexBigDataSpain 2016: Introduction to Apache Apex
BigDataSpain 2016: Introduction to Apache Apex
 
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
 
Evolution of Real-time User Engagement Event Consumption at Pinterest
Evolution of Real-time User Engagement Event Consumption at PinterestEvolution of Real-time User Engagement Event Consumption at Pinterest
Evolution of Real-time User Engagement Event Consumption at Pinterest
 
Story of migrating event pipeline from batch to streaming
Story of migrating event pipeline from batch to streamingStory of migrating event pipeline from batch to streaming
Story of migrating event pipeline from batch to streaming
 
Day in the life event-driven workshop
Day in the life  event-driven workshopDay in the life  event-driven workshop
Day in the life event-driven workshop
 
Kafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
Kafka Streams vs. KSQL for Stream Processing on top of Apache KafkaKafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
Kafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
 
Encode Club workshop slides
Encode Club workshop slidesEncode Club workshop slides
Encode Club workshop slides
 
Bridging the Gap: Connecting AWS and Kafka
Bridging the Gap: Connecting AWS and KafkaBridging the Gap: Connecting AWS and Kafka
Bridging the Gap: Connecting AWS and Kafka
 
Apache Pulsar Overview
Apache Pulsar OverviewApache Pulsar Overview
Apache Pulsar Overview
 
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark StreamingIntro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
 
Now You See Me, Now You Compute: Building Event-Driven Architectures with Apa...
Now You See Me, Now You Compute: Building Event-Driven Architectures with Apa...Now You See Me, Now You Compute: Building Event-Driven Architectures with Apa...
Now You See Me, Now You Compute: Building Event-Driven Architectures with Apa...
 
Kubernetes @ Squarespace: Kubernetes in the Datacenter
Kubernetes @ Squarespace: Kubernetes in the DatacenterKubernetes @ Squarespace: Kubernetes in the Datacenter
Kubernetes @ Squarespace: Kubernetes in the Datacenter
 
Citi Tech Talk: Monitoring and Performance
Citi Tech Talk: Monitoring and PerformanceCiti Tech Talk: Monitoring and Performance
Citi Tech Talk: Monitoring and Performance
 
Architectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark StreamingArchitectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark Streaming
 

Más de KafkaZone

Real time data processing and model inferncing platform with Kafka streams (N...
Real time data processing and model inferncing platform with Kafka streams (N...Real time data processing and model inferncing platform with Kafka streams (N...
Real time data processing and model inferncing platform with Kafka streams (N...KafkaZone
 
Abstractions for managed stream processing platform (Arya Ketan - Flipkart)
Abstractions for managed stream processing platform (Arya Ketan - Flipkart)Abstractions for managed stream processing platform (Arya Ketan - Flipkart)
Abstractions for managed stream processing platform (Arya Ketan - Flipkart)KafkaZone
 
Tale of two streaming frameworks (Karthik D - Walmart)
Tale of two streaming frameworks (Karthik D - Walmart)Tale of two streaming frameworks (Karthik D - Walmart)
Tale of two streaming frameworks (Karthik D - Walmart)KafkaZone
 
Stream processing with Apache Flink (Timo Walther - Ververica)
Stream processing with Apache Flink (Timo Walther - Ververica)Stream processing with Apache Flink (Timo Walther - Ververica)
Stream processing with Apache Flink (Timo Walther - Ververica)KafkaZone
 
Stream processing at Hotstar
Stream processing at HotstarStream processing at Hotstar
Stream processing at HotstarKafkaZone
 
Data science at scale with Kafka and Flink (Razorpay)
Data science at scale with Kafka and Flink (Razorpay)Data science at scale with Kafka and Flink (Razorpay)
Data science at scale with Kafka and Flink (Razorpay)KafkaZone
 
Key considerations in productionizing streaming applications
Key considerations in productionizing streaming applicationsKey considerations in productionizing streaming applications
Key considerations in productionizing streaming applicationsKafkaZone
 

Más de KafkaZone (7)

Real time data processing and model inferncing platform with Kafka streams (N...
Real time data processing and model inferncing platform with Kafka streams (N...Real time data processing and model inferncing platform with Kafka streams (N...
Real time data processing and model inferncing platform with Kafka streams (N...
 
Abstractions for managed stream processing platform (Arya Ketan - Flipkart)
Abstractions for managed stream processing platform (Arya Ketan - Flipkart)Abstractions for managed stream processing platform (Arya Ketan - Flipkart)
Abstractions for managed stream processing platform (Arya Ketan - Flipkart)
 
Tale of two streaming frameworks (Karthik D - Walmart)
Tale of two streaming frameworks (Karthik D - Walmart)Tale of two streaming frameworks (Karthik D - Walmart)
Tale of two streaming frameworks (Karthik D - Walmart)
 
Stream processing with Apache Flink (Timo Walther - Ververica)
Stream processing with Apache Flink (Timo Walther - Ververica)Stream processing with Apache Flink (Timo Walther - Ververica)
Stream processing with Apache Flink (Timo Walther - Ververica)
 
Stream processing at Hotstar
Stream processing at HotstarStream processing at Hotstar
Stream processing at Hotstar
 
Data science at scale with Kafka and Flink (Razorpay)
Data science at scale with Kafka and Flink (Razorpay)Data science at scale with Kafka and Flink (Razorpay)
Data science at scale with Kafka and Flink (Razorpay)
 
Key considerations in productionizing streaming applications
Key considerations in productionizing streaming applicationsKey considerations in productionizing streaming applications
Key considerations in productionizing streaming applications
 

Último

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 

Último (20)

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 

Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)

  • 1. 1 Stream Processing fundamentals and Introduction to ksqlDB Vish Srinivasan Systems Engineer
  • 2. 2 Agenda ● Motivation for Event Streaming ~5 ● Kafka 101 ~10 ● Stream Processing ~20 1. Kafka Streams 2. KSQL ● ksqlDB - An introduction ~10 ● Q&A ~10
  • 4. 44 The Central Challenge Connecting applications with data ETL What happened in the world Messaging What is happening in the world
  • 6. 6
  • 7. 7 STATEEVENT > I changed my job from Snaplogic to Confluent in April 2019 I work at Confluent.
  • 8. 8 JOB CHANGE RECOMMENDATION ENGINE SEARCH INDEX EMAIL SERVICE
  • 9. 9 IS MORE SOFTWARE THE USER OF THE SOFTWARE
  • 10. 1010 Customers Expect Rich Digital Experiences ● Real-Time combined with historical data ● Only Event Streaming Platforms can do this When will my driver get here?
  • 11. 1111 Event-Driven App (Location Tracking) Only Real-Time Events Messaging Queues and Event Streaming Platforms can do this Contextual Event-Driven App (ETA) Real-Time combined with stored data Only Event Streaming Platforms can do this Where is my driver? When will my driver get here? Where is my driver? When will my driver get here? 2 min Why Combine Real-time With Historical Context? VS.
  • 12. 12 Contextual, Event-Driven Apps in the Enterprise “We look at events as running our business. Business people within our organization want to be able to react to events—and oftentimes it's a combination of events.” —Chris D’Agostino, VP of Streaming Data 01 Real-Time Fraud Notifications 03 Automated Transaction Analysis 02 Real-Time “Second Look”
  • 13. 13 Take Away #1 Event Streaming Platforms let you build Contextual Event Driven Applications combining real time and historical data.
  • 14. 14 An Event Streaming Platform gives you three key functionalities Publish & Subscribe to Events Store Events Process & Analyze Events
  • 15. 1515 … But first, Kafka Basics
  • 16. 16 Kafka is a Foundation for Event Streams 0 1 2 3 4 5 6 7 8LOG READS WRITES DESTINATION SYSTEM A DESTINATION SYSTEM B
  • 17. 17 BROKER 1 BROKER 2 BROKER 3 BROKER 4 TOPIC 1-PART 1 Storage: Distributed and Replicated TOPIC 2-PART 2 TOPIC 2-PART 1 TOPIC 1-PART 2 TOPIC 1-PART 1 TOPIC 2-PART 2 TOPIC 2-PART 1 TOPIC 1-PART 2 TOPIC 1-PART 1 TOPIC 2-PART 2 TOPIC 2-PART 1 TOPIC 1-PART 2 2 topics, 2 partitions each, 3 replicas each PRODUCER CONSUMER
  • 19. 19 Messages will be produced in a round robin fashion Written to leader of a partitions Producing to Kafka Time 1 2 3 4 5
  • 20. 20 A B C D hash(key) % numPartitions = N Producing to Kafka with a Key Time
  • 23. 23 C C C C Consuming with Consumer Groups Logical Name Load balanced across all consumers in the group
  • 25. 25 Delivery Guarantees ● Producer Guarantees ○ Acks = 0 ○ Acks = 1 ○ Acks = all ● Consumer Guarantees ○ At least once ○ At most once ○ Exactly once
  • 26. 26 Take Away #2 Kafka lets you publish/subscribe to events as well as store events.
  • 27. 27 An Event Streaming Platform gives you three key functionalities Publish & Subscribe to Events Store Events Process & Analyze Events
  • 28. Stream Processing by Analogy Kafka Cluster Connect API Stream Processing Connect API $ cat < in.txt | grep “ksql” | tr a-z A-Z > out.txt
  • 29. 29 Event Transformation with Stream Processing streams The streaming SQL engine for Apache Kafka® CREATE STREAM fraudulent_payments AS SELECT * FROM payments WHERE fraudProbability > 0.8; Apache Kafka® library to write real-time applications and microservices in Java and Scala Confluent KSQL You write only SQL. No Java, Python, or other boilerplate to wrap around it!
  • 30. 30 subscribe(), poll(), send(), flush(), beginTransaction(), … KStream, KTable, filter(), map(), flatMap(), join(), aggregate(), transform(), … CREATE STREAM, CREATE TABLE, SELECT, JOIN, GROUP BY, SUM, … Shoulders of Streaming Giants KSQL UDFs
  • 31. 31 Processing Layer (KSQL, KStreams) 31 00100 11101 11000 00011 00100 00110Topic alice Paris bob Sydney alice RomeStream plus schema (serdes) alice Rome bob Sydney Table plus aggregation Storage Layer (Brokers) Topics vs. Streams and Tables
  • 32. 32 “The ledger of Vish’s sales.” “Vish’s sales totals.” “California sales totals.” Streams record history Tables represent state
  • 33. 33 1. e4 e5 2. Nf3 Nc6 3. Bc4 Bc5 4. d3 Nf6 5. Nbd2 “The sequence of moves.” “The state of the board.” Streams record history Tables represent state
  • 35. 35 ● Processing is partitioned ● Unit of parallelism is stream-task Streams topic with schema Tables underlying topic (usually) compacted ● Materialized view, cannot be mutated ● Implemented on top of a state-store (mutable)
  • 36. 36 Take Away #3 2 tools to process data: Kafka Streams and KSQL 2 concepts in both: Streams and Tables.
  • 38. 38 KSQL for Real-Time Monitoring ● Log data monitoring ● Tracking and alerting ● Syslog data ● Sensor / IoT data ● Application metrics CREATE STREAM syslog_invalid_users AS SELECT host, message FROM syslog WHERE message LIKE '%Invalid user%'; http://cnfl.io/syslogs-filtering / http://cnfl.io/syslog-alerting
  • 39. 39 KSQL for Anomaly Detection ● Identify patterns or anomalies in real- time data, surfaced in milliseconds CREATE TABLE possible_fraud AS SELECT card_number, COUNT(*) FROM authorization_attempts WINDOW TUMBLING (SIZE 5 SECONDS) GROUP BY card_number HAVING COUNT(*) > 3;
  • 40. 40 KSQL for Streaming ETL ● Joining, filtering, and aggregating streams of event data CREATE STREAM vip_actions AS SELECT user_id, page, action FROM clickstream c LEFT JOIN users u ON c.user_id = u.user_id WHERE u.level = 'Platinum';
  • 41. 41 KSQL is a stream processing technology As such it is not yet a great fit for: Ad-hoc queries ● No indexes yet in KSQL ● Kafka often configured to retain data for only a limited span of time BI reports (Tableau etc.) ● No indexes yet in KSQL ● No JDBC ● Most BI tools don’t understand continuous, streaming results
  • 42. 42 PUSH PULL APP Jay’s credit score is 670 Jay’s credit score is 710 Jay’s credit score is 695 What is Jay’s credit score now? 695 APP
  • 43. 43 PUSH PULL SELECT user, credit_score FROM credit_history WHERE ROWKEY = ‘jay’ EMIT CHANGES; SELECT user, credit_score FROM credit_history WHERE ROWKEY = ‘jay’;
  • 44. 44 ksqlDB adds two key features to augment KSQL PULL QUERIES ● Point-in-time lookup of information ● Comparable to a SELECT statement in a relational database EMBEDDED CONNECTORS ● Move event data to and from external data systems ● Available for all supported connectors 21 APPPULL $25 How much does Jay’s ride cost? CONNECTOR CONNECTOR ksqlDB CONNECTOR
  • 45. 46 So, What use cases is ksqlDB a good fit for? It does not replace traditional databases: ● What is a database? ● Materialize events into an opinionated structure (table) so you get power of SQL ● When we query, We are querying the state produced by the processor executing the commit log - we just recreated materialized views.
  • 46. 47 So, What use cases is ksqlDB a good fit for? ksqlDB is primarily useful for three broad categories of applications: ● Building and serving materialized views that power apps ● Creating real-time streaming apps that react to event streams and trigger side effects ● Creating real-time streaming pipelines that continuously transform event streams
  • 47. 48 Summary Takeaways ● Event Streaming Platforms let you build Contextual Event Driven Applications combining real time and historical data. ● Kafka lets you publish/subscribe to events and also store them ● Process data with Kafka Streams or KSQL using Streams and Tables ● ksqlDB makes it easy to build and serve materialized views that power apps
  • 48. 49 Thank You! Reach out if you have any questions: ● Vish Srinivasan - vish@confluent.io Community Slack: https://launchpass.com/confluentcommunity Learn Kafka - https://kafka-tutorials.confluent.io/ ksqlDB - https://ksqldb.io/