SlideShare una empresa de Scribd logo
1 de 58
Descargar para leer sin conexión
11
Introduction to Apache Kafka and Confluent
... and why they matter!
Kafka Meetup - Johannesburg
Tuesday, March 20th 2018
18:00 – 20:00
SSA - Maxwell Office Park, Magwa Cres, Waterfall City, Midrand, 2090 · Midrand
https://www.meetup.com/Johannesburg-Kafka-Meetup/events/248465767/
22
How Organizations Handle Data Flows: a Giant Mess
Data
Warehouse
Hadoop
NoSQL
Oracle
SFDC
Logging
Bloomberg
…any sink/source
Web Custom Apps Microservices Monitoring Analytics
…and more
OLTP
ActiveMQ
App App
Caches
OLTP OLTPAppAppApp
33
Apache Kafka™: A Distributed Streaming Platform
Apache Kafka
Offline Batch (+1 Hour)Near-Real Time (>100s ms)Real Time (0-100 ms)
Data
Warehouse
Hadoop
NoSQL
Oracle
SFDC
Twitter
Bloomberg
…any sink/source …any sink/source
…and more
Web Custom Apps Microservices Monitoring Analytics
44
More than 1
petabyte of
data in Kafka
Over 1.2
trillion
messages per
day
Thousands of
data streams
Source of all
data
warehouse &
Hadoop data
Over 300
billion user-
related events
per day
55
Over 35% of Fortune 500’s are using Apache Kafka™
6 of top 10
Travel
7 of top 10
Global banks
8 of top 10
Insurance
9 of top 10
Telecom
66
Industry Trends… and why Apache Kafka matters!
1. From ‘big data’ (batch) to ‘fast data’ (stream processing)
2. Internet of Things (IoT) and sensor data
3. Microservices and asynchronous communication (coordination
messages and data streams) between loosely coupled and fine-
grained services
77
Apache Kafka APIs – A UNIX Analogy
$ cat < in.txt | grep "apache" | tr a-z A-Z > out.txt
Connect APIs
Streams APIs
Producer / Consumer APIs
88
Apache Kafka API – ETL Analogy
Source SinkConnectAPI
ConnectAPI
Streams API
Extract Transform Load
99
Apache Kafka 101
Internals and Core Concepts
1010
Apache Kafka Concepts: Persistent Log
Data Producer
0 1 2 3 4 5 6 7 8 9 10 11 12
writes
Data Consumer
(offset = 7)
Data Consumer
(offset = 11)
reads reads
1111
Apache Kafka Concepts: Anatomy of a Topic
0 1 2 3 4 5 6 7 8 9 10 11 12partition 0
0 1 2 3 4 5 6 7
40 1 2 3 5
partition 1
partition 2
writes
1212
Apache Kafka Concepts: Log Storage
offset index
timestamp index
offsets: 0 - 10000
offset index
timestamp index
offsets: 10001 - 20000
offset index
timestamp index
offsets: 20001 - 30000
1313
Apache Kafka Concepts: Message Format
8 bytes 4 bytes 4 bytes 8 bytes 4 bytes varies 4 bytes varies
offset length CRC timesta
mp
key
length
value
length
key
content
value
content
magic
byte
1 byte
attribute
1 byte
1414
Apache Kafka Concepts: Producers and Consumers
Producer
Producer
Producer
Consumer
Consumer
Broker
Broker
Broker
1515
Apache Kafka Concepts: Topics and Partitions
Producer
Producer
Producer
Consumer
Consumer
Broker
Broker
Broker
T0: P0
T0: P2
T0: P1
T0: P3
T1: P0
T1: P1
1616
Apache Kafka Concepts: Fault Tolerance and Replication
Producer
Producer
Producer
Consumer
Consumer
Broker
Broker
Broker
T0: P0
T0: P0 (Replica 1)
T1: P0
T1: P0 (Replica 1)
1717
Apache Kafka Concepts: Consumer Groups
Producer
Producer
Producer
Consumer
Broker
Broker
Broker
T0: P0
T0: P2
T0: P1
T0: P3
T1: P0
T1: P1
Consumer
Consumer
Consumer
1818
The Connect API of Apache Kafka®
 Centralized management and configuration
 Support for hundreds of technologies
including RDBMS, Elasticsearch, HDFS, S3
 Supports CDC ingest of events from RDBMS
 Preserves data schema
 Fault tolerant and automatically load balanced
 Extensible API
 Single Message Transforms
 Part of Apache Kafka, included in
Confluent Open Source
Reliable and scalable integration of Kafka
with other systems – no coding required.
{
"connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
"connection.url": "jdbc:mysql://localhost:3306/demo?user=rmoff&password=foo",
"table.whitelist": "sales,orders,customers"
}
https://docs.confluent.io/current/connect/
1919
Build Applications, not Clusters
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-streams</artifactId>
<version>1.0.0</version>
</dependency>
2020
Spot the Difference(s)!
2121
How do I run in production?
2222
How do I run in production?
Uncool Cool
2323
How do I run in production?
http://docs.confluent.io/current/streams/introduction.html
2424
Elastic and Scalable
http://docs.confluent.io/current/streams/developer-guide.html#elastic-scaling-of-your-application
2525
Elastic and Scalable
http://docs.confluent.io/current/streams/developer-guide.html#elastic-scaling-of-your-application
2626
Elastic and Scalable
http://docs.confluent.io/current/streams/developer-guide.html#elastic-scaling-of-your-application
2727
Typical High Level Architecture
Real-time
Data
Ingestion
2828
Typical High Level Architecture
Stream
Processing
Real-time
Data
Ingestion
2929
Typical High Level Architecture
Stream
Processing
Storage
Real-time
Data
Ingestion
3030
Typical High Level Architecture
Data Publishing /
Visualization
Stream
Processing
Storage
Real-time
Data
Ingestion
3131
How many clusters do you count?
NoSQL (Cassandra,
HBase, Couchbase,
MongoDB, …) or
Elasticsearch, Solr,
…
Storm, Flink, Spark
Streaming, Ignite,
Akka Streams, Apex,
…
HDFS, NFS, Ceph,
GlusterFS, Lustre,
...
Apache Kafka
3232
Simplicity is the Ultimate Sophistication
Node.js
Apache Kafka
Distributed Streaming Platform
Publish & Subscribe
to streams of data like a
messaging system
Store
streams of data safely in a
distributed replicated cluster
Process
streams of data efficiently
and in real-time
3333
Duality of Streams and Tables
http://docs.confluent.io/current/streams/concepts.html#duality-of-streams-and-tables
3434
Duality of Streams and Tables
http://docs.confluent.io/current/streams/concepts.html#duality-of-streams-and-tables
3535
Interactive Queries
http://docs.confluent.io/current/streams/developer-guide.html#streams-developer-guide-interactive-queries
3636
Interactive Queries
http://docs.confluent.io/current/streams/developer-guide.html#streams-developer-guide-interactive-queries
3737
Kafka Streams DSL
http://docs.confluent.io/current/streams/developer-guide.html#kafka-streams-dsl
3838
WordCount (and Java 8+)
WordCountLambdaExample.java
final Properties streamsConfiguration = new Properties();
streamsConfiguration.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordcount-lambda-example");
streamsConfiguration.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
...
final Serde<String> stringSerde = Serdes.String();
final Serde<Long> longSerde = Serdes.Long();
final KStreamBuilder builder = new KStreamBuilder();
final KStream<String, String> textLines = builder.stream(stringSerde, stringSerde,
"TextLinesTopic");
final Pattern pattern = Pattern.compile("W+", Pattern.UNICODE_CHARACTER_CLASS);
final KTable<String, Long> wordCounts = textLines
.flatMapValues(value -> Arrays.asList(pattern.split(value.toLowerCase())))
.groupBy((key, word) -> word)
.count("Counts");
wordCounts.to(stringSerde, longSerde, "WordsWithCountsTopic");
final KafkaStreams streams = new KafkaStreams(builder, streamsConfiguration);
streams.cleanUp();
streams.start();
Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
3939
Easy to Develop with, Easy to Test
WordCountLambdaIntegrationTest.java
EmbeddedSingleNodeKafkaCluster CLUSTER = new EmbeddedSingleNodeKafkaCluster();
...
CLUSTER.createTopic(inputTopic);
...
Properties producerConfig = new Properties();
producerConfig.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
CLUSTER.bootstrapServers());
4040
The Streams API of Apache Kafka®
 No separate processing cluster required
 Develop on Mac, Linux, Windows
 Deploy to containers, VMs, bare metal, cloud
 Powered by Kafka: elastic, scalable,
distributed, battle-tested
 Perfect for small, medium, large use cases
 Fully integrated with Kafka security
 Exactly-once processing semantics
 Part of Apache Kafka, included in
Confluent Open Source
Write standard Java applications and microservices
to process your data in real-time
KStream<User, PageViewEvent> pageViews = builder.stream("pageviews-topic");
KTable<Windowed<User>, Long> viewsPerUserSession = pageViews
.groupByKey()
.count(SessionWindows.with(TimeUnit.MINUTES.toMillis(5)), "session-views");
https://docs.confluent.io/current/streams/
4141
KSQL: a Streaming SQL Engine for Apache Kafka® from Confluent
 No coding required, all you need is SQL
 No separate processing cluster required
 Powered by Kafka: elastic, scalable,
distributed, battle-tested
CREATE TABLE possible_fraud AS
SELECT card_number, count(*)
FROM authorization_attempts
WINDOW TUMBLING (SIZE 5 SECONDS)
GROUP BY card_number
HAVING count(*) > 3;
CREATE STREAM vip_actions AS
SELECT userid, page, action
FROM clickstream c
LEFT JOIN users u
ON c.userid = u.userid
WHERE u.level = 'Platinum';
KSQL is the simplest way to process streams of data in real-time
 Perfect for streaming ETL, anomaly detection,
event monitoring, and more
 Part of Confluent Open Source
https://github.com/confluentinc/ksql
Do you think that’s a
table you are querying ?
4343
KSQL in less than 5 minutes
https://www.youtube.com/watch?v=A45uRzJiv7I
4444
Confluent Enterprise: Logical Architecture
Kafka Cluster
Mainframe
Kafka Connect Servers
Kafka ConnectRDBMS
Hadoop
Cassandra
Elasticsearch
Kafka Connect Servers
Kafka Connect
Files
Producer
Application
Consumer
ApplicationZookeeper
Kafka Broker
REST Proxy Servers
REST Proxy
REST Client
Control Center Servers
Control Center
Schema Registry Servers
Schema Registry
Kafka Producer APIs Kafka Consumer APIs
Stream Processing Application 1
Stream Client
Stream Processing Application 2
Stream Client
REST Proxy Servers
REST Proxy
REST Client
4545
Confluent Enterprise: Physical Architecture
Rack 1
Kafka Broker #1
ToR Switch
ToR Switch
Schema Registry #1
Kafka Connect #1
Zookeeper #1
REST Proxy #1
Kafka Broker #4
Zookeeper #4
Rack 2
Kafka Broker #2
ToR Switch
ToR Switch
Schema Registry #2
Kafka Connect #2
Zookeeper #2
Kafka Broker #5
Zookeeper #5
Rack 3
Kafka Broker #3
ToR Switch
ToR Switch
Kafka Connect #3
Zookeeper #3
Core Switch Core Switch
REST Proxy #2
Load Balancer Load Balancer
Control Center #1 Control Center #2
4646
Confluent Completes Kafka
Feature Benefit Apache Kafka Confluent Open Source Confluent Enterprise
Apache Kafka
High throughput, low latency, high availability, secure distributed streaming
platform
Kafka Connect API Advanced API for connecting external sources/destinations into Kafka
Kafka Streams API
Simple library that enables streaming application development within the
Kafka framework
Additional Clients Supports non-Java clients; C, C++, Python, .NET and several others
REST Proxy
Provides universal access to Kafka from any network connected device via
HTTP
Schema Registry
Central registry for the format of Kafka data – guarantees all data is always
consumable
Pre-Built Connectors
HDFS, JDBC, Elasticsearch, Amazon S3 and other connectors fully certified
and supported by Confluent
JMS Client
Support for legacy Java Message Service (JMS) applications consuming
and producing directly from Kafka
Confluent Control
Center
Enables easy connector management, monitoring and alerting for a Kafka
cluster
Auto Data Balancer Rebalancing data across cluster to remove bottlenecks
Replicator Multi-datacenter replication simplifies and automates MDC Kafka clusters
Support
Enterprise class support to keep your Kafka environment running at top
performance Community Community 24x7x365
4747
Big Data and Fast Data Ecosystems
Synchronous Req/Response
0 – 100s ms
Near Real Time
> 100s ms
Offline Batch
> 1 hour
Apache Kafka
Stream Data Platform
Search
RDBMS
Apps Monitoring
Real-time
Analytics
NoSQL
Stream
Processing
Apache Hadoop
Data Lake
Impala
DWH
Hive
Spark Map-Reduce
Confluent HDFS Connector
(exactly once semantics)
https://www.confluent.io/blog/the-value-of-apache-kafka-in-big-data-ecosystem/
4848
Building a Microservices Ecosystem with Kafka Streams and KSQL
https://www.confluent.io/blog/building-a-microservices-ecosystem-with-kafka-streams-and-ksql/
https://github.com/confluentinc/kafka-streams-examples/tree/3.3.0-post/src/main/java/io/confluent/examples/streams/microservices
4949
Microservices: References
Blog posts series:
Part 1: The Data Dichotomy: Rethinking the Way We Treat Data and Services
https://www.confluent.io/blog/data-dichotomy-rethinking-the-way-we-treat-data-and-services/
Part 2: Build Services on a Backbone of Events
https://www.confluent.io/blog/build-services-backbone-events/
Part 3: Using Apache Kafka as a Scalable, Event-Driven Backbone for Service Architectures
https://www.confluent.io/blog/apache-kafka-for-service-architectures/
Part 4: Chain Services with Exactly Once Guarantees
https://www.confluent.io/blog/chain-services-exactly-guarantees/
Part 5: Messaging as the Single Source of Truth
https://www.confluent.io/blog/messaging-single-source-truth/
Part 6: Leveraging the Power of a Database Unbundled
https://www.confluent.io/blog/leveraging-power-database-unbundled/
Part 7: Building a Microservices Ecosystem with Kafka Streams and KSQL
https://www.confluent.io/blog/building-a-microservices-ecosystem-with-kafka-streams-and-ksql/
Whitepaper:
Microservices in the Apache Kafka™ Ecosystem
https://www.confluent.io/resources/microservices-in-the-apache-kafka-ecosystem/
5050
Apache Kafka Security
Security
• Processes customer data
• Regulatory requirements
• Legal compliance
• Internal security policies
• Need is not limited to
industries such as finance,
healthcare, or governmental
services
Authentication
• Scenario example: “Only certain applications may talk to the production
Kafka cluster”
• Client authentication via SASL – e.g. Kerberos, Active Directory
Authorization
• Scenario example: “Only certain applications may read data from
sensitive Kafka topics”
• Restrict who can create, write to, read from topics, and more
Encryption
• Scenario example: “Data-in-transit between apps and Kafka clusters
must be encrypted”
• SSL supported
• Encrypts data exchanged between Kafka brokers, between Kafka brokers
and Kafka clients/apps
Help meeting security requirements by supporting:
5151
Enterprise Ready Multi-Datacenter Replication for Kafka
Data Center in USA
Kafka Cluster (USA)
Kafka Broker 1
Kafka Broker 2
Kafka Broker 3
ZooKeeper 1
ZooKeeper 2
ZooKeeper 3
Control Center
Kafka Connect
Cluster
Replicator 1
Replicator 2
Data Center in EMEA
Kafka Cluster (EU)
Kafka Broker 1
Kafka Broker 2
Kafka Broker 3
ZooKeeper 1
ZooKeeper 2
ZooKeeper 3
Control Center
Kafka Connect
Cluster
Replicator 1
Replicator 2
Available only with Confluent Enterprise
Apache Kafka and Confluent Open Source
5252
Cloud Synchronization and Migrations with Confluent Enterprise: Before
DC1
DB2
DB1
DWH
App2
App3
App4
KV2KV3
DB3
App2-v2
App5
App7
App1-v2
AWS
App8
DWH
App1
Challenges
• Each team/department
must execute their own cloud
migration
• May be moving the same data
multiple times
• Each box represented here
require development, testing,
deployment, monitoring and
maintenance
KV
5353
DC1
Cloud Synchronization and Migrations with Confluent Enterprise: After
DB2
DB1
KV
DWH
App2
App4
KV2KV3
App2-v2
App5 App7
App1-v2
AWS
App8
DWH
App1
Kafka
Kafka
App3
Benefits
• Continuous low-latency
synchronization
• Centralized manageability and
monitoring
– Track at event level data
produced in all data centers
• Security and governance
– Track and control where data
comes from and who is
accessing it
• Cost Savings
– Move Data Once
DB3
5454
About Confluent and Apache Kafka™
70% of active Kafka
Committers
Founded
September 2014
Technology developed
while at LinkedIn
Founded by the creators of
Apache Kafka
5555
Apache Kafka: PMC members and committers
https://kafka.apache.org/committers
PMC
PMC PMC PMCPMC PMC PMC PMC
PMC PMC PMC
5656
Download Confluent Platform: the easiest way to get you started
https://www.confluent.io/download/
5757
Books: get them all three in PDF format from Confluent website!
https://www.confluent.io/apache-kafka-stream-processing-book-bundle
5858
Discount code: kacom17
Presented by
https://kafka-summit.org/
Presented by

Más contenido relacionado

La actualidad más candente

Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안SANG WON PARK
 
A visual introduction to Apache Kafka
A visual introduction to Apache KafkaA visual introduction to Apache Kafka
A visual introduction to Apache KafkaPaul Brebner
 
Kafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKai Wähner
 
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & PartitioningApache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & PartitioningGuido Schmutz
 
Messaging queue - Kafka
Messaging queue - KafkaMessaging queue - Kafka
Messaging queue - KafkaMayank Bansal
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache KafkaJeff Holoman
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using KafkaKnoldus Inc.
 
Apache Kafka Fundamentals for Architects, Admins and Developers
Apache Kafka Fundamentals for Architects, Admins and DevelopersApache Kafka Fundamentals for Architects, Admins and Developers
Apache Kafka Fundamentals for Architects, Admins and Developersconfluent
 
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse ArchitectureServerless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse ArchitectureKai Wähner
 
From Zero to Hero with Kafka Connect
From Zero to Hero with Kafka ConnectFrom Zero to Hero with Kafka Connect
From Zero to Hero with Kafka Connectconfluent
 
Consumer offset management in Kafka
Consumer offset management in KafkaConsumer offset management in Kafka
Consumer offset management in KafkaJoel Koshy
 
Exactly-once Stream Processing with Kafka Streams
Exactly-once Stream Processing with Kafka StreamsExactly-once Stream Processing with Kafka Streams
Exactly-once Stream Processing with Kafka StreamsGuozhang Wang
 
The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022Kai Wähner
 
An Introduction to Apache Kafka
An Introduction to Apache KafkaAn Introduction to Apache Kafka
An Introduction to Apache KafkaAmir Sedighi
 

La actualidad más candente (20)

kafka
kafkakafka
kafka
 
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
A visual introduction to Apache Kafka
A visual introduction to Apache KafkaA visual introduction to Apache Kafka
A visual introduction to Apache Kafka
 
Kafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid Cloud
 
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & PartitioningApache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
 
Messaging queue - Kafka
Messaging queue - KafkaMessaging queue - Kafka
Messaging queue - Kafka
 
Kafka 101
Kafka 101Kafka 101
Kafka 101
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using Kafka
 
Apache Kafka Fundamentals for Architects, Admins and Developers
Apache Kafka Fundamentals for Architects, Admins and DevelopersApache Kafka Fundamentals for Architects, Admins and Developers
Apache Kafka Fundamentals for Architects, Admins and Developers
 
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse ArchitectureServerless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
 
From Zero to Hero with Kafka Connect
From Zero to Hero with Kafka ConnectFrom Zero to Hero with Kafka Connect
From Zero to Hero with Kafka Connect
 
Consumer offset management in Kafka
Consumer offset management in KafkaConsumer offset management in Kafka
Consumer offset management in Kafka
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Exactly-once Stream Processing with Kafka Streams
Exactly-once Stream Processing with Kafka StreamsExactly-once Stream Processing with Kafka Streams
Exactly-once Stream Processing with Kafka Streams
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022
 
An Introduction to Apache Kafka
An Introduction to Apache KafkaAn Introduction to Apache Kafka
An Introduction to Apache Kafka
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 

Similar a Introduction to apache kafka, confluent and why they matter

Introduction to Apache Kafka and Confluent... and why they matter
Introduction to Apache Kafka and Confluent... and why they matterIntroduction to Apache Kafka and Confluent... and why they matter
Introduction to Apache Kafka and Confluent... and why they matterconfluent
 
Introduction to Apache Kafka and Confluent... and why they matter!
Introduction to Apache Kafka and Confluent... and why they matter!Introduction to Apache Kafka and Confluent... and why they matter!
Introduction to Apache Kafka and Confluent... and why they matter!Paolo Castagna
 
Streaming ETL with Apache Kafka and KSQL
Streaming ETL with Apache Kafka and KSQLStreaming ETL with Apache Kafka and KSQL
Streaming ETL with Apache Kafka and KSQLNick Dearden
 
Concepts and Patterns for Streaming Services with Kafka
Concepts and Patterns for Streaming Services with KafkaConcepts and Patterns for Streaming Services with Kafka
Concepts and Patterns for Streaming Services with KafkaQAware GmbH
 
Introduction to Apache Kafka and why it matters - Madrid
Introduction to Apache Kafka and why it matters - MadridIntroduction to Apache Kafka and why it matters - Madrid
Introduction to Apache Kafka and why it matters - MadridPaolo Castagna
 
Webinar: Unlock the Power of Streaming Data with Kinetica and Confluent
Webinar: Unlock the Power of Streaming Data with Kinetica and ConfluentWebinar: Unlock the Power of Streaming Data with Kinetica and Confluent
Webinar: Unlock the Power of Streaming Data with Kinetica and ConfluentKinetica
 
JHipster conf 2019 - Kafka Ecosystem
JHipster conf 2019 - Kafka EcosystemJHipster conf 2019 - Kafka Ecosystem
JHipster conf 2019 - Kafka EcosystemFlorent Ramiere
 
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...DataStax Academy
 
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!confluent
 
NoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch Analysis
NoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch AnalysisNoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch Analysis
NoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch AnalysisHelena Edelson
 
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...Helena Edelson
 
Ingestion in data pipelines with Managed Kafka Clusters in Azure HDInsight
Ingestion in data pipelines with Managed Kafka Clusters in Azure HDInsightIngestion in data pipelines with Managed Kafka Clusters in Azure HDInsight
Ingestion in data pipelines with Managed Kafka Clusters in Azure HDInsightMicrosoft Tech Community
 
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)Kai Wähner
 
Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...
Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...
Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...confluent
 
Data Streaming with Apache Kafka & MongoDB - EMEA
Data Streaming with Apache Kafka & MongoDB - EMEAData Streaming with Apache Kafka & MongoDB - EMEA
Data Streaming with Apache Kafka & MongoDB - EMEAAndrew Morgan
 
Webinar: Data Streaming with Apache Kafka & MongoDB
Webinar: Data Streaming with Apache Kafka & MongoDBWebinar: Data Streaming with Apache Kafka & MongoDB
Webinar: Data Streaming with Apache Kafka & MongoDBMongoDB
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Guido Schmutz
 
Data analytics at scale implementing stateful stream processing - publish
Data analytics at scale implementing stateful stream processing - publishData analytics at scale implementing stateful stream processing - publish
Data analytics at scale implementing stateful stream processing - publishCodeValue
 
Un'introduzione a Kafka Streams e KSQL... and why they matter!
Un'introduzione a Kafka Streams e KSQL... and why they matter!Un'introduzione a Kafka Streams e KSQL... and why they matter!
Un'introduzione a Kafka Streams e KSQL... and why they matter!Paolo Castagna
 
Confluent Kafka and KSQL: Streaming Data Pipelines Made Easy
Confluent Kafka and KSQL: Streaming Data Pipelines Made EasyConfluent Kafka and KSQL: Streaming Data Pipelines Made Easy
Confluent Kafka and KSQL: Streaming Data Pipelines Made EasyKairo Tavares
 

Similar a Introduction to apache kafka, confluent and why they matter (20)

Introduction to Apache Kafka and Confluent... and why they matter
Introduction to Apache Kafka and Confluent... and why they matterIntroduction to Apache Kafka and Confluent... and why they matter
Introduction to Apache Kafka and Confluent... and why they matter
 
Introduction to Apache Kafka and Confluent... and why they matter!
Introduction to Apache Kafka and Confluent... and why they matter!Introduction to Apache Kafka and Confluent... and why they matter!
Introduction to Apache Kafka and Confluent... and why they matter!
 
Streaming ETL with Apache Kafka and KSQL
Streaming ETL with Apache Kafka and KSQLStreaming ETL with Apache Kafka and KSQL
Streaming ETL with Apache Kafka and KSQL
 
Concepts and Patterns for Streaming Services with Kafka
Concepts and Patterns for Streaming Services with KafkaConcepts and Patterns for Streaming Services with Kafka
Concepts and Patterns for Streaming Services with Kafka
 
Introduction to Apache Kafka and why it matters - Madrid
Introduction to Apache Kafka and why it matters - MadridIntroduction to Apache Kafka and why it matters - Madrid
Introduction to Apache Kafka and why it matters - Madrid
 
Webinar: Unlock the Power of Streaming Data with Kinetica and Confluent
Webinar: Unlock the Power of Streaming Data with Kinetica and ConfluentWebinar: Unlock the Power of Streaming Data with Kinetica and Confluent
Webinar: Unlock the Power of Streaming Data with Kinetica and Confluent
 
JHipster conf 2019 - Kafka Ecosystem
JHipster conf 2019 - Kafka EcosystemJHipster conf 2019 - Kafka Ecosystem
JHipster conf 2019 - Kafka Ecosystem
 
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
 
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!
 
NoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch Analysis
NoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch AnalysisNoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch Analysis
NoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch Analysis
 
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
 
Ingestion in data pipelines with Managed Kafka Clusters in Azure HDInsight
Ingestion in data pipelines with Managed Kafka Clusters in Azure HDInsightIngestion in data pipelines with Managed Kafka Clusters in Azure HDInsight
Ingestion in data pipelines with Managed Kafka Clusters in Azure HDInsight
 
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)
 
Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...
Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...
Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...
 
Data Streaming with Apache Kafka & MongoDB - EMEA
Data Streaming with Apache Kafka & MongoDB - EMEAData Streaming with Apache Kafka & MongoDB - EMEA
Data Streaming with Apache Kafka & MongoDB - EMEA
 
Webinar: Data Streaming with Apache Kafka & MongoDB
Webinar: Data Streaming with Apache Kafka & MongoDBWebinar: Data Streaming with Apache Kafka & MongoDB
Webinar: Data Streaming with Apache Kafka & MongoDB
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !
 
Data analytics at scale implementing stateful stream processing - publish
Data analytics at scale implementing stateful stream processing - publishData analytics at scale implementing stateful stream processing - publish
Data analytics at scale implementing stateful stream processing - publish
 
Un'introduzione a Kafka Streams e KSQL... and why they matter!
Un'introduzione a Kafka Streams e KSQL... and why they matter!Un'introduzione a Kafka Streams e KSQL... and why they matter!
Un'introduzione a Kafka Streams e KSQL... and why they matter!
 
Confluent Kafka and KSQL: Streaming Data Pipelines Made Easy
Confluent Kafka and KSQL: Streaming Data Pipelines Made EasyConfluent Kafka and KSQL: Streaming Data Pipelines Made Easy
Confluent Kafka and KSQL: Streaming Data Pipelines Made Easy
 

Más de Paolo Castagna

Message Driven and Event Sourcing
Message Driven and Event SourcingMessage Driven and Event Sourcing
Message Driven and Event SourcingPaolo Castagna
 
Confluent and Elastic: a Lovely Couple - Elastic Stack in a Day 2018
Confluent and Elastic: a Lovely Couple - Elastic Stack in a Day 2018Confluent and Elastic: a Lovely Couple - Elastic Stack in a Day 2018
Confluent and Elastic: a Lovely Couple - Elastic Stack in a Day 2018Paolo Castagna
 
Kafka streams - From pub/sub to a complete stream processing platform
Kafka streams - From pub/sub to a complete stream processing platformKafka streams - From pub/sub to a complete stream processing platform
Kafka streams - From pub/sub to a complete stream processing platformPaolo Castagna
 
Apache Kafka - A Distributed Streaming Platform
Apache Kafka - A Distributed Streaming PlatformApache Kafka - A Distributed Streaming Platform
Apache Kafka - A Distributed Streaming PlatformPaolo Castagna
 

Más de Paolo Castagna (6)

Message Driven and Event Sourcing
Message Driven and Event SourcingMessage Driven and Event Sourcing
Message Driven and Event Sourcing
 
Confluent and Elastic: a Lovely Couple - Elastic Stack in a Day 2018
Confluent and Elastic: a Lovely Couple - Elastic Stack in a Day 2018Confluent and Elastic: a Lovely Couple - Elastic Stack in a Day 2018
Confluent and Elastic: a Lovely Couple - Elastic Stack in a Day 2018
 
Kafka streams - From pub/sub to a complete stream processing platform
Kafka streams - From pub/sub to a complete stream processing platformKafka streams - From pub/sub to a complete stream processing platform
Kafka streams - From pub/sub to a complete stream processing platform
 
IoT Data Platforms
IoT Data PlatformsIoT Data Platforms
IoT Data Platforms
 
Confluent and Elastic
Confluent and ElasticConfluent and Elastic
Confluent and Elastic
 
Apache Kafka - A Distributed Streaming Platform
Apache Kafka - A Distributed Streaming PlatformApache Kafka - A Distributed Streaming Platform
Apache Kafka - A Distributed Streaming Platform
 

Último

DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendArshad QA
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️anilsa9823
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 

Último (20)

DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and Backend
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 

Introduction to apache kafka, confluent and why they matter

  • 1. 11 Introduction to Apache Kafka and Confluent ... and why they matter! Kafka Meetup - Johannesburg Tuesday, March 20th 2018 18:00 – 20:00 SSA - Maxwell Office Park, Magwa Cres, Waterfall City, Midrand, 2090 · Midrand https://www.meetup.com/Johannesburg-Kafka-Meetup/events/248465767/
  • 2. 22 How Organizations Handle Data Flows: a Giant Mess Data Warehouse Hadoop NoSQL Oracle SFDC Logging Bloomberg …any sink/source Web Custom Apps Microservices Monitoring Analytics …and more OLTP ActiveMQ App App Caches OLTP OLTPAppAppApp
  • 3. 33 Apache Kafka™: A Distributed Streaming Platform Apache Kafka Offline Batch (+1 Hour)Near-Real Time (>100s ms)Real Time (0-100 ms) Data Warehouse Hadoop NoSQL Oracle SFDC Twitter Bloomberg …any sink/source …any sink/source …and more Web Custom Apps Microservices Monitoring Analytics
  • 4. 44 More than 1 petabyte of data in Kafka Over 1.2 trillion messages per day Thousands of data streams Source of all data warehouse & Hadoop data Over 300 billion user- related events per day
  • 5. 55 Over 35% of Fortune 500’s are using Apache Kafka™ 6 of top 10 Travel 7 of top 10 Global banks 8 of top 10 Insurance 9 of top 10 Telecom
  • 6. 66 Industry Trends… and why Apache Kafka matters! 1. From ‘big data’ (batch) to ‘fast data’ (stream processing) 2. Internet of Things (IoT) and sensor data 3. Microservices and asynchronous communication (coordination messages and data streams) between loosely coupled and fine- grained services
  • 7. 77 Apache Kafka APIs – A UNIX Analogy $ cat < in.txt | grep "apache" | tr a-z A-Z > out.txt Connect APIs Streams APIs Producer / Consumer APIs
  • 8. 88 Apache Kafka API – ETL Analogy Source SinkConnectAPI ConnectAPI Streams API Extract Transform Load
  • 9. 99 Apache Kafka 101 Internals and Core Concepts
  • 10. 1010 Apache Kafka Concepts: Persistent Log Data Producer 0 1 2 3 4 5 6 7 8 9 10 11 12 writes Data Consumer (offset = 7) Data Consumer (offset = 11) reads reads
  • 11. 1111 Apache Kafka Concepts: Anatomy of a Topic 0 1 2 3 4 5 6 7 8 9 10 11 12partition 0 0 1 2 3 4 5 6 7 40 1 2 3 5 partition 1 partition 2 writes
  • 12. 1212 Apache Kafka Concepts: Log Storage offset index timestamp index offsets: 0 - 10000 offset index timestamp index offsets: 10001 - 20000 offset index timestamp index offsets: 20001 - 30000
  • 13. 1313 Apache Kafka Concepts: Message Format 8 bytes 4 bytes 4 bytes 8 bytes 4 bytes varies 4 bytes varies offset length CRC timesta mp key length value length key content value content magic byte 1 byte attribute 1 byte
  • 14. 1414 Apache Kafka Concepts: Producers and Consumers Producer Producer Producer Consumer Consumer Broker Broker Broker
  • 15. 1515 Apache Kafka Concepts: Topics and Partitions Producer Producer Producer Consumer Consumer Broker Broker Broker T0: P0 T0: P2 T0: P1 T0: P3 T1: P0 T1: P1
  • 16. 1616 Apache Kafka Concepts: Fault Tolerance and Replication Producer Producer Producer Consumer Consumer Broker Broker Broker T0: P0 T0: P0 (Replica 1) T1: P0 T1: P0 (Replica 1)
  • 17. 1717 Apache Kafka Concepts: Consumer Groups Producer Producer Producer Consumer Broker Broker Broker T0: P0 T0: P2 T0: P1 T0: P3 T1: P0 T1: P1 Consumer Consumer Consumer
  • 18. 1818 The Connect API of Apache Kafka®  Centralized management and configuration  Support for hundreds of technologies including RDBMS, Elasticsearch, HDFS, S3  Supports CDC ingest of events from RDBMS  Preserves data schema  Fault tolerant and automatically load balanced  Extensible API  Single Message Transforms  Part of Apache Kafka, included in Confluent Open Source Reliable and scalable integration of Kafka with other systems – no coding required. { "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector", "connection.url": "jdbc:mysql://localhost:3306/demo?user=rmoff&password=foo", "table.whitelist": "sales,orders,customers" } https://docs.confluent.io/current/connect/
  • 19. 1919 Build Applications, not Clusters <dependency> <groupId>org.apache.kafka</groupId> <artifactId>kafka-streams</artifactId> <version>1.0.0</version> </dependency>
  • 21. 2121 How do I run in production?
  • 22. 2222 How do I run in production? Uncool Cool
  • 23. 2323 How do I run in production? http://docs.confluent.io/current/streams/introduction.html
  • 27. 2727 Typical High Level Architecture Real-time Data Ingestion
  • 28. 2828 Typical High Level Architecture Stream Processing Real-time Data Ingestion
  • 29. 2929 Typical High Level Architecture Stream Processing Storage Real-time Data Ingestion
  • 30. 3030 Typical High Level Architecture Data Publishing / Visualization Stream Processing Storage Real-time Data Ingestion
  • 31. 3131 How many clusters do you count? NoSQL (Cassandra, HBase, Couchbase, MongoDB, …) or Elasticsearch, Solr, … Storm, Flink, Spark Streaming, Ignite, Akka Streams, Apex, … HDFS, NFS, Ceph, GlusterFS, Lustre, ... Apache Kafka
  • 32. 3232 Simplicity is the Ultimate Sophistication Node.js Apache Kafka Distributed Streaming Platform Publish & Subscribe to streams of data like a messaging system Store streams of data safely in a distributed replicated cluster Process streams of data efficiently and in real-time
  • 33. 3333 Duality of Streams and Tables http://docs.confluent.io/current/streams/concepts.html#duality-of-streams-and-tables
  • 34. 3434 Duality of Streams and Tables http://docs.confluent.io/current/streams/concepts.html#duality-of-streams-and-tables
  • 38. 3838 WordCount (and Java 8+) WordCountLambdaExample.java final Properties streamsConfiguration = new Properties(); streamsConfiguration.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordcount-lambda-example"); streamsConfiguration.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers); ... final Serde<String> stringSerde = Serdes.String(); final Serde<Long> longSerde = Serdes.Long(); final KStreamBuilder builder = new KStreamBuilder(); final KStream<String, String> textLines = builder.stream(stringSerde, stringSerde, "TextLinesTopic"); final Pattern pattern = Pattern.compile("W+", Pattern.UNICODE_CHARACTER_CLASS); final KTable<String, Long> wordCounts = textLines .flatMapValues(value -> Arrays.asList(pattern.split(value.toLowerCase()))) .groupBy((key, word) -> word) .count("Counts"); wordCounts.to(stringSerde, longSerde, "WordsWithCountsTopic"); final KafkaStreams streams = new KafkaStreams(builder, streamsConfiguration); streams.cleanUp(); streams.start(); Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
  • 39. 3939 Easy to Develop with, Easy to Test WordCountLambdaIntegrationTest.java EmbeddedSingleNodeKafkaCluster CLUSTER = new EmbeddedSingleNodeKafkaCluster(); ... CLUSTER.createTopic(inputTopic); ... Properties producerConfig = new Properties(); producerConfig.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
  • 40. 4040 The Streams API of Apache Kafka®  No separate processing cluster required  Develop on Mac, Linux, Windows  Deploy to containers, VMs, bare metal, cloud  Powered by Kafka: elastic, scalable, distributed, battle-tested  Perfect for small, medium, large use cases  Fully integrated with Kafka security  Exactly-once processing semantics  Part of Apache Kafka, included in Confluent Open Source Write standard Java applications and microservices to process your data in real-time KStream<User, PageViewEvent> pageViews = builder.stream("pageviews-topic"); KTable<Windowed<User>, Long> viewsPerUserSession = pageViews .groupByKey() .count(SessionWindows.with(TimeUnit.MINUTES.toMillis(5)), "session-views"); https://docs.confluent.io/current/streams/
  • 41. 4141 KSQL: a Streaming SQL Engine for Apache Kafka® from Confluent  No coding required, all you need is SQL  No separate processing cluster required  Powered by Kafka: elastic, scalable, distributed, battle-tested CREATE TABLE possible_fraud AS SELECT card_number, count(*) FROM authorization_attempts WINDOW TUMBLING (SIZE 5 SECONDS) GROUP BY card_number HAVING count(*) > 3; CREATE STREAM vip_actions AS SELECT userid, page, action FROM clickstream c LEFT JOIN users u ON c.userid = u.userid WHERE u.level = 'Platinum'; KSQL is the simplest way to process streams of data in real-time  Perfect for streaming ETL, anomaly detection, event monitoring, and more  Part of Confluent Open Source https://github.com/confluentinc/ksql
  • 42. Do you think that’s a table you are querying ?
  • 43. 4343 KSQL in less than 5 minutes https://www.youtube.com/watch?v=A45uRzJiv7I
  • 44. 4444 Confluent Enterprise: Logical Architecture Kafka Cluster Mainframe Kafka Connect Servers Kafka ConnectRDBMS Hadoop Cassandra Elasticsearch Kafka Connect Servers Kafka Connect Files Producer Application Consumer ApplicationZookeeper Kafka Broker REST Proxy Servers REST Proxy REST Client Control Center Servers Control Center Schema Registry Servers Schema Registry Kafka Producer APIs Kafka Consumer APIs Stream Processing Application 1 Stream Client Stream Processing Application 2 Stream Client REST Proxy Servers REST Proxy REST Client
  • 45. 4545 Confluent Enterprise: Physical Architecture Rack 1 Kafka Broker #1 ToR Switch ToR Switch Schema Registry #1 Kafka Connect #1 Zookeeper #1 REST Proxy #1 Kafka Broker #4 Zookeeper #4 Rack 2 Kafka Broker #2 ToR Switch ToR Switch Schema Registry #2 Kafka Connect #2 Zookeeper #2 Kafka Broker #5 Zookeeper #5 Rack 3 Kafka Broker #3 ToR Switch ToR Switch Kafka Connect #3 Zookeeper #3 Core Switch Core Switch REST Proxy #2 Load Balancer Load Balancer Control Center #1 Control Center #2
  • 46. 4646 Confluent Completes Kafka Feature Benefit Apache Kafka Confluent Open Source Confluent Enterprise Apache Kafka High throughput, low latency, high availability, secure distributed streaming platform Kafka Connect API Advanced API for connecting external sources/destinations into Kafka Kafka Streams API Simple library that enables streaming application development within the Kafka framework Additional Clients Supports non-Java clients; C, C++, Python, .NET and several others REST Proxy Provides universal access to Kafka from any network connected device via HTTP Schema Registry Central registry for the format of Kafka data – guarantees all data is always consumable Pre-Built Connectors HDFS, JDBC, Elasticsearch, Amazon S3 and other connectors fully certified and supported by Confluent JMS Client Support for legacy Java Message Service (JMS) applications consuming and producing directly from Kafka Confluent Control Center Enables easy connector management, monitoring and alerting for a Kafka cluster Auto Data Balancer Rebalancing data across cluster to remove bottlenecks Replicator Multi-datacenter replication simplifies and automates MDC Kafka clusters Support Enterprise class support to keep your Kafka environment running at top performance Community Community 24x7x365
  • 47. 4747 Big Data and Fast Data Ecosystems Synchronous Req/Response 0 – 100s ms Near Real Time > 100s ms Offline Batch > 1 hour Apache Kafka Stream Data Platform Search RDBMS Apps Monitoring Real-time Analytics NoSQL Stream Processing Apache Hadoop Data Lake Impala DWH Hive Spark Map-Reduce Confluent HDFS Connector (exactly once semantics) https://www.confluent.io/blog/the-value-of-apache-kafka-in-big-data-ecosystem/
  • 48. 4848 Building a Microservices Ecosystem with Kafka Streams and KSQL https://www.confluent.io/blog/building-a-microservices-ecosystem-with-kafka-streams-and-ksql/ https://github.com/confluentinc/kafka-streams-examples/tree/3.3.0-post/src/main/java/io/confluent/examples/streams/microservices
  • 49. 4949 Microservices: References Blog posts series: Part 1: The Data Dichotomy: Rethinking the Way We Treat Data and Services https://www.confluent.io/blog/data-dichotomy-rethinking-the-way-we-treat-data-and-services/ Part 2: Build Services on a Backbone of Events https://www.confluent.io/blog/build-services-backbone-events/ Part 3: Using Apache Kafka as a Scalable, Event-Driven Backbone for Service Architectures https://www.confluent.io/blog/apache-kafka-for-service-architectures/ Part 4: Chain Services with Exactly Once Guarantees https://www.confluent.io/blog/chain-services-exactly-guarantees/ Part 5: Messaging as the Single Source of Truth https://www.confluent.io/blog/messaging-single-source-truth/ Part 6: Leveraging the Power of a Database Unbundled https://www.confluent.io/blog/leveraging-power-database-unbundled/ Part 7: Building a Microservices Ecosystem with Kafka Streams and KSQL https://www.confluent.io/blog/building-a-microservices-ecosystem-with-kafka-streams-and-ksql/ Whitepaper: Microservices in the Apache Kafka™ Ecosystem https://www.confluent.io/resources/microservices-in-the-apache-kafka-ecosystem/
  • 50. 5050 Apache Kafka Security Security • Processes customer data • Regulatory requirements • Legal compliance • Internal security policies • Need is not limited to industries such as finance, healthcare, or governmental services Authentication • Scenario example: “Only certain applications may talk to the production Kafka cluster” • Client authentication via SASL – e.g. Kerberos, Active Directory Authorization • Scenario example: “Only certain applications may read data from sensitive Kafka topics” • Restrict who can create, write to, read from topics, and more Encryption • Scenario example: “Data-in-transit between apps and Kafka clusters must be encrypted” • SSL supported • Encrypts data exchanged between Kafka brokers, between Kafka brokers and Kafka clients/apps Help meeting security requirements by supporting:
  • 51. 5151 Enterprise Ready Multi-Datacenter Replication for Kafka Data Center in USA Kafka Cluster (USA) Kafka Broker 1 Kafka Broker 2 Kafka Broker 3 ZooKeeper 1 ZooKeeper 2 ZooKeeper 3 Control Center Kafka Connect Cluster Replicator 1 Replicator 2 Data Center in EMEA Kafka Cluster (EU) Kafka Broker 1 Kafka Broker 2 Kafka Broker 3 ZooKeeper 1 ZooKeeper 2 ZooKeeper 3 Control Center Kafka Connect Cluster Replicator 1 Replicator 2 Available only with Confluent Enterprise Apache Kafka and Confluent Open Source
  • 52. 5252 Cloud Synchronization and Migrations with Confluent Enterprise: Before DC1 DB2 DB1 DWH App2 App3 App4 KV2KV3 DB3 App2-v2 App5 App7 App1-v2 AWS App8 DWH App1 Challenges • Each team/department must execute their own cloud migration • May be moving the same data multiple times • Each box represented here require development, testing, deployment, monitoring and maintenance KV
  • 53. 5353 DC1 Cloud Synchronization and Migrations with Confluent Enterprise: After DB2 DB1 KV DWH App2 App4 KV2KV3 App2-v2 App5 App7 App1-v2 AWS App8 DWH App1 Kafka Kafka App3 Benefits • Continuous low-latency synchronization • Centralized manageability and monitoring – Track at event level data produced in all data centers • Security and governance – Track and control where data comes from and who is accessing it • Cost Savings – Move Data Once DB3
  • 54. 5454 About Confluent and Apache Kafka™ 70% of active Kafka Committers Founded September 2014 Technology developed while at LinkedIn Founded by the creators of Apache Kafka
  • 55. 5555 Apache Kafka: PMC members and committers https://kafka.apache.org/committers PMC PMC PMC PMCPMC PMC PMC PMC PMC PMC PMC
  • 56. 5656 Download Confluent Platform: the easiest way to get you started https://www.confluent.io/download/
  • 57. 5757 Books: get them all three in PDF format from Confluent website! https://www.confluent.io/apache-kafka-stream-processing-book-bundle
  • 58. 5858 Discount code: kacom17 Presented by https://kafka-summit.org/ Presented by