SlideShare a Scribd company logo
1 of 72
1Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Introducing Kafka’s Streams API
Taking real-time processing to the mainstream
Michael Noll <michael@confluent.io>
Product manager, Confluent
2Apache Kafka meetup, Munich, Germany, Jan 25, 2017
3Apache Kafka meetup, Munich, Germany, Jan 25, 2017
4Apache Kafka meetup, Munich, Germany, Jan 25, 2017
5Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Our Dream Our Reality
6Confidential
Kafka’s Streams API
Taking real-time processing to the mainstream
7Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Taking real-time processing to the mainstream
Kafka’s Streams API
• Powerful yet easy-to-use library to build stream processing apps
• Apps are standard Java applications that run on client machines
• Part of open source Apache Kafka, introduced in 0.10+
• https://github.com/apache/kafka/tree/trunk/streams
Streams
API
Your App
Kafka
Cluster
8Apache Kafka meetup, Munich, Germany, Jan 25, 2017
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-streams</artifactId>
<version>0.10.1.1</version>
</dependency>
Build Applications, not Clusters!
9Apache Kafka meetup, Munich, Germany, Jan 25, 2017
“Cluster to go”: elastic, scalable, distributed, fault-tolerant, secure apps
10Apache Kafka meetup, Munich, Germany, Jan 25, 2017
”Database to go”: tables, state management, interactive queries
11Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Equally viable for S / M / L / XL / XXL use cases
Ok. Ok. Ok.
12Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Runs everywhere: from containers to cloud
13Apache Kafka meetup, Munich, Germany, Jan 25, 2017
When to use Kafka’s Streams API
Use case examples
• Customer 360-degree view
• Fleet or inventory management
• Fraud detection
• Real-time monitoring & intelligence
• Location-based marketing
• Virtual Reality (avatar replication)
• <and more>
To build real-time applications for your core business
Scenarios
• Microservices
• Fast Data apps for small and big data
• Reactive applications
• Continuous queries and
transformations
• Event-triggered processes
• The “T” in ETL
• <and more>
14Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Some public use cases in the wild & external articles
• Why Kafka Streams: towards a real-time streaming architecture, by Sky Betting and
Gaming
• http://engineering.skybettingandgaming.com/2017/01/23/streaming-architectures/
• Applying Kafka’s Streams API for social messaging at LINE Corp.
• http://developers.linecorp.com/blog/?p=3960
• Production pipeline at LINE, a social platform based in Japan with 220+ million users
• Microservices and Reactive Applications at Capital One
• https://speakerdeck.com/bobbycalderwood/commander-decoupled-immutable-rest-apis-with-kafka-streams
• Containerized Kafka Streams applications in Scala, by Hive Streaming
• https://www.madewithtea.com/processing-tweets-with-kafka-streams.html
• Geo-spatial data analysis
• http://www.infolace.com/blog/2016/07/14/simple-spatial-windowing-with-kafka-streams/
• Language classification with machine learning
• https://dzone.com/articles/machine-learning-with-kafka-streams
15Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Kafka Summit NYC, May 09
Here, the community will share
latest Kafka Streams use cases.
http://kafka-summit.org/
16Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Do more with less
17Confidential
Architecture comparison: use case example
Real-time dashboard for security monitoring
“Which of my data centers are under attack?”
18Apache Kafka meetup, Munich, Germany, Jan 25, 2017
19Apache Kafka meetup, Munich, Germany, Jan 25, 2017
With Streams API
20Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Organizational benefits: decouple teams and roadmaps, scale people
21Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Available APIs
22Apache Kafka meetup, Munich, Germany, Jan 25, 2017
• API option 1: DSL (declarative)
KStream<Integer, Integer> input =
builder.stream("numbers-topic");
// Stateless computation
KStream<Integer, Integer> doubled =
input.mapValues(v -> v * 2);
// Stateful computation
KTable<Integer, Integer> sumOfOdds = input
.filter((k,v) -> v % 2 != 0)
.selectKey((k, v) -> 1)
.groupByKey()
.reduce((v1, v2) -> v1 + v2, "sum-of-odds");
The preferred API for most use cases.
9 out of 10 users pick the DSL.
Particularly appeals to:
• Fans of Scala, functional programming
• Users familiar with e.g. Spark
23Apache Kafka meetup, Munich, Germany, Jan 25, 2017
• API option 2: Processor API (imperative)
class PrintToConsoleProcessor
implements Processor<K, V> {
@Override
public void init(ProcessorContext context) {}
@Override
void process(K key, V value) {
System.out.println("Got value " + value);
}
@Override
void punctuate(long timestamp) {}
@Override
void close() {}
}
Full flexibility but more manual work
Appeals to:
• Users who require functionality that is
not yet available in the DSL
• Users familiar with e.g. Storm, Samza
• Still, check out the DSL!
24Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Writing and running your first application
• Preparation: Ensure Kafka cluster is accessible, has data to process
• Step 1: Write the application code in Java or Scala, see next slide
• Great starting point: https://github.com/confluentinc/examples
• Documentation: http://docs.confluent.io/current/streams/
• Step 2: Run the application
• During development: from your IDE, from CLI … (pro tip: Application Reset Tool is great for playing
around)
• In production: e.g. bundle as fat jar, then `java -cp my-fatjar.jar com.example.MyStreamsApp`
• http://docs.confluent.io/current/streams/developer-guide.html#running-a-kafka-streams-application
25Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Example: complete app, ready for production at large-scale
Word
Count
App configuration
Define processing
(here: WordCount)
Start processing
26Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Key concepts
27Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Key concepts
28Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Key concepts
29Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Key concepts
Kafka’s data model Kafka’s Streams API
30Confidential
Streams and Tables
Stream Processing meets Databases
31Apache Kafka meetup, Munich, Germany, Jan 25, 2017
32Apache Kafka meetup, Munich, Germany, Jan 25, 2017
33Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Key observation: close relationship between Streams and Tables
http://www.confluent.io/blog/introducing-kafka-streams-stream-processing-made-simple
http://docs.confluent.io/current/streams/concepts.html#duality-of-streams-and-tables
34Apache Kafka meetup, Munich, Germany, Jan 25, 2017
35Apache Kafka meetup, Munich, Germany, Jan 25, 2017
36Apache Kafka meetup, Munich, Germany, Jan 25, 2017
37Apache Kafka meetup, Munich, Germany, Jan 25, 2017
38Apache Kafka meetup, Munich, Germany, Jan 25, 2017
39Apache Kafka meetup, Munich, Germany, Jan 25, 2017
40Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Key features
41Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Key features in 0.10
• Native, 100%-compatible Kafka integration
42Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Native, 100% compatible Kafka integration
Read from Kafka
Write to Kafka
43Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Key features in 0.10
• Native, 100%-compatible Kafka integration
• Secure stream processing using Kafka’s security features
44Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Secure stream processing with the Streams API
• Your applications can leverage all client-side security features in Apache Kafka
• Security features include:
• Encrypting data-in-transit between applications and Kafka clusters
• Authenticating applications against Kafka clusters (“only some apps may talk to the production
cluster”)
• Authorizing application against Kafka clusters (“only some apps may read data from sensitive topics”)
Streams
API
Your App
Kafka
Cluster
”I’m the Payments app!” “Ok, you may read the Purchases topic.”
Data encryption
45Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Key features in 0.10
• Native, 100%-compatible Kafka integration
• Secure stream processing using Kafka’s security features
• Elastic and highly scalable
• Fault-tolerant
46Apache Kafka meetup, Munich, Germany, Jan 25, 2017
47Apache Kafka meetup, Munich, Germany, Jan 25, 2017
48Apache Kafka meetup, Munich, Germany, Jan 25, 2017
49Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Key features in 0.10
• Native, 100%-compatible Kafka integration
• Secure stream processing using Kafka’s security features
• Elastic and highly scalable
• Fault-tolerant
• Stateful and stateless computations
50Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Stateful computations
• Stateful computations include aggregations (e.g. counting), joins, and windowing
• State stores are the backbone of state management
• … are local for best performance
• … are continuously backed up to Kafka to enable elasticity and fault-tolerance
• ... are per stream task for isolation, think: share-nothing
• Pluggable storage engines
• Default: RocksDB (a key-value store) to allow for local state that is larger than available RAM
• You can also use your own storage engine
• From the user perspective
• DSL: no need to worry about anything, state management is automatically being done for you
• Processor API: direct access to state stores – very flexible but more manual work
51Apache Kafka meetup, Munich, Germany, Jan 25, 2017
52Apache Kafka meetup, Munich, Germany, Jan 25, 2017
53Apache Kafka meetup, Munich, Germany, Jan 25, 2017
54Apache Kafka meetup, Munich, Germany, Jan 25, 2017
55Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Use case: real-time, distributed joins at large scale
56Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Use case: real-time, distributed joins at large scale
57Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Use case: real-time, distributed joins at large scale
58Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Key features in 0.10
• Native, 100%-compatible Kafka integration
• Secure stream processing using Kafka’s security features
• Elastic and highly scalable
• Fault-tolerant
• Stateful and stateless computations
• Interactive queries
59Apache Kafka meetup, Munich, Germany, Jan 25, 2017
60Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Key features in 0.10
• Native, 100%-compatible Kafka integration
• Secure stream processing using Kafka’s security features
• Elastic and highly scalable
• Fault-tolerant
• Stateful and stateless computations
• Interactive queries
• Time model
61Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Time
62Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Time
A
C
B
63Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Key features in 0.10
• Native, 100%-compatible Kafka integration
• Secure stream processing using Kafka’s security features
• Elastic and highly scalable
• Fault-tolerant
• Stateful and stateless computations
• Interactive queries
• Time model
• Supports late-arriving and out-of-order data
64Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Out-of-order and late-arriving data: example
Users with mobile phones enter
airplane, lose Internet connectivity
Emails are being written
during the 8h flight
Internet connectivity is restored,
phones will send queued emails now,
though with an 8h delay
Bob writes Alice an
email at 2 P.M.
Bob’s email is finally
being sent at 10 P.M.
65Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Key features in 0.10
• Native, 100%-compatible Kafka integration
• Secure stream processing using Kafka’s security features
• Elastic and highly scalable
• Fault-tolerant
• Stateful and stateless computations
• Interactive queries
• Time model
• Supports late-arriving and out-of-order data
• Windowing
66Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Windowing
• To group related events in a stream
• Use case examples:
• Time-based analysis of ad impressions (”number of ads clicked in the past hour”)
• Monitoring statistics of telemetry data (“1min/5min/15min averages”)
• Analyzing user browsing sessions on a news site
Input data, where
colors represent
different users events
Rectangles denote
different event-time
windows
processing-time
event-time
windowing
alice
bob
dave
67Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Key features in 0.10
• Native, 100%-compatible Kafka integration
• Secure stream processing using Kafka’s security features
• Elastic and highly scalable
• Fault-tolerant
• Stateful and stateless computations
• Interactive queries
• Time model
• Supports late-arriving and out-of-order data
• Windowing
• Millisecond processing latency, no micro-batching
• At-least-once processing guarantees (exactly-once is in the works as we speak)
68Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Roadmap Outlook
69Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Roadmap outlook for Kafka Streams
Upcoming in Confluent 3.2 & Apache Kafka 0.10.2
• Sessionization aka “session windows” -- e.g. for analyzing user browsing behavior
• Global KTables (vs. today’s partitioned KTables) – e.g. for convenient facts-to-dimensions joins
• Now you can use newer versions of the Streams API against older clusters, too
• Further operational metrics to improve monitoring and 24x7 operations of apps
Feature highlight for 2017
• Exactly-Once processing semantics
• But much more to come!
70Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Wrapping Up
71Apache Kafka meetup, Munich, Germany, Jan 25, 2017
Where to go from here
• Kafka’s Streams API is available in Confluent Platform 3.1 and in Apache Kafka 0.10.1
• http://www.confluent.io/download
• Demo applications: https://github.com/confluentinc/examples
• Interactive Queries, Joins, Security, Windowing, Avro integration, …
• Confluent documentation: http://docs.confluent.io/current/streams/
• Quickstart, Concepts, Architecture, Developer Guide, FAQ
• Recorded talks
• Introduction to Kafka’s Streams API:
http://www.youtube.com/watch?v=o7zSLNiTZbA
• Application Development and Data in the Emerging World of Stream Processing (higher level talk):
https://www.youtube.com/watch?v=JQnNHO5506w
72Confidential
Thank You

More Related Content

What's hot

0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 20190-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019confluent
 
A Streaming Platform Architecture Based on Apache Kafka
A Streaming Platform Architecture Based on Apache KafkaA Streaming Platform Architecture Based on Apache Kafka
A Streaming Platform Architecture Based on Apache Kafkaconfluent
 
Building Stream Processing Applications with Apache Kafka Using KSQL (Robin M...
Building Stream Processing Applications with Apache Kafka Using KSQL (Robin M...Building Stream Processing Applications with Apache Kafka Using KSQL (Robin M...
Building Stream Processing Applications with Apache Kafka Using KSQL (Robin M...confluent
 
Confluent building a real-time streaming platform using kafka streams and k...
Confluent   building a real-time streaming platform using kafka streams and k...Confluent   building a real-time streaming platform using kafka streams and k...
Confluent building a real-time streaming platform using kafka streams and k...Thomas Alex
 
KSQL Deep Dive - The Open Source Streaming Engine for Apache Kafka
KSQL Deep Dive - The Open Source Streaming Engine for Apache KafkaKSQL Deep Dive - The Open Source Streaming Engine for Apache Kafka
KSQL Deep Dive - The Open Source Streaming Engine for Apache KafkaKai Wähner
 
Kafka Summit NYC 2017 Hanging Out with Your Past Self in VR
Kafka Summit NYC 2017 Hanging Out with Your Past Self in VRKafka Summit NYC 2017 Hanging Out with Your Past Self in VR
Kafka Summit NYC 2017 Hanging Out with Your Past Self in VRconfluent
 
Intro to AsyncAPI
Intro to AsyncAPIIntro to AsyncAPI
Intro to AsyncAPIconfluent
 
Data Pipelines Made Simple with Apache Kafka
Data Pipelines Made Simple with Apache KafkaData Pipelines Made Simple with Apache Kafka
Data Pipelines Made Simple with Apache Kafkaconfluent
 
Introduction to Kafka connect
Introduction to Kafka connectIntroduction to Kafka connect
Introduction to Kafka connectKnoldus Inc.
 
Confluent Kafka and KSQL: Streaming Data Pipelines Made Easy
Confluent Kafka and KSQL: Streaming Data Pipelines Made EasyConfluent Kafka and KSQL: Streaming Data Pipelines Made Easy
Confluent Kafka and KSQL: Streaming Data Pipelines Made EasyKairo Tavares
 
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...confluent
 
Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?confluent
 
Integrating Apache Kafka Into Your Environment
Integrating Apache Kafka Into Your EnvironmentIntegrating Apache Kafka Into Your Environment
Integrating Apache Kafka Into Your Environmentconfluent
 
The Many Faces of Apache Kafka: Leveraging real-time data at scale
The Many Faces of Apache Kafka: Leveraging real-time data at scaleThe Many Faces of Apache Kafka: Leveraging real-time data at scale
The Many Faces of Apache Kafka: Leveraging real-time data at scaleNeha Narkhede
 
Simplify Governance of Streaming Data
Simplify Governance of Streaming Data Simplify Governance of Streaming Data
Simplify Governance of Streaming Data confluent
 
Kafka Connect and Streams (Concepts, Architecture, Features)
Kafka Connect and Streams (Concepts, Architecture, Features)Kafka Connect and Streams (Concepts, Architecture, Features)
Kafka Connect and Streams (Concepts, Architecture, Features)Kai Wähner
 
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...confluent
 
Analytics Beyond RAM Capacity using R
Analytics Beyond RAM Capacity using RAnalytics Beyond RAM Capacity using R
Analytics Beyond RAM Capacity using RAlex Palamides
 
Apache kafka-a distributed streaming platform
Apache kafka-a distributed streaming platformApache kafka-a distributed streaming platform
Apache kafka-a distributed streaming platformconfluent
 

What's hot (20)

0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 20190-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
 
A Streaming Platform Architecture Based on Apache Kafka
A Streaming Platform Architecture Based on Apache KafkaA Streaming Platform Architecture Based on Apache Kafka
A Streaming Platform Architecture Based on Apache Kafka
 
Building Stream Processing Applications with Apache Kafka Using KSQL (Robin M...
Building Stream Processing Applications with Apache Kafka Using KSQL (Robin M...Building Stream Processing Applications with Apache Kafka Using KSQL (Robin M...
Building Stream Processing Applications with Apache Kafka Using KSQL (Robin M...
 
Confluent building a real-time streaming platform using kafka streams and k...
Confluent   building a real-time streaming platform using kafka streams and k...Confluent   building a real-time streaming platform using kafka streams and k...
Confluent building a real-time streaming platform using kafka streams and k...
 
KSQL Deep Dive - The Open Source Streaming Engine for Apache Kafka
KSQL Deep Dive - The Open Source Streaming Engine for Apache KafkaKSQL Deep Dive - The Open Source Streaming Engine for Apache Kafka
KSQL Deep Dive - The Open Source Streaming Engine for Apache Kafka
 
Riding the Streaming Wave DIY style
Riding the Streaming Wave  DIY styleRiding the Streaming Wave  DIY style
Riding the Streaming Wave DIY style
 
Kafka Summit NYC 2017 Hanging Out with Your Past Self in VR
Kafka Summit NYC 2017 Hanging Out with Your Past Self in VRKafka Summit NYC 2017 Hanging Out with Your Past Self in VR
Kafka Summit NYC 2017 Hanging Out with Your Past Self in VR
 
Intro to AsyncAPI
Intro to AsyncAPIIntro to AsyncAPI
Intro to AsyncAPI
 
Data Pipelines Made Simple with Apache Kafka
Data Pipelines Made Simple with Apache KafkaData Pipelines Made Simple with Apache Kafka
Data Pipelines Made Simple with Apache Kafka
 
Introduction to Kafka connect
Introduction to Kafka connectIntroduction to Kafka connect
Introduction to Kafka connect
 
Confluent Kafka and KSQL: Streaming Data Pipelines Made Easy
Confluent Kafka and KSQL: Streaming Data Pipelines Made EasyConfluent Kafka and KSQL: Streaming Data Pipelines Made Easy
Confluent Kafka and KSQL: Streaming Data Pipelines Made Easy
 
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
 
Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?
 
Integrating Apache Kafka Into Your Environment
Integrating Apache Kafka Into Your EnvironmentIntegrating Apache Kafka Into Your Environment
Integrating Apache Kafka Into Your Environment
 
The Many Faces of Apache Kafka: Leveraging real-time data at scale
The Many Faces of Apache Kafka: Leveraging real-time data at scaleThe Many Faces of Apache Kafka: Leveraging real-time data at scale
The Many Faces of Apache Kafka: Leveraging real-time data at scale
 
Simplify Governance of Streaming Data
Simplify Governance of Streaming Data Simplify Governance of Streaming Data
Simplify Governance of Streaming Data
 
Kafka Connect and Streams (Concepts, Architecture, Features)
Kafka Connect and Streams (Concepts, Architecture, Features)Kafka Connect and Streams (Concepts, Architecture, Features)
Kafka Connect and Streams (Concepts, Architecture, Features)
 
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...
 
Analytics Beyond RAM Capacity using R
Analytics Beyond RAM Capacity using RAnalytics Beyond RAM Capacity using R
Analytics Beyond RAM Capacity using R
 
Apache kafka-a distributed streaming platform
Apache kafka-a distributed streaming platformApache kafka-a distributed streaming platform
Apache kafka-a distributed streaming platform
 

Viewers also liked

Being Ready for Apache Kafka - Apache: Big Data Europe 2015
Being Ready for Apache Kafka - Apache: Big Data Europe 2015Being Ready for Apache Kafka - Apache: Big Data Europe 2015
Being Ready for Apache Kafka - Apache: Big Data Europe 2015Michael Noll
 
Tuning kafka pipelines
Tuning kafka pipelinesTuning kafka pipelines
Tuning kafka pipelinesSumant Tambe
 
Apache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - VerisignApache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - VerisignMichael Noll
 
[Big Data Spain] Apache Spark Streaming + Kafka 0.10: an Integration Story
[Big Data Spain] Apache Spark Streaming + Kafka 0.10:  an Integration Story[Big Data Spain] Apache Spark Streaming + Kafka 0.10:  an Integration Story
[Big Data Spain] Apache Spark Streaming + Kafka 0.10: an Integration StoryJoan Viladrosa Riera
 
[Spark Summit EU 2017] Apache spark streaming + kafka 0.10 an integration story
[Spark Summit EU 2017] Apache spark streaming + kafka 0.10  an integration story[Spark Summit EU 2017] Apache spark streaming + kafka 0.10  an integration story
[Spark Summit EU 2017] Apache spark streaming + kafka 0.10 an integration storyJoan Viladrosa Riera
 

Viewers also liked (6)

Being Ready for Apache Kafka - Apache: Big Data Europe 2015
Being Ready for Apache Kafka - Apache: Big Data Europe 2015Being Ready for Apache Kafka - Apache: Big Data Europe 2015
Being Ready for Apache Kafka - Apache: Big Data Europe 2015
 
Tuning kafka pipelines
Tuning kafka pipelinesTuning kafka pipelines
Tuning kafka pipelines
 
Apache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - VerisignApache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - Verisign
 
Apache Kafka Best Practices
Apache Kafka Best PracticesApache Kafka Best Practices
Apache Kafka Best Practices
 
[Big Data Spain] Apache Spark Streaming + Kafka 0.10: an Integration Story
[Big Data Spain] Apache Spark Streaming + Kafka 0.10:  an Integration Story[Big Data Spain] Apache Spark Streaming + Kafka 0.10:  an Integration Story
[Big Data Spain] Apache Spark Streaming + Kafka 0.10: an Integration Story
 
[Spark Summit EU 2017] Apache spark streaming + kafka 0.10 an integration story
[Spark Summit EU 2017] Apache spark streaming + kafka 0.10  an integration story[Spark Summit EU 2017] Apache spark streaming + kafka 0.10  an integration story
[Spark Summit EU 2017] Apache spark streaming + kafka 0.10 an integration story
 

Similar to Introducing Apache Kafka's Streams API - Kafka meetup Munich, Jan 25 2017

Introducing Kafka's Streams API
Introducing Kafka's Streams APIIntroducing Kafka's Streams API
Introducing Kafka's Streams APIconfluent
 
Apache Kafka Open Source Ecosystem for Machine Learning at Extreme Scale (Apa...
Apache Kafka Open Source Ecosystem for Machine Learning at Extreme Scale (Apa...Apache Kafka Open Source Ecosystem for Machine Learning at Extreme Scale (Apa...
Apache Kafka Open Source Ecosystem for Machine Learning at Extreme Scale (Apa...Kai Wähner
 
How to Leverage the Apache Kafka Ecosystem to Productionize Machine Learning ...
How to Leverage the Apache Kafka Ecosystem to Productionize Machine Learning ...How to Leverage the Apache Kafka Ecosystem to Productionize Machine Learning ...
How to Leverage the Apache Kafka Ecosystem to Productionize Machine Learning ...Codemotion
 
Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...
Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...
Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...Kai Wähner
 
Kafka Streams: The Stream Processing Engine of Apache Kafka
Kafka Streams: The Stream Processing Engine of Apache KafkaKafka Streams: The Stream Processing Engine of Apache Kafka
Kafka Streams: The Stream Processing Engine of Apache KafkaEno Thereska
 
Building Event Streaming Microservices with Spring Boot and Apache Kafka | Ja...
Building Event Streaming Microservices with Spring Boot and Apache Kafka | Ja...Building Event Streaming Microservices with Spring Boot and Apache Kafka | Ja...
Building Event Streaming Microservices with Spring Boot and Apache Kafka | Ja...HostedbyConfluent
 
Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...
Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...
Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...Kai Wähner
 
Unleashing Apache Kafka and TensorFlow in Hybrid Cloud Architectures
Unleashing Apache Kafka and TensorFlow in Hybrid Cloud ArchitecturesUnleashing Apache Kafka and TensorFlow in Hybrid Cloud Architectures
Unleashing Apache Kafka and TensorFlow in Hybrid Cloud ArchitecturesKai Wähner
 
Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and...
Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and...Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and...
Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and...Kai Wähner
 
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017Monal Daxini
 
Building Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache KafkaBuilding Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache KafkaSlim Baltagi
 
Building streaming data applications using Kafka*[Connect + Core + Streams] b...
Building streaming data applications using Kafka*[Connect + Core + Streams] b...Building streaming data applications using Kafka*[Connect + Core + Streams] b...
Building streaming data applications using Kafka*[Connect + Core + Streams] b...Data Con LA
 
Event-Driven Stream Processing and Model Deployment with Apache Kafka, Kafka ...
Event-Driven Stream Processing and Model Deployment with Apache Kafka, Kafka ...Event-Driven Stream Processing and Model Deployment with Apache Kafka, Kafka ...
Event-Driven Stream Processing and Model Deployment with Apache Kafka, Kafka ...Kai Wähner
 
Event-Driven Model Serving: Stream Processing vs. RPC with Kafka and TensorFl...
Event-Driven Model Serving: Stream Processing vs. RPC with Kafka and TensorFl...Event-Driven Model Serving: Stream Processing vs. RPC with Kafka and TensorFl...
Event-Driven Model Serving: Stream Processing vs. RPC with Kafka and TensorFl...confluent
 
Couchbase and Apache Kafka - Bridging the gap between RDBMS and NoSQL
Couchbase and Apache Kafka - Bridging the gap between RDBMS and NoSQLCouchbase and Apache Kafka - Bridging the gap between RDBMS and NoSQL
Couchbase and Apache Kafka - Bridging the gap between RDBMS and NoSQLDATAVERSITY
 
Kafka Streams for Java enthusiasts
Kafka Streams for Java enthusiastsKafka Streams for Java enthusiasts
Kafka Streams for Java enthusiastsSlim Baltagi
 
Portable Streaming Pipelines with Apache Beam
Portable Streaming Pipelines with Apache BeamPortable Streaming Pipelines with Apache Beam
Portable Streaming Pipelines with Apache Beamconfluent
 
Machine Learning and Deep Learning Applied to Real Time with Apache Kafka Str...
Machine Learning and Deep Learning Applied to Real Time with Apache Kafka Str...Machine Learning and Deep Learning Applied to Real Time with Apache Kafka Str...
Machine Learning and Deep Learning Applied to Real Time with Apache Kafka Str...confluent
 
Apache Kafka Streams + Machine Learning / Deep Learning
Apache Kafka Streams + Machine Learning / Deep LearningApache Kafka Streams + Machine Learning / Deep Learning
Apache Kafka Streams + Machine Learning / Deep LearningKai Wähner
 

Similar to Introducing Apache Kafka's Streams API - Kafka meetup Munich, Jan 25 2017 (20)

Introducing Kafka's Streams API
Introducing Kafka's Streams APIIntroducing Kafka's Streams API
Introducing Kafka's Streams API
 
Apache Kafka Open Source Ecosystem for Machine Learning at Extreme Scale (Apa...
Apache Kafka Open Source Ecosystem for Machine Learning at Extreme Scale (Apa...Apache Kafka Open Source Ecosystem for Machine Learning at Extreme Scale (Apa...
Apache Kafka Open Source Ecosystem for Machine Learning at Extreme Scale (Apa...
 
How to Leverage the Apache Kafka Ecosystem to Productionize Machine Learning ...
How to Leverage the Apache Kafka Ecosystem to Productionize Machine Learning ...How to Leverage the Apache Kafka Ecosystem to Productionize Machine Learning ...
How to Leverage the Apache Kafka Ecosystem to Productionize Machine Learning ...
 
Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...
Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...
Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...
 
Kafka Streams: The Stream Processing Engine of Apache Kafka
Kafka Streams: The Stream Processing Engine of Apache KafkaKafka Streams: The Stream Processing Engine of Apache Kafka
Kafka Streams: The Stream Processing Engine of Apache Kafka
 
Building Event Streaming Microservices with Spring Boot and Apache Kafka | Ja...
Building Event Streaming Microservices with Spring Boot and Apache Kafka | Ja...Building Event Streaming Microservices with Spring Boot and Apache Kafka | Ja...
Building Event Streaming Microservices with Spring Boot and Apache Kafka | Ja...
 
Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...
Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...
Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...
 
Unleashing Apache Kafka and TensorFlow in Hybrid Cloud Architectures
Unleashing Apache Kafka and TensorFlow in Hybrid Cloud ArchitecturesUnleashing Apache Kafka and TensorFlow in Hybrid Cloud Architectures
Unleashing Apache Kafka and TensorFlow in Hybrid Cloud Architectures
 
Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and...
Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and...Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and...
Streaming Machine Learning with Python, Jupyter, TensorFlow, Apache Kafka and...
 
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
 
Kafka Explainaton
Kafka ExplainatonKafka Explainaton
Kafka Explainaton
 
Building Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache KafkaBuilding Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache Kafka
 
Building streaming data applications using Kafka*[Connect + Core + Streams] b...
Building streaming data applications using Kafka*[Connect + Core + Streams] b...Building streaming data applications using Kafka*[Connect + Core + Streams] b...
Building streaming data applications using Kafka*[Connect + Core + Streams] b...
 
Event-Driven Stream Processing and Model Deployment with Apache Kafka, Kafka ...
Event-Driven Stream Processing and Model Deployment with Apache Kafka, Kafka ...Event-Driven Stream Processing and Model Deployment with Apache Kafka, Kafka ...
Event-Driven Stream Processing and Model Deployment with Apache Kafka, Kafka ...
 
Event-Driven Model Serving: Stream Processing vs. RPC with Kafka and TensorFl...
Event-Driven Model Serving: Stream Processing vs. RPC with Kafka and TensorFl...Event-Driven Model Serving: Stream Processing vs. RPC with Kafka and TensorFl...
Event-Driven Model Serving: Stream Processing vs. RPC with Kafka and TensorFl...
 
Couchbase and Apache Kafka - Bridging the gap between RDBMS and NoSQL
Couchbase and Apache Kafka - Bridging the gap between RDBMS and NoSQLCouchbase and Apache Kafka - Bridging the gap between RDBMS and NoSQL
Couchbase and Apache Kafka - Bridging the gap between RDBMS and NoSQL
 
Kafka Streams for Java enthusiasts
Kafka Streams for Java enthusiastsKafka Streams for Java enthusiasts
Kafka Streams for Java enthusiasts
 
Portable Streaming Pipelines with Apache Beam
Portable Streaming Pipelines with Apache BeamPortable Streaming Pipelines with Apache Beam
Portable Streaming Pipelines with Apache Beam
 
Machine Learning and Deep Learning Applied to Real Time with Apache Kafka Str...
Machine Learning and Deep Learning Applied to Real Time with Apache Kafka Str...Machine Learning and Deep Learning Applied to Real Time with Apache Kafka Str...
Machine Learning and Deep Learning Applied to Real Time with Apache Kafka Str...
 
Apache Kafka Streams + Machine Learning / Deep Learning
Apache Kafka Streams + Machine Learning / Deep LearningApache Kafka Streams + Machine Learning / Deep Learning
Apache Kafka Streams + Machine Learning / Deep Learning
 

Recently uploaded

Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data VisualizationKianJazayeri1
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxTasha Penwell
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024Susanna-Assunta Sansone
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxSimranPal17
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxHaritikaChhatwal1
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...KarteekMane1
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxHimangsuNath
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataTecnoIncentive
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBoston Institute of Analytics
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
convolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfconvolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfSubhamKumar3239
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 

Recently uploaded (20)

Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data Visualization
 
Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptx
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptx
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptx
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded data
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
convolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfconvolutional neural network and its applications.pdf
convolutional neural network and its applications.pdf
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 

Introducing Apache Kafka's Streams API - Kafka meetup Munich, Jan 25 2017

  • 1. 1Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Introducing Kafka’s Streams API Taking real-time processing to the mainstream Michael Noll <michael@confluent.io> Product manager, Confluent
  • 2. 2Apache Kafka meetup, Munich, Germany, Jan 25, 2017
  • 3. 3Apache Kafka meetup, Munich, Germany, Jan 25, 2017
  • 4. 4Apache Kafka meetup, Munich, Germany, Jan 25, 2017
  • 5. 5Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Our Dream Our Reality
  • 6. 6Confidential Kafka’s Streams API Taking real-time processing to the mainstream
  • 7. 7Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Taking real-time processing to the mainstream Kafka’s Streams API • Powerful yet easy-to-use library to build stream processing apps • Apps are standard Java applications that run on client machines • Part of open source Apache Kafka, introduced in 0.10+ • https://github.com/apache/kafka/tree/trunk/streams Streams API Your App Kafka Cluster
  • 8. 8Apache Kafka meetup, Munich, Germany, Jan 25, 2017 <dependency> <groupId>org.apache.kafka</groupId> <artifactId>kafka-streams</artifactId> <version>0.10.1.1</version> </dependency> Build Applications, not Clusters!
  • 9. 9Apache Kafka meetup, Munich, Germany, Jan 25, 2017 “Cluster to go”: elastic, scalable, distributed, fault-tolerant, secure apps
  • 10. 10Apache Kafka meetup, Munich, Germany, Jan 25, 2017 ”Database to go”: tables, state management, interactive queries
  • 11. 11Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Equally viable for S / M / L / XL / XXL use cases Ok. Ok. Ok.
  • 12. 12Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Runs everywhere: from containers to cloud
  • 13. 13Apache Kafka meetup, Munich, Germany, Jan 25, 2017 When to use Kafka’s Streams API Use case examples • Customer 360-degree view • Fleet or inventory management • Fraud detection • Real-time monitoring & intelligence • Location-based marketing • Virtual Reality (avatar replication) • <and more> To build real-time applications for your core business Scenarios • Microservices • Fast Data apps for small and big data • Reactive applications • Continuous queries and transformations • Event-triggered processes • The “T” in ETL • <and more>
  • 14. 14Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Some public use cases in the wild & external articles • Why Kafka Streams: towards a real-time streaming architecture, by Sky Betting and Gaming • http://engineering.skybettingandgaming.com/2017/01/23/streaming-architectures/ • Applying Kafka’s Streams API for social messaging at LINE Corp. • http://developers.linecorp.com/blog/?p=3960 • Production pipeline at LINE, a social platform based in Japan with 220+ million users • Microservices and Reactive Applications at Capital One • https://speakerdeck.com/bobbycalderwood/commander-decoupled-immutable-rest-apis-with-kafka-streams • Containerized Kafka Streams applications in Scala, by Hive Streaming • https://www.madewithtea.com/processing-tweets-with-kafka-streams.html • Geo-spatial data analysis • http://www.infolace.com/blog/2016/07/14/simple-spatial-windowing-with-kafka-streams/ • Language classification with machine learning • https://dzone.com/articles/machine-learning-with-kafka-streams
  • 15. 15Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Kafka Summit NYC, May 09 Here, the community will share latest Kafka Streams use cases. http://kafka-summit.org/
  • 16. 16Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Do more with less
  • 17. 17Confidential Architecture comparison: use case example Real-time dashboard for security monitoring “Which of my data centers are under attack?”
  • 18. 18Apache Kafka meetup, Munich, Germany, Jan 25, 2017
  • 19. 19Apache Kafka meetup, Munich, Germany, Jan 25, 2017 With Streams API
  • 20. 20Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Organizational benefits: decouple teams and roadmaps, scale people
  • 21. 21Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Available APIs
  • 22. 22Apache Kafka meetup, Munich, Germany, Jan 25, 2017 • API option 1: DSL (declarative) KStream<Integer, Integer> input = builder.stream("numbers-topic"); // Stateless computation KStream<Integer, Integer> doubled = input.mapValues(v -> v * 2); // Stateful computation KTable<Integer, Integer> sumOfOdds = input .filter((k,v) -> v % 2 != 0) .selectKey((k, v) -> 1) .groupByKey() .reduce((v1, v2) -> v1 + v2, "sum-of-odds"); The preferred API for most use cases. 9 out of 10 users pick the DSL. Particularly appeals to: • Fans of Scala, functional programming • Users familiar with e.g. Spark
  • 23. 23Apache Kafka meetup, Munich, Germany, Jan 25, 2017 • API option 2: Processor API (imperative) class PrintToConsoleProcessor implements Processor<K, V> { @Override public void init(ProcessorContext context) {} @Override void process(K key, V value) { System.out.println("Got value " + value); } @Override void punctuate(long timestamp) {} @Override void close() {} } Full flexibility but more manual work Appeals to: • Users who require functionality that is not yet available in the DSL • Users familiar with e.g. Storm, Samza • Still, check out the DSL!
  • 24. 24Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Writing and running your first application • Preparation: Ensure Kafka cluster is accessible, has data to process • Step 1: Write the application code in Java or Scala, see next slide • Great starting point: https://github.com/confluentinc/examples • Documentation: http://docs.confluent.io/current/streams/ • Step 2: Run the application • During development: from your IDE, from CLI … (pro tip: Application Reset Tool is great for playing around) • In production: e.g. bundle as fat jar, then `java -cp my-fatjar.jar com.example.MyStreamsApp` • http://docs.confluent.io/current/streams/developer-guide.html#running-a-kafka-streams-application
  • 25. 25Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Example: complete app, ready for production at large-scale Word Count App configuration Define processing (here: WordCount) Start processing
  • 26. 26Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Key concepts
  • 27. 27Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Key concepts
  • 28. 28Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Key concepts
  • 29. 29Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Key concepts Kafka’s data model Kafka’s Streams API
  • 30. 30Confidential Streams and Tables Stream Processing meets Databases
  • 31. 31Apache Kafka meetup, Munich, Germany, Jan 25, 2017
  • 32. 32Apache Kafka meetup, Munich, Germany, Jan 25, 2017
  • 33. 33Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Key observation: close relationship between Streams and Tables http://www.confluent.io/blog/introducing-kafka-streams-stream-processing-made-simple http://docs.confluent.io/current/streams/concepts.html#duality-of-streams-and-tables
  • 34. 34Apache Kafka meetup, Munich, Germany, Jan 25, 2017
  • 35. 35Apache Kafka meetup, Munich, Germany, Jan 25, 2017
  • 36. 36Apache Kafka meetup, Munich, Germany, Jan 25, 2017
  • 37. 37Apache Kafka meetup, Munich, Germany, Jan 25, 2017
  • 38. 38Apache Kafka meetup, Munich, Germany, Jan 25, 2017
  • 39. 39Apache Kafka meetup, Munich, Germany, Jan 25, 2017
  • 40. 40Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Key features
  • 41. 41Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Key features in 0.10 • Native, 100%-compatible Kafka integration
  • 42. 42Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Native, 100% compatible Kafka integration Read from Kafka Write to Kafka
  • 43. 43Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Key features in 0.10 • Native, 100%-compatible Kafka integration • Secure stream processing using Kafka’s security features
  • 44. 44Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Secure stream processing with the Streams API • Your applications can leverage all client-side security features in Apache Kafka • Security features include: • Encrypting data-in-transit between applications and Kafka clusters • Authenticating applications against Kafka clusters (“only some apps may talk to the production cluster”) • Authorizing application against Kafka clusters (“only some apps may read data from sensitive topics”) Streams API Your App Kafka Cluster ”I’m the Payments app!” “Ok, you may read the Purchases topic.” Data encryption
  • 45. 45Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Key features in 0.10 • Native, 100%-compatible Kafka integration • Secure stream processing using Kafka’s security features • Elastic and highly scalable • Fault-tolerant
  • 46. 46Apache Kafka meetup, Munich, Germany, Jan 25, 2017
  • 47. 47Apache Kafka meetup, Munich, Germany, Jan 25, 2017
  • 48. 48Apache Kafka meetup, Munich, Germany, Jan 25, 2017
  • 49. 49Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Key features in 0.10 • Native, 100%-compatible Kafka integration • Secure stream processing using Kafka’s security features • Elastic and highly scalable • Fault-tolerant • Stateful and stateless computations
  • 50. 50Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Stateful computations • Stateful computations include aggregations (e.g. counting), joins, and windowing • State stores are the backbone of state management • … are local for best performance • … are continuously backed up to Kafka to enable elasticity and fault-tolerance • ... are per stream task for isolation, think: share-nothing • Pluggable storage engines • Default: RocksDB (a key-value store) to allow for local state that is larger than available RAM • You can also use your own storage engine • From the user perspective • DSL: no need to worry about anything, state management is automatically being done for you • Processor API: direct access to state stores – very flexible but more manual work
  • 51. 51Apache Kafka meetup, Munich, Germany, Jan 25, 2017
  • 52. 52Apache Kafka meetup, Munich, Germany, Jan 25, 2017
  • 53. 53Apache Kafka meetup, Munich, Germany, Jan 25, 2017
  • 54. 54Apache Kafka meetup, Munich, Germany, Jan 25, 2017
  • 55. 55Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Use case: real-time, distributed joins at large scale
  • 56. 56Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Use case: real-time, distributed joins at large scale
  • 57. 57Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Use case: real-time, distributed joins at large scale
  • 58. 58Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Key features in 0.10 • Native, 100%-compatible Kafka integration • Secure stream processing using Kafka’s security features • Elastic and highly scalable • Fault-tolerant • Stateful and stateless computations • Interactive queries
  • 59. 59Apache Kafka meetup, Munich, Germany, Jan 25, 2017
  • 60. 60Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Key features in 0.10 • Native, 100%-compatible Kafka integration • Secure stream processing using Kafka’s security features • Elastic and highly scalable • Fault-tolerant • Stateful and stateless computations • Interactive queries • Time model
  • 61. 61Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Time
  • 62. 62Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Time A C B
  • 63. 63Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Key features in 0.10 • Native, 100%-compatible Kafka integration • Secure stream processing using Kafka’s security features • Elastic and highly scalable • Fault-tolerant • Stateful and stateless computations • Interactive queries • Time model • Supports late-arriving and out-of-order data
  • 64. 64Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Out-of-order and late-arriving data: example Users with mobile phones enter airplane, lose Internet connectivity Emails are being written during the 8h flight Internet connectivity is restored, phones will send queued emails now, though with an 8h delay Bob writes Alice an email at 2 P.M. Bob’s email is finally being sent at 10 P.M.
  • 65. 65Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Key features in 0.10 • Native, 100%-compatible Kafka integration • Secure stream processing using Kafka’s security features • Elastic and highly scalable • Fault-tolerant • Stateful and stateless computations • Interactive queries • Time model • Supports late-arriving and out-of-order data • Windowing
  • 66. 66Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Windowing • To group related events in a stream • Use case examples: • Time-based analysis of ad impressions (”number of ads clicked in the past hour”) • Monitoring statistics of telemetry data (“1min/5min/15min averages”) • Analyzing user browsing sessions on a news site Input data, where colors represent different users events Rectangles denote different event-time windows processing-time event-time windowing alice bob dave
  • 67. 67Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Key features in 0.10 • Native, 100%-compatible Kafka integration • Secure stream processing using Kafka’s security features • Elastic and highly scalable • Fault-tolerant • Stateful and stateless computations • Interactive queries • Time model • Supports late-arriving and out-of-order data • Windowing • Millisecond processing latency, no micro-batching • At-least-once processing guarantees (exactly-once is in the works as we speak)
  • 68. 68Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Roadmap Outlook
  • 69. 69Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Roadmap outlook for Kafka Streams Upcoming in Confluent 3.2 & Apache Kafka 0.10.2 • Sessionization aka “session windows” -- e.g. for analyzing user browsing behavior • Global KTables (vs. today’s partitioned KTables) – e.g. for convenient facts-to-dimensions joins • Now you can use newer versions of the Streams API against older clusters, too • Further operational metrics to improve monitoring and 24x7 operations of apps Feature highlight for 2017 • Exactly-Once processing semantics • But much more to come!
  • 70. 70Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Wrapping Up
  • 71. 71Apache Kafka meetup, Munich, Germany, Jan 25, 2017 Where to go from here • Kafka’s Streams API is available in Confluent Platform 3.1 and in Apache Kafka 0.10.1 • http://www.confluent.io/download • Demo applications: https://github.com/confluentinc/examples • Interactive Queries, Joins, Security, Windowing, Avro integration, … • Confluent documentation: http://docs.confluent.io/current/streams/ • Quickstart, Concepts, Architecture, Developer Guide, FAQ • Recorded talks • Introduction to Kafka’s Streams API: http://www.youtube.com/watch?v=o7zSLNiTZbA • Application Development and Data in the Emerging World of Stream Processing (higher level talk): https://www.youtube.com/watch?v=JQnNHO5506w

Editor's Notes

  1. Q1: Who knows Kafka? Runs Kafka in production? Q2: Who knows about Kafka’s Streams API?
  2. Imagine we want to model and run a hotel. Here, one of the roles/responsibilities we need to put in place is that of a security guard that guards access to the building. In the world of software engineering, we may opt to implement the security guard as a microservice that runs on a single machine.
  3. However, there are many more roles we need, and thus more microservices.
  4. And, as our hotel business is growing, a single security guard is not sufficient any longer – we need a security team! At this point our microservice needs to scale, and we need many machines instead of a single one.
  5. But alas, up to now what we described in the slides before has been very challenging in practice, notably because of the tools available to us as software engineers. There’s a tremendous discrepancy between how we want to work and how we actually do work.
  6. Interactive Queries is a feature of Streams that lets you directly access the latest processing results from other applications. For many use cases, this means you no longer need to operate and interface with external database that run next to your applications.
  7. We can also share further, non-public information on request (may require NDA).
  8. Because it’s often about core business applications, many enterprise users of Kafka Streams don’t like to speak about it publicly. However, some use cases will be shared at the upcoming Kafka Summit NYC.
  9. ~ at 10mins
  10. ~ at 10mins
  11. A common architecture to implement such a real-time dashboard would be the following: 1. Capture events in real-time into Kafka. 2. Set up a processing cluster like Spark, Storm. 3. Write and submit a processing “job” to the cluster. 4. Store the processing results into an external system or database. 5. Present the processing results to the end user via a front-end dashboard application. Drawback: Lots of complexity, many moving parts. This architecture is often more part of the problem than part of the solution.
  12. “Spot the app” (the blue parts) Some stream processing technologies (such as Kafka/Streams) allow you to implement this use case in a much more simplified manner: through “normal” applications! Much simpler deployment and upgrade model, clear ownership rather than split ownership w/ your app spread all over the place. For example, in our previous online talk "Demystifying Stream Processing with Apache Kafka” we explained how Kafka’s Streams API allows you to build stream processing applications with “cluster to-go” and “database to-go” functionality, i.e. all the benefits of “cluster technology” but without the drawbacks of cluster technologies (such as having to understand, install, operate, and integrate all these clusters).
  13. Empower teams in LOB to innovate and be agile Decouple LOB from infrastructure teams Cf. Conway’s law: “organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations”
  14. This is a full-fledged application that you can run in production like this (and e.g. containerize) to process millions of messages per second! And it can be run across 1 / 10 / 100 app instances or client machines!
  15. Kafka Streams has a very similar data model as Kafka Core. (We won’t need to go into further details here.)
  16. The processing logic of your application is defined through a processor topology, which consists of nodes (stream processors = stream processing steps) and edges (streams) that connect the nodes. If you use the DSL, all of this is hidden from you. If you use the Processor API, you can (and must) define the topology manually.
  17. Kafka partitions data for storing and transporting it. Kafka Streams partitions data for processing it. In both cases, this partitioning enables data locality, elasticity, scalability, high performance, and fault-tolerance. That’s all we want to mention at this point because we want to rather talk about something more interesting and more important: the stream-table duality.
  18. Streams of data are everywhere in your company
  19. But databases and tables, too, are everywhere. There’s a reason why we have been leveraging databases and tables for decades!
  20. Interactive Queries is a feature of Streams that lets you directly access the latest processing results from other applications. For many use cases, this means you no longer need to operate and interface with external database that run next to your applications.
  21. We might want to predict which of our customers that are currently in transit to Europe are at risk of missing their connecting flights.
  22. Also: you want to do such joins at scale (>= 1M msg/s)!
  23. Most use cases for stream processing in practice require both Streams + Tables Kafka ships with first-class support for Streams + Tables. Benefits include: simplified architectures, less moving pieces, less Do-It-Yourself work
  24. And yes, all this does work at scale, too!
  25. DSL “Just works” DSL abstracts state stores away from you Stateful operations include count(), reduce(), aggregate(), … Processor API You have direct access to state stores Very flexible but more manual work for you Streams supports another built-in storage engine: an in-memory store
  26. An application’s state is automatically migrated from a failed client machine to a working client machine – thanks to being able to restore state from “incremental” backups that are performed continuously in the background.
  27. Remember the stream-table duality we talked about earlier? A state store is also a kind of table, so we can and do log any changes to the various state stores into Kafka (“changelog streams”). Think: “Incremental backups” of your state.
  28. Now if one of the application instances is taken down (e.g. maintenance, reducing the capacity of the app to save operating costs) / fails (e.g. machine crash), then we can restore the app instance’s local state on the remaining live instances from the incremental backups. What really happens here is that Kafka Streams will migrate any stream task(s) from the terminated instance to the remaining live instances, and each stream task has its own state store(s), assigned partitions, and so on. Note that, in practice, more than just one app instances will typically take over the work – i.e. stream tasks – from the terminated instance. The diagram above is, as highlighted, a simplification.
  29. Drawbacks: increased latency, coupling of scalability and availability of your app with that of the DB, …
  30. We kinda “lift” the database into your application, where it can do highly performant local lookups.
  31. And again, elasiticity/scalability is provided thanks to being able to backup/restore/migrate tasks + state across running instances.
  32. Why is this so useful? Because, for many use cases, you no longer need an external database/system to handover data between your stream processing application and other applications that are running downstream.
  33. Essential feature to allow for correct *Re-Processing* of your data (e.g. to fix a production bug)!
  34. This is very common in practice, not a rare corner case! We want control over how out-of-order data is handled, and this handling must be *efficient*. School example: if you arrived 5 minutes late, still ok for the teacher – but even later: warning! Example: We process data in 5-minute windows, e.g. compute statistics - Option A: When event arrives 1 minute late: *update* the original result! - Option B: When event arrives 2 hours late: *discard* it!