Apache Kafka and Apache Flink together are a winning stack for data analytics that is used by many companies across industries.
The two projects complement each other perfectly: Kafka offers a world-class log for event stream storage and transport, while Flink is a powerful system for analytics and applications on top of those event streams.
This talk will demonstrate how to use Kafka and Flink together for "unified analytics": analytics that seamlessly combine the processing of real-time data and historic data.
Using SQL as the language for our sample applications, we will walk through various scenarios for unified analytics, such as:
- Running the same query for processing real-time data from Kafka and for batch-accelerated processing of the historic data stored in Kafka.
- Writing queries that combine data in Kafka with tables in external systems (like S3).
- Switching between streams of historic data (from S3) and real-time streams in Kafka.
The audience will learn how using Kafka and Flink together makes it convenient to combine real-time and historic data.
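
As a rough illustration of the first scenario, the sketch below runs one Flink SQL aggregation twice over the same Kafka-backed table: once as a continuous streaming query, and once as a bounded batch query that stops at the latest offsets. The table name `orders`, its columns, the topic, the broker address, and the consumer group are all hypothetical and not taken from the talk. The sketch also assumes a Flink version whose Kafka SQL connector supports the `scan.bounded.mode` option.

```sql
-- Hypothetical table over a Kafka topic; the schema, topic name, and
-- broker address are assumptions made for this sketch.
CREATE TABLE orders (
  order_id   STRING,
  customer   STRING,
  amount     DOUBLE,
  order_time TIMESTAMP(3)
) WITH (
  'connector' = 'kafka',
  'topic' = 'orders',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'unified-analytics-demo',
  'scan.startup.mode' = 'earliest-offset',
  'format' = 'json'
);

-- Real-time: the query runs continuously and updates its result
-- as new events arrive in the topic.
SET 'execution.runtime-mode' = 'streaming';

SELECT customer, SUM(amount) AS total
FROM orders
GROUP BY customer;

-- Historic: the same query in batch mode. The source is made bounded
-- with a table hint, so the job reads up to the latest offsets and ends.
SET 'execution.runtime-mode' = 'batch';

SELECT customer, SUM(amount) AS total
FROM orders /*+ OPTIONS('scan.bounded.mode' = 'latest-offset') */
GROUP BY customer;
```

The query text is identical in both runs. Only the runtime mode and the boundedness of the source change, which is the property the first scenario relies on.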