Case-Study: Building Real-Time Applications at Scale-Cyclist Crash Detection with Tomas Neubauer

Quix Streams — Kafka Summit 2023 | 1
Quix Streams
Building Real-Time Applications at Scale
Tomas Neubauer
Previously McLaren technical lead
CTO & Co-founder, Quix
Hello, nice to meet you! 👋

Racing background
Roots in real-time data processing in the most extreme, time-critical environment.
● 50,000 channels per car
● 1.5 kHz per channel
● 1,000s of real-time models and simulations

Now raise your hand if you are using…
● Kafka
● Streaming
● Python

Goal
Crash detection
[Diagram labels: Fitness app, phone-data, Crash detection, crashes]

Content
● Architecture
● ML deployment
● Streaming landscape
● How it works
● Demo - Let's build it
ML Deployment
REST API vs Streaming

Architecture: Crash detection
[Architecture diagram. Labels: phone-data, Websocket gateway, alerts, Websocket gateway, Analysis & training, SageMaker, Trained model]

ML Deployment with API
API REQUEST → WEB API → SERVICE → API RESPONSE
API request:
gX   gY   gZ   gTotal
0.5  0.3  0.1  0.9
API response:
gX   gY   gZ   gTotal  Crash
0.5  0.3  0.1  0.9     1
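
A minimal sketch of the request/response shape on this slide, in Python. The function names and the 0.8 threshold are assumptions for illustration; the threshold only stands in for the trained model, which the slides do not describe.

```python
import json

# Hypothetical cut-off standing in for the trained model (not from the talk).
CRASH_THRESHOLD_G = 0.8


def predict_crash(g_total: float) -> int:
    """Placeholder for the trained model: 1 = crash, 0 = no crash."""
    return 1 if g_total >= CRASH_THRESHOLD_G else 0


def handle_request(body: str) -> str:
    """Take a JSON request body and return the JSON response body."""
    reading = json.loads(body)
    response = dict(reading)  # echo the input columns back
    response["Crash"] = predict_crash(reading["gTotal"])
    return json.dumps(response)


request_body = json.dumps({"gX": 0.5, "gY": 0.3, "gZ": 0.1, "gTotal": 0.9})
print(handle_request(request_body))
```

Every reading pays for one full request/response round trip, which is the overhead the following slides discuss.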

Issues with REST APIs
REST API vs Streaming

Problems with REST API
● CPU overhead
● Added delay
● Requests get lost during service downtime or slow performance
[Diagram frames: API REQUEST (gX, gY, gZ, gTotal) → WEB API → SERVICE; the last frame shows two API REQUEST / WEB API / SERVICE sets]

Stream processing applications
An overview of stream processing approaches

Stream processing applications
When you are building stream processing applications with Kafka, there are two options:
1. Build an application that uses the Kafka producer and consumer APIs directly
2. Adopt a full-fledged stream processing framework (Flink, Spark Streaming, Beam, etc.)

Kafka producer and consumer APIs
● Works for simple cases such as one-message-at-a-time processing
● No external dependencies such as the JVM
● Gets very complicated when stateful processing is needed, such as calculating aggregations or joining multiple streams
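
As a rough illustration of the one-message-at-a-time case, the sketch below uses in-memory Python structures in place of real Kafka consumer and producer clients. The field names and the gForceTotal formula (sum of absolute axis values, which is consistent with the 0.5, 0.3, 0.1 → 0.9 example on the slides) are assumptions.

```python
from collections import deque

# In-memory stand-ins for Kafka topics (assumption: no broker is involved).
input_topic = deque([
    {"gForceX": 0.5, "gForceY": 0.3, "gForceZ": 0.1},
    {"gForceX": -0.2, "gForceY": 0.1, "gForceZ": 0.0},
])
output_topic = []


def transform(message: dict) -> dict:
    """Stateless step: the output depends only on the current message."""
    total = sum(abs(message[axis]) for axis in ("gForceX", "gForceY", "gForceZ"))
    return {**message, "gForceTotal": round(total, 3)}


# Consume, transform, produce: one message at a time.
while input_topic:
    msg = input_topic.popleft()
    output_topic.append(transform(msg))

print(output_topic)
```

Because `transform` keeps nothing between messages, the loop needs no more than the consume and produce calls. An aggregation or a join would add keyed state, persistence, and recovery to this loop.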

Stream processing frameworks
● Full-fledged stream processing frameworks handle stateful, more complex operations
● But this comes at the cost of increased complexity in many dimensions:
○ Java dependency
○ Deployment gets difficult because the code does not run on its own but in a server-side cluster (Flink cluster or Spark cluster)
○ Debugging is difficult
○ Performance optimization is difficult
○ It gets even worse when synchronous and asynchronous architectures are combined in one application

JAR files…
Connecting Flink to Kafka is difficult
SQL looks easy to use but…

UDFs are nasty
● Poor development experience
○ Logs are only accessible from the server; no debugging is possible
● Performance hit caused by the interface between the JVM and Python
DEBUGGING!!! 🐛🐛🐛

Is there a third way?
● Combine the Kafka API approach with a stream processing library
● Abstract from the key-value messages of the Kafka API to virtual tables
● A standalone library that runs:
○ Locally for development and debugging
○ In Docker or Kubernetes for production deployments at scale

Stateful processing with Pub & Sub client libraries
1. Messages in topic
2. Split messages into individual streams
3. Messages converted to tables
4. Messages decomposed into rows
5. Memory state updated from incoming rows/series
6. State persistence
7. State and incoming data are combined into output that is sent to the output topic
Commit offsets
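
The seven steps on this slide can be sketched in plain Python to show how much bookkeeping the application has to own. Everything here is a hypothetical stand-in: the topic is a list, the state store is a dict, and the per-rider count and maximum are an illustrative aggregation, not the one from the talk.

```python
import json
from collections import defaultdict

# (1) Messages in a topic: (offset, key, value) tuples stand in for Kafka records.
topic = [
    (0, "rider-a", {"gForceTotal": 0.2}),
    (1, "rider-b", {"gForceTotal": 0.9}),
    (2, "rider-a", {"gForceTotal": 0.4}),
]

state = defaultdict(lambda: {"count": 0, "max_g": 0.0})  # (5) in-memory state
state_store = {}       # (6) stand-in for a durable state store
output_topic = []      # (7) stand-in for the output topic
committed_offset = -1  # last offset that was fully processed

for offset, key, value in topic:
    stream_state = state[key]                      # (2) one state per stream key
    row = {"key": key, **value}                    # (3)/(4) message -> row
    stream_state["count"] += 1                     # (5) update state from the row
    stream_state["max_g"] = max(stream_state["max_g"], row["gForceTotal"])
    state_store[key] = json.dumps(stream_state)    # (6) persist the state
    output_topic.append({**row, **stream_state})   # (7) state + row -> output
    committed_offset = offset                      # commit only after all steps

print(output_topic[-1], committed_offset)
```

Committing the offset last means a restart re-reads any message whose state update or output was not completed, at the price of possible duplicates.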

Quix Streams
1. Messages in topic
2. Messages decomposed into rows available via the pandas API
3. Messages processed through a pipeline defined as pandas operations. Output streamed to the output topic.
● Automatic state management
● Automatic checkpointing
● Automatic message serialization/deserialization
How it works
Kafka + Kubernetes + Python

Our approach to stream processing
Containers: Containers running in Kubernetes, scaling hand in hand with Kafka for compute scalability.
Kafka: Handle your data reliably and efficiently in memory with Kafka, using Kafka partitions, replication, and persistence to deliver scalability and robustness.
Python: Python gives you flexibility. It lets you transform data, not just query it, from simple filtering to ML use cases like video processing.
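
A small sketch of how partitions let compute scale with the number of containers. The CRC32 hash and the round-robin assignment are simplifications chosen for illustration; real Kafka clients use their own partitioner and group assignment strategies, and the partition count and pod names are hypothetical.

```python
import zlib

NUM_PARTITIONS = 4  # hypothetical partition count for the input topic


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Map a stream key to a partition (CRC32 is a stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign(num_partitions, replicas):
    """Spread partitions round-robin over the running replicas."""
    assignment = {name: [] for name in replicas}
    for p in range(num_partitions):
        assignment[replicas[p % len(replicas)]].append(p)
    return assignment


# One container owns every partition; adding a second one halves its share.
print(assign(NUM_PARTITIONS, ["pod-0"]))
print(assign(NUM_PARTITIONS, ["pod-0", "pod-1"]))
```

Since a key always maps to the same partition, all messages of one stream reach the same replica, which keeps per-stream state local to it.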

Processing with streaming
INPUT TOPIC → SUB → APP → PUB → OUTPUT TOPIC
Input topic:
gForce X  gForce Y  gForce Z
0.5       0.3       0.1
Output topic:
gForce X  gForce Y  gForce Z  gForce Total  Crash
0.5       0.3       0.1       0.9           1

Scale
[Diagram: INPUT TOPIC → SUB → PUB → OUTPUT TOPIC, with the same input row (gForce X, Y, Z = 0.5, 0.3, 0.1) and output row (gForce Total = 0.9, Crash = 1) as on the "Processing with streaming" slide]

Fault tolerant
[Diagram: INPUT TOPIC → SUB → PUB → OUTPUT TOPIC, with the same input row (gForce X, Y, Z = 0.5, 0.3, 0.1) and output row (gForce Total = 0.9, Crash = 1) as on the "Processing with streaming" slide]
Let’s build it!
Demo
GitHub
Try Quix Streams
info@quix.io | www.quix.io
Thank you