What does a Kafka administrator need to do if they have a user who demands that message delivery be guaranteed, fast, and low cost? In this talk we walk through the architecture we created to deliver for such users. Learn about the alternatives we considered and the pros and cons of what we came up with.
In this talk, we’ll be forced to dive into broker restart and failure scenarios and the things we need to do to prevent leader elections from slowing down incoming requests. We’ll need to take care of the consumers as well, to ensure that they don’t process the same request twice. We also plan to describe our architecture with a demo of simulated requests being produced into Kafka clusters and consumers processing them while we aggressively cause failures on the Kafka clusters.
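One common way to keep consumers from processing the same request twice is to deduplicate on a request identifier. The following is a minimal sketch of that idea, not the implementation from the talk: the `Message` type, the `request_id` field, and the in-memory `seen` set are all hypothetical, and a real consumer would need a durable, bounded store for the ids.

```python
from dataclasses import dataclass, field
from typing import Callable, Set


@dataclass(frozen=True)
class Message:
    request_id: str  # hypothetical unique id attached by the producer
    payload: str


@dataclass
class DedupingHandler:
    """Wraps a processing function and skips request ids it has already handled."""

    process: Callable[[Message], None]
    # In-memory only (an assumption for this sketch); ids are lost on restart.
    seen: Set[str] = field(default_factory=set)

    def handle(self, msg: Message) -> bool:
        """Return True if the message was processed, False if it was a duplicate."""
        if msg.request_id in self.seen:
            return False
        self.process(msg)
        # Record the id only after processing succeeds, so a failure
        # during processing leads to a retry rather than a silent drop.
        self.seen.add(msg.request_id)
        return True
```

Redelivery after a broker failover or a consumer rebalance would then show up as a second `handle` call with the same `request_id`, which the handler skips.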
We hope the audience walks away with a deeper understanding of what it takes to build robust Kafka clients and how to tune them to accomplish stringent delivery guarantees.
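As a rough illustration of what tuning a client for stringent delivery guarantees can involve, the sketch below lists a few standard Kafka producer settings and checks the constraints Kafka documents between them. The values are illustrative assumptions, not the settings used in the talk, and `config_problems` is a hypothetical helper.

```python
from typing import Any, Dict, List

# Standard Kafka producer config keys; the values are illustrative assumptions.
PRODUCER_CONFIG: Dict[str, Any] = {
    "acks": "all",                                # wait for all in-sync replicas
    "enable.idempotence": True,                   # broker drops duplicate retries
    "max.in.flight.requests.per.connection": 5,   # idempotence allows at most 5
    "retries": 2147483647,
    "linger.ms": 5,
    "request.timeout.ms": 30000,
    "delivery.timeout.ms": 120000,
}


def config_problems(cfg: Dict[str, Any]) -> List[str]:
    """Return human-readable violations of documented producer constraints."""
    problems: List[str] = []
    if cfg.get("enable.idempotence"):
        if str(cfg.get("acks")) not in ("all", "-1"):
            problems.append("idempotence requires acks=all")
        if cfg.get("max.in.flight.requests.per.connection", 5) > 5:
            problems.append("idempotence requires at most 5 in-flight requests")
        if cfg.get("retries", 1) < 1:
            problems.append("idempotence requires retries > 0")
    # delivery.timeout.ms must cover batching time plus one request attempt.
    if cfg["delivery.timeout.ms"] < cfg["linger.ms"] + cfg["request.timeout.ms"]:
        problems.append("delivery.timeout.ms < linger.ms + request.timeout.ms")
    return problems
```

Lower timeouts fail faster during a leader election but give retries less room, so the right values depend on the latency budget of the workload.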
2. Agenda
Who Are We?
Rides Without Kafka
Rides With PubSub Architecture + Demo
Hardened PubSub Architecture + Demo
Future Work
3. Who Are We?
Andrey Falko
● Launched Lyft’s internal Kafka platform
● Enjoys hiking, skiing, and kayaking
● Discovered that he is a cat person
Can Cecen
● Launched MSK and now taking Lyft’s Kafka platform to the next level
● Outside coding, uses fingers to play bass guitar
● Has a dog, but she’s more like a cat. Dogcat person?
31. Kafka Operations - Restarts
Security Updates
Need to update Linux Kernel
Network, Host, and Operator Errors
Kafka Version Upgrades
Disaster Recovery
38. Fast Cluster Switch
Make switch from Kafka’s ZNodes
Reuse remediator <https://lft.to/305FyTg>
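The slide does not spell out how the switch is performed. As one possible client-side reading of a fast cluster switch, the sketch below moves to the next cluster when a send fails and stays there for later sends. Everything in it is hypothetical: `ClusterSwitchingSender`, `SendError`, and the per-cluster send callables stand in for real producers, and it does not model the ZNode-driven switch or the remediator mentioned above.

```python
from typing import Callable, List


class SendError(Exception):
    """Raised by a sender callable when its cluster cannot accept the message."""


class ClusterSwitchingSender:
    """Sends to the active cluster and switches to the next one on failure."""

    def __init__(self, senders: List[Callable[[bytes], None]]) -> None:
        if not senders:
            raise ValueError("at least one cluster sender is required")
        self._senders = senders
        self._active = 0

    @property
    def active(self) -> int:
        """Index of the cluster currently receiving traffic."""
        return self._active

    def send(self, payload: bytes) -> int:
        """Send payload; return the index of the cluster that accepted it."""
        for _ in range(len(self._senders)):
            try:
                self._senders[self._active](payload)
                return self._active
            except SendError:
                # Switch and stay switched, so later sends skip the bad cluster.
                self._active = (self._active + 1) % len(self._senders)
        raise SendError("all clusters rejected the message")
```

A sketch like this trades ordering across clusters for availability, which is part of why consumers would need their own duplicate handling.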
39. Envoy as Kafka Proxy
KIP-559 - Make it easier to intercept Kafka protocol traffic
Envoy gives us:
Observability
Fault injection and a framework to implement health checks
Service Discovery
42. Takeaways
Distributed logs are good for state machines
Take higher p50 to gain lower p99 latency
Invest in automation to support multi-cluster Kafka
43. Thank You
github.com/afalko
Andrey Falko <afalko@lyft.com>
linkedin.com/in/andrey-falko/
github.com/cancecen
Can Cecen <ccecen@lyft.com>
linkedin.com/in/yilmazcancecen
https://bit.ly/32ufowH
46. Kafka on K8s with Envoy?
Stateless web service model
Diagram: Svc 1 on Kubelet 1 and Svc 2 on Kubelet 2, each with an Envoy, plus Service Discovery
47. Kafka on K8s with Envoy?
Registration with Service Discovery: "I’m Svc 1 at 10.1.1.10", "I’m Svc 2 at 10.1.1.20"
48. Kafka on K8s with Envoy?
Request goes to the local Envoy at localhost:svc-port
49. Kafka on K8s with Envoy?
Service Discovery answers: "Svc 2 at 10.1.1.20"
50. Kafka on K8s with Envoy?
Envoy forwards the request to 10.1.1.20:svc-port
51. What about Kafka with Envoy?
Client-side can handle bootstrap
Diagram: Client on Kubelet 1 and Kafka on Kubelets, each with an Envoy, plus Service Discovery
52. What about Kafka with Envoy?
Client bootstraps through the local Envoy at localhost:9092
Service Discovery answers: "Kafka at 10.1.1.20, 10.1.1.21, 10.1.1.23"
Envoy connects to 10.1.1.21:9092
53. What about Kafka with Envoy?
Advertised Listener: 10.1.1.21:9092
54. What about Kafka with Envoy?
Cluster Metadata is passed back hop by hop to the client
55. What about Kafka with Envoy?
Model breaks: client-side Envoy bypassed
Write to topic example-0 w/ leader: 10.1.1.23:9092
56. What about Kafka with Envoy?
Model breaks: client-side Envoy bypassed
Write to topic example-0 w/ leader: 10.1.1.23:9092
MTLS
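The bypass shown on these slides follows from how Kafka clients connect: they use the bootstrap address only to fetch cluster metadata, then open connections to whatever address each partition leader advertises. The sketch below models just that address selection, using the addresses from the slides. The function names are hypothetical and no real Kafka or Envoy API is involved.

```python
from typing import Dict, List


def connection_path(bootstrap: str, leaders: Dict[str, str], partition: str) -> List[str]:
    """Return the addresses a client dials: first for metadata, then to produce."""
    metadata_conn = bootstrap           # step 1: fetch cluster metadata
    produce_conn = leaders[partition]   # step 2: dial the leader's advertised address
    return [metadata_conn, produce_conn]


def bypasses_proxy(path: List[str], proxy: str) -> bool:
    """True if any connection in the path goes somewhere other than the proxy."""
    return any(addr != proxy for addr in path)


PROXY = "localhost:9092"                     # client-side Envoy
LEADERS = {"example-0": "10.1.1.23:9092"}    # advertised listener from metadata

path = connection_path(PROXY, LEADERS, "example-0")
# The produce connection targets 10.1.1.23:9092 directly, so the
# client-side Envoy only ever sees the bootstrap request.
```

In this model the path stays on the proxy only if the advertised address itself resolves to the proxy, which is an assumption about one possible fix rather than something the slides state.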
57. What about Kafka with Envoy?
Server-side 2x connections
Diagram: Client, Envoy, and Kafka on Kubelets
Write to topic example-0 w/ leader: 10.1.1.23:9092