
Friends don't let friends do dual writes: Outbox pattern with OpenShift Streams for Apache Kafka and Debezium | DevNation Tech Talk

Dual writes are a common source of issues in distributed event-driven applications. A dual write occurs when an application has to change data in two different systems - for instance, when an application needs to persist data in the database and send a Kafka message to notify other systems. If one of these two operations fails, you can end up with inconsistent data, which can be hard to detect and fix.

OpenShift Streams for Apache Kafka is Red Hat's fully hosted and managed Apache Kafka service targeting development teams that want to incorporate streaming data and scalable messaging in their applications, without the burden of setting up and maintaining a Kafka cluster infrastructure. Debezium is an open source distributed platform for change data capture. Built on top of Apache Kafka, it allows applications to react to inserts, updates, and deletes in your databases.

In this session you will learn how you can leverage OpenShift Streams for Apache Kafka and Debezium to avoid the dual write issue in an event-driven application using the outbox pattern. More specifically, we will show you how to:

Provision a Kafka cluster on OpenShift Streams for Apache Kafka.
Deploy and configure Debezium to use OpenShift Streams for Apache Kafka.
Refactor an application to leverage Debezium and OpenShift Streams for Apache Kafka to avoid the dual write problem.

  1. Outbox Pattern with OpenShift Streams for Apache Kafka and Debezium (Bernard Tison - Red Hat)
  2. Apache Kafka
     ● Open-source distributed event streaming platform
     ● Horizontally scalable, fault-tolerant commit log
     ● Use cases:
       ○ Streaming ETL
       ○ Real-time analytics
       ○ Distributed event-driven applications
       ○ Edge and hybrid scenarios
     ● Rich ecosystem:
       ○ Kafka Connect
       ○ Kafka Streams
       ○ MirrorMaker
       ○ Schema Registry
     https://red.ht/TryKafka
  3. Apache Kafka
     ● Messages are sent to and received from a topic
       ○ Topics are split into one or more partitions (aka shards)
       ○ All actual work is done at the partition level; the topic is just a virtual object
     ● Each message is written to exactly one selected partition
       ○ Partitioning is usually done based on the message key
       ○ Message ordering within a partition is fixed
     ● Retention
       ○ Based on size / message age
       ○ Compacted based on message key
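The key-to-partition routing described above can be sketched as follows. This is a simplified stand-in (Kafka's default partitioner actually uses murmur2 hashing of the serialized key); the function name and the use of MD5 are illustrative, not Kafka's real implementation:

```python
import hashlib

def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition deterministically.

    Simplified stand-in for Kafka's murmur2-based default partitioner:
    hash the key, then take the result modulo the partition count.
    """
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# All messages with the same key land on the same partition,
# which is what gives Kafka its per-key ordering guarantee.
assert partition_for(b"order-42", 3) == partition_for(b"order-42", 3)
```

This is why the slide says ordering is fixed only *within* a partition: two messages with different keys may go to different partitions, and Kafka makes no ordering promise across them.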
  4. Red Hat OpenShift Streams for Apache Kafka
     ● Streamlined developer experience: a curated solution with a developer-first, consistent experience
     ● Delivered as a service, managed by Red Hat SRE: 24x7 global support and a 99.95% service-level agreement (SLA)
     ● Real-time streaming data broker: a dedicated Apache Kafka cluster delivered as a service in the cloud and location of choice
       ○ Access to Kafka brokers, topics, and partitions
       ○ Configuration management
       ○ Metrics and monitoring
       ○ UI / CLI / API / service bindings
       ○ Integrated identity and access management
     [Architecture diagram: hosted and managed Kafka cluster (brokers, topics) with a 99.95% SLA, surfaced through UI, CLI, API, and service bindings]
  5. Streams for Apache Kafka Development Preview
     ● Current status: Development Preview
     ● GA towards the end of the year
     ● Try it for free!
     [Flow: sign in at cloud.redhat.com → RHOSAK developers page → spin up a Kafka cluster → user gets access to the Kafka UI → develop and deploy Kafka-based applications]
  6. Kafka Connect
     ● Open-source tool to reliably and scalably stream data between Kafka and other systems
     ● Source and sink connectors
  7. Debezium
     ● Open-source Change Data Capture (CDC) platform
     ● CDC captures row-level changes to database tables and passes corresponding change events to a data streaming bus
     ● Transaction-log-based CDC:
       ○ All data changes are captured
       ○ No polling delay or overhead
       ○ Transparent to writing applications and models
       ○ Can capture deletes
       ○ Can capture old record state and further metadata
     ● Supported databases: MySQL, PostgreSQL, SQL Server, DB2, Oracle
     ● Deployed as a Kafka Connect source connector
     ● www.debezium.io
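A Debezium source connector is registered with Kafka Connect as a JSON configuration. The sketch below shows the general shape for the PostgreSQL connector; the connector name, hostnames, credentials, and table name are illustrative placeholders, not values from this talk's demo:

```json
{
  "name": "outbox-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "postgres",
    "database.port": "5432",
    "database.user": "app",
    "database.password": "app-secret",
    "database.dbname": "orders",
    "database.server.name": "orders-db",
    "table.include.list": "public.outbox",
    "transforms": "outbox",
    "transforms.outbox.type": "io.debezium.transforms.outbox.EventRouter"
  }
}
```

The `EventRouter` single message transform is Debezium's built-in support for the outbox pattern covered later in this deck: it reads rows captured from the outbox table and re-routes them to Kafka topics instead of emitting raw change events.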
  8. Debezium use cases
     ● Data replication
       ○ Replicate data to another database
       ○ Feed an analytics system, data lake, or DWH
     ● Microservices
       ○ Excellent fit for microservices architectures:
         ■ Propagate data between services without coupling
         ■ Each service keeps optimized views locally
     ● Other use cases
       ○ Auditing / historization
       ○ Updating / invalidating caches
       ○ Enabling full-text search
       ○ Updating CQRS read models
  9. Debezium Data Pipeline
  10. Dual write
      ● Dual write: a system needs to update different resources within one business transaction
      ● Frequent issue in distributed event-driven applications
      ● A service needs to persist state in its local data store and notify other services of the state change
  11. Dual write
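The failure mode the dual-write slides describe can be simulated with in-memory stand-ins for the database and the Kafka producer. Everything here (the dictionaries, the `BrokerDown` exception, the order IDs) is illustrative, not code from the demo:

```python
class BrokerDown(Exception):
    pass

database = {}      # stand-in for the service's local data store
kafka_topic = []   # stand-in for the Kafka topic

def send_to_kafka(event, broker_up=True):
    if not broker_up:
        raise BrokerDown("broker unreachable")
    kafka_topic.append(event)

def place_order_dual_write(order_id, broker_up=True):
    # Write 1: persist state locally -- this commit succeeds...
    database[order_id] = {"status": "PLACED"}
    # Write 2: notify other services -- ...but this send can fail,
    # leaving the database and the topic inconsistent.
    send_to_kafka({"order_id": order_id, "status": "PLACED"}, broker_up)

place_order_dual_write("order-1", broker_up=True)   # both writes succeed
try:
    place_order_dual_write("order-2", broker_up=False)
except BrokerDown:
    pass
# order-2 now exists in the database but was never announced on the topic:
# downstream services silently miss the state change.
assert "order-2" in database
assert all(e["order_id"] != "order-2" for e in kafka_topic)
```

Retrying the send does not fully fix this either: the process can crash between the two writes, and wrapping both in one transaction is impossible because the database and the broker are separate systems - which is exactly the gap the outbox pattern closes.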
  12. Outbox Pattern
      ● Solution:
        ○ Modify only one of the resources
        ○ This drives the update of the second resource in an eventually consistent manner
      ● Outbox pattern:
        ○ The service persists the state change together with the message payload in its data store
        ○ Debezium captures changes in the outbox table and transforms them into Kafka messages
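The two writes then collapse into one local database transaction. A minimal sketch, using SQLite as a stand-in database; the outbox columns follow Debezium's event-router naming conventions, but the table layout, event type, and order IDs are illustrative:

```python
import json
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT);
    CREATE TABLE outbox (
        id            TEXT PRIMARY KEY,
        aggregatetype TEXT,   -- used by Debezium's event router to pick the topic
        aggregateid   TEXT,   -- becomes the Kafka message key
        type          TEXT,   -- event type, e.g. OrderPlaced
        payload       TEXT    -- the message body other services will consume
    );
""")

def place_order(order_id: str):
    # One local transaction covers both writes: either the state change
    # AND its outbox event are committed together, or neither is.
    with conn:
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, "PLACED"))
        conn.execute(
            "INSERT INTO outbox VALUES (?, ?, ?, ?, ?)",
            (str(uuid.uuid4()), "order", order_id, "OrderPlaced",
             json.dumps({"order_id": order_id, "status": "PLACED"})),
        )

place_order("order-1")
# No direct broker call happens here: Debezium tails the database's
# transaction log and publishes each outbox row as a Kafka message.
```

Because the Kafka message originates from a committed database row, a crash can no longer leave the data store and the topic disagreeing - the worst case is a delayed message, which is the eventual consistency the slide refers to.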
  13. Outbox Pattern with Debezium
  14. DEMO
  15. Resources
      ● GitHub repo: https://github.com/rhosak-debezium-outbox
      ● Try Kafka: https://red.ht/TryKafka
      ● RHOSAK YouTube playlist: https://www.youtube.com/playlist?list=PLf3vm0UK6HKqZ3Vi7h1Ynfbi0TpdXUr25
      ● RHOSAK getting started blog post: https://developers.redhat.com/articles/2021/07/07/getting-started-red-hat-openshift-streams-apache-kafka
      ● Debezium: https://debezium.io
