Friends don’t let friends do dual-writes!
Introducing Change Data
Capture with Debezium
Cheng Kuan Gan
Senior Specialist Solution Architect
Red Hat APAC
CHANGE DATA CAPTURE
Agenda
● The Issue with Dual Writes
○ What's the problem?
○ Change data capture to the rescue!
● CDC Use Cases & Patterns
○ Replication
○ Audit Logs
○ Microservices
● Practical Matters
○ Deployment Topologies
○ Running on Kubernetes
○ Single Message Transforms
Common Problem
Updating multiple resources
[Diagram: the Order Service writes to the Database, then also to a Cache and a Search Index]
Friends Don't Let Friends Do Dual Writes!
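The dual-write problem can be sketched in a few lines. This is a toy illustration only: `OrderService`, `SearchIndexDown` and the `fail_on_index` flag are hypothetical names, and plain dicts stand in for the database, cache and search index. The point it shows is that three independent writes share no transaction, so a failure part-way through leaves the stores disagreeing.

```python
class SearchIndexDown(Exception):
    """Raised by the hypothetical search index when it is unreachable."""


class OrderService:
    """Toy service that writes each order to three stores, one after another."""

    def __init__(self, fail_on_index=False):
        self.database = {}
        self.cache = {}
        self.search_index = {}
        self.fail_on_index = fail_on_index

    def place_order(self, order_id, order):
        # Three independent writes: nothing ties them into one transaction.
        self.database[order_id] = order
        self.cache[order_id] = order
        if self.fail_on_index:
            raise SearchIndexDown("search index unavailable")
        self.search_index[order_id] = order


service = OrderService(fail_on_index=True)
try:
    service.place_order(1, {"item": "book", "qty": 2})
except SearchIndexDown:
    pass  # the caller sees an error, but two of the three writes already happened

# The database and cache have the order; the search index does not.
inconsistent = (1 in service.database) and (1 not in service.search_index)
```

Retrying or undoing the earlier writes by hand only moves the problem, because the retry or the undo can fail as well.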
Better Solution
Stream change events from the database
[Diagram: Order Service, with Change Data Capture emitting the event stream C | C | U | C | U | U | D to the Search Index and Cache]
C - Create
U - Update
D - Delete
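A minimal sketch of the consuming side, assuming a simplified event shape (`op`, `key`, `row`) that is made up for this example and is not the real Debezium envelope. Each derived store replays the same ordered stream, so the cache and the search index converge on the same state without the Order Service writing to them.

```python
def apply_change(store, event):
    """Apply one change event to a derived store (a plain dict here)."""
    op, key, row = event["op"], event["key"], event.get("row")
    if op in ("c", "u"):      # create or update: upsert the new row state
        store[key] = row
    elif op == "d":           # delete: drop the key if present
        store.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {op!r}")


# Hypothetical stream in the C | C | U | C | U | U | D shape from the slide.
events = [
    {"op": "c", "key": 1, "row": {"status": "NEW"}},
    {"op": "c", "key": 2, "row": {"status": "NEW"}},
    {"op": "u", "key": 1, "row": {"status": "PAID"}},
    {"op": "c", "key": 3, "row": {"status": "NEW"}},
    {"op": "u", "key": 2, "row": {"status": "PAID"}},
    {"op": "u", "key": 1, "row": {"status": "SHIPPED"}},
    {"op": "d", "key": 3},
]

cache, search_index = {}, {}
for event in events:
    # Each consumer replays the same ordered log independently.
    apply_change(cache, event)
    apply_change(search_index, event)
```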
Change Data Capture with Debezium
Debezium is an open source distributed platform for change data capture
Debezium
Change Data Capture Platform
● CDC for multiple databases
○ Based on transaction logs
○ Snapshotting, filtering, etc.
● Fully open source, very active community
● Latest version: 1.4
● Production deployments at multiple companies (e.g. WePay, JW Player, Convoy, Trivago, OYO, BlaBlaCar)
Red Hat Integration CDC
Supported Databases
● GA Connectors
○ MySQL
○ PostgreSQL
○ SQL Server
○ MongoDB
○ DB2 (Linux only)
● Developer Preview
○ Oracle 19 EE (LogMiner)
Advantages of Log-based CDC
Tailing the Transaction Logs
● All data changes are captured
● No polling delay or overhead
● Transparent to writing applications and models
● Can capture deletes
● Can capture old record state and further metadata
Log- vs Query-based CDC
                                                  Query-based   Log-based
All data changes are captured                     -             yes
No polling delay or overhead                      -             yes
Transparent to writing applications and models    -             yes
Can capture deletes and old record state          -             yes
Simple installation/configuration                 yes           -
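The difference can be sketched with a toy table and a toy log. Everything here (`poll_diff`, `write`, the `qty` column) is invented for illustration: the poller only compares what the table looks like at two points in time, while the log records every change as it happens.

```python
def poll_diff(previous, current):
    """Query-based CDC: compare two snapshots taken at polling time."""
    changes = []
    for key, row in current.items():
        if key not in previous:
            changes.append(("c", key))
        elif previous[key] != row:
            changes.append(("u", key))
    # Rows missing from `current` are not reported: a typical
    # timestamp-based polling query never returns deleted rows.
    return changes


table, log = {}, []


def write(op, key, row=None):
    """Apply a change to the table and append it to a toy transaction log."""
    if op == "d":
        del table[key]
    else:
        table[key] = row
    log.append((op, key))


snapshot_before = dict(table)
write("c", 1, {"qty": 1})
write("u", 1, {"qty": 2})   # intermediate state, overwritten before the next poll
write("u", 1, {"qty": 3})
write("c", 2, {"qty": 5})
write("d", 2)               # row created and deleted between two polls
snapshot_after = dict(table)

polled = poll_diff(snapshot_before, snapshot_after)
```

The poller reports a single create for row 1, while the log holds all five changes, including the two intermediate updates and the short-lived row 2.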
Debezium
Change Event Structure
● Key: primary key of the table
● Value: describes the change event
○ Before state
○ After state
○ Metadata
● Serialization formats
○ JSON
○ Avro
● CloudEvents can be used too
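As a rough illustration of that structure, the sketch below builds one update event by hand. The field names `before`, `after`, `source`, `op` and `ts_ms` follow the Debezium event envelope; the table, column values and the `changed_columns` helper are made up for the example, and a real event carries more metadata than shown here.

```python
import json

# Hypothetical update event for a row in a `customers` table.
key = {"id": 1004}
value = {
    "before": {"id": 1004, "first_name": "Anne", "email": "annek@example.com"},
    "after": {"id": 1004, "first_name": "Anne Marie", "email": "annek@example.com"},
    "source": {"connector": "mysql", "db": "inventory", "table": "customers"},
    "op": "u",            # c = create, u = update, d = delete
    "ts_ms": 1610000000000,
}


def changed_columns(event_value):
    """Return the names of columns whose value differs between before and after."""
    before = event_value.get("before") or {}
    after = event_value.get("after") or {}
    return sorted(col for col in after if before.get(col) != after[col])


# JSON is one of the two serialization formats named on the slide.
serialized = json.dumps({"key": key, "value": value})
roundtrip = json.loads(serialized)
```

Having both states in one event is what lets a consumer compute a diff such as `changed_columns(value)` without querying the source database.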
Single Message Transformations
Modify events before storing in Kafka
Image Source: “Penknife, Swiss Army Knife” by Emilian Robert Vicol, used under CC BY 2.0
● Lightweight single message inline transformation
● Format conversions
○ Time/date fields
○ Extract new row state
● Aggregate sharded tables to single topic
● Keep compatibility with existing consumers
● Transformation does not interact with external systems
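Two of the transformations listed above can be mimicked in plain Python to show what they do to a single message. This is not how an SMT is written (real ones are Java classes configured on the connector; Debezium ships `io.debezium.transforms.ExtractNewRecordState` and `io.debezium.transforms.ByLogicalTableRouter` for these jobs). The function names, the `orders_shard_` prefix and the sample event are assumptions for the sketch.

```python
def extract_new_row_state(envelope):
    """Return the flattened row, or None for a delete."""
    if envelope["op"] == "d":
        return None  # a consumer can treat this as a tombstone for the key
    return dict(envelope["after"])


def route_sharded_topic(topic, prefix="orders_shard_", target="orders"):
    """Map hypothetical per-shard topic names onto a single logical topic."""
    return target if topic.startswith(prefix) else topic


update = {"op": "u", "before": {"id": 1, "qty": 1}, "after": {"id": 1, "qty": 2}}
delete = {"op": "d", "before": {"id": 1, "qty": 2}, "after": None}

flat_update = extract_new_row_state(update)
flat_delete = extract_new_row_state(delete)
routed = [route_sharded_topic(t) for t in ("orders_shard_1", "orders_shard_2", "customers")]
```

Both functions look only at the message in hand, which matches the last bullet: a transformation does not call out to external systems.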
Change Data Capture Uses & Patterns
Data Replication
Zero-Code Streaming Pipelines
[Diagram: MySQL and PostgreSQL are captured by the Debezium connectors (DBZ MySQL, DBZ PG) running in Kafka Connect and written to Apache Kafka; on the sink side, Kafka Connect runs an Elasticsearch connector and a SQL connector that feed Elasticsearch and a Data Warehouse]
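"Zero-code" here means the pipeline is set up through connector configuration rather than application code. The sketch below builds the kind of JSON document that Kafka Connect accepts on its REST endpoint `POST /connectors`. The property names follow the Debezium 1.x MySQL connector (the deck mentions version 1.4); later releases renamed several of them, so check the documentation for your version. Host names, credentials, the server id and the table list are placeholders.

```python
import json


def mysql_source_connector(name, host, server_name, tables):
    """Build a Kafka Connect registration payload for a Debezium MySQL source."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.debezium.connector.mysql.MySqlConnector",
            "database.hostname": host,
            "database.port": "3306",
            "database.user": "debezium",          # placeholder credentials
            "database.password": "change-me",
            "database.server.id": "184054",       # placeholder, must be unique
            "database.server.name": server_name,  # logical name, used as topic prefix
            "table.include.list": ",".join(tables),
            "database.history.kafka.bootstrap.servers": "kafka:9092",
            "database.history.kafka.topic": f"schema-changes.{server_name}",
        },
    }


payload = mysql_source_connector(
    "inventory-connector", "mysql", "dbserver1", ["inventory.orders"]
)
body = json.dumps(payload, indent=2)  # this string would be the HTTP request body
```

A sink connector for Elasticsearch or a data warehouse is registered the same way, with its own connector class and settings.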
A Trucking Company Improves ELT Performance with Debezium
Source: Logs & Offsets: (Near) Real Time ELT with Apache Kafka + Snowflake
Low latency, zero data loss and low maintenance are key to maintaining the user experience and data democratization
● The ELT system could not scale once the company grew past 700 employees.
● Data that used to take 10-15 minutes to import now takes 1-2 hours.
● Some larger datasets see latency of 6+ hours.
Modernized ELT improved significantly with Debezium
[Figure: Zero-Code Streaming Pipelines. Source: Logs & Offsets: (Near) Real Time ELT with Apache Kafka + Snowflake]
Auditing
CDC and a bit of Kafka Streams
Source: http://bit.ly/debezium-auditlogs
[Diagram: CRM Service writes to the Source DB; DBZ in Kafka Connect streams the changes to the Apache Kafka topics "Customer Events" and "Transactions"; Kafka Streams joins them into "Enriched Customers"]
Transactions
Id     User      Use Case
tx-1   Bob       Create Customer
tx-2   Sarah     Delete Customer
tx-3   Rebecca   Update Customer
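The enrichment step is a join of change events with transaction metadata on the transaction id. The deck does this with Kafka Streams; the sketch below is only an in-memory Python stand-in for that join. The `tx_id` and `customer_id` fields and the sample customer events are assumptions, while the three transactions are the ones shown on the slide.

```python
transactions = {
    "tx-1": {"user": "Bob", "use_case": "Create Customer"},
    "tx-2": {"user": "Sarah", "use_case": "Delete Customer"},
    "tx-3": {"user": "Rebecca", "use_case": "Update Customer"},
}

# Hypothetical customer change events, each tagged with its transaction id.
customer_events = [
    {"tx_id": "tx-1", "op": "c", "customer_id": 1},
    {"tx_id": "tx-2", "op": "d", "customer_id": 7},
    {"tx_id": "tx-3", "op": "u", "customer_id": 1},
]


def enrich(events, tx_metadata):
    """Attach user and use case to each change event by transaction id."""
    for event in events:
        meta = tx_metadata.get(event["tx_id"])
        if meta is None:
            continue  # a real stream join would buffer until the metadata arrives
        yield {**event, **meta}


enriched_customers = list(enrich(customer_events, transactions))
```

Each enriched record now says who changed the customer and why, which is the audit information the raw change event lacks.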
Microservices
Microservices Data Exchange
● Propagate data between different services without coupling
● Each service keeps optimised views locally
Microservices
Outbox Pattern
Source: http://bit.ly/debezium-outbox-pattern
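The core of the outbox pattern is that the business row and the event row are written in the same local database transaction, and CDC then streams the outbox table. A minimal sketch with SQLite follows. The table layout and column names are illustrative, loosely modelled on the linked article, and SQLite stands in for whichever database the service actually uses.

```python
import json
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT, qty INTEGER)")
conn.execute(
    "CREATE TABLE outbox (id TEXT PRIMARY KEY, aggregate_type TEXT, "
    "aggregate_id TEXT, event_type TEXT, payload TEXT)"
)


def place_order(item, qty):
    """Insert the order and its outbox event in one local transaction."""
    with conn:  # commits on success, rolls back if either insert raises
        cur = conn.execute("INSERT INTO orders (item, qty) VALUES (?, ?)", (item, qty))
        order_id = cur.lastrowid
        conn.execute(
            "INSERT INTO outbox VALUES (?, ?, ?, ?, ?)",
            (str(uuid.uuid4()), "Order", str(order_id), "OrderCreated",
             json.dumps({"id": order_id, "item": item, "qty": qty})),
        )
    return order_id


order_id = place_order("book", 2)
outbox_rows = conn.execute("SELECT aggregate_id, event_type FROM outbox").fetchall()
```

Because both inserts commit or roll back together, there is never an order without its event or an event without its order, which is what dual writes cannot guarantee.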
Microservices
Mono to micro: Strangler Pattern
Photo: “Strangler vines on trees, seen on the Mount Sorrow hike” by cynren, under CC BY SA 2.0
● Extract microservice for single component(s)
● Keep write requests against running monolith
● Stream changes to extracted microservice
● Test new functionality
● Switch over, evolve schema only afterwards
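The routing side of these steps can be sketched as a small decision function. `StranglerRouter` and its two flags are hypothetical, and the sketch assumes HTTP-style requests where GET and HEAD are reads. While writes stay on the monolith, CDC keeps the extracted service's copy of the data up to date so it can already serve reads.

```python
class StranglerRouter:
    """Toy request router placed in front of the monolith and the new service."""

    READ_METHODS = {"GET", "HEAD"}

    def __init__(self, reads_migrated=False, writes_migrated=False):
        self.reads_migrated = reads_migrated
        self.writes_migrated = writes_migrated

    def route(self, method):
        """Return the name of the backend that should serve this HTTP method."""
        if method.upper() in self.READ_METHODS:
            return "microservice" if self.reads_migrated else "monolith"
        return "microservice" if self.writes_migrated else "monolith"


during_migration = StranglerRouter(reads_migrated=True)
after_switch_over = StranglerRouter(reads_migrated=True, writes_migrated=True)
```

With `during_migration`, a GET goes to the microservice and a POST to the monolith; `after_switch_over` sends both to the microservice.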
Mono to micro: Strangler Pattern
[Diagram, step 1: the monolith with its Customer data]
[Diagram, step 2: a Router sends reads and writes to the monolith (Customer) and reads to the extracted service (Customer'); CDC with a Transformation streams changes from Customer to Customer']
[Diagram, step 3: the Router sends reads and writes to both sides; CDC runs in both directions]
Demo
[Diagram: two Kafka Connect instances, one on each side of Apache Kafka]
Running on OpenShift
Getting the best cloud-native Apache Kafka running on enterprise Kubernetes
Running on OpenShift
Cloud-native Apache Kafka
● Provides:
○ Container images for Apache Kafka, Connect, ZooKeeper and MirrorMaker
○ Kubernetes Operators for managing/configuring Apache Kafka clusters, topics and users
○ Kafka Consumer, Producer and Admin clients, Kafka Streams
● Upstream Community: Strimzi
Running on OpenShift
Deployment via Operators
● YAML-based custom resource definitions for Kafka/Connect clusters, topics etc.
● Operator applies configuration
● Advantages
○ Automated deployment and scaling
○ Simplified upgrading
○ Portability across clouds
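As a rough idea of what such a custom resource contains, the sketch below assembles a Strimzi `KafkaConnector` manifest as a Python dict and prints it as JSON, which Kubernetes tooling accepts alongside YAML. The `apiVersion` shown is an assumption that depends on the Strimzi release in use, and the resource name, cluster name and connector settings are placeholders.

```python
import json


def kafka_connector_resource(name, connect_cluster, connector_class, config, tasks_max=1):
    """Build a KafkaConnector custom resource for the Strimzi operator."""
    return {
        "apiVersion": "kafka.strimzi.io/v1beta2",  # assumption: varies by Strimzi version
        "kind": "KafkaConnector",
        "metadata": {
            "name": name,
            # Label that tells the operator which Kafka Connect cluster runs it.
            "labels": {"strimzi.io/cluster": connect_cluster},
        },
        "spec": {"class": connector_class, "tasksMax": tasks_max, "config": config},
    }


manifest = kafka_connector_resource(
    name="inventory-connector",
    connect_cluster="my-connect-cluster",
    connector_class="io.debezium.connector.mysql.MySqlConnector",
    config={"database.hostname": "mysql", "database.port": "3306"},
)
manifest_json = json.dumps(manifest, indent=2)
```

The operator watches for resources of this kind and applies the configuration to the Connect cluster, so the connector is managed declaratively like any other Kubernetes object.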
Thank you
Red Hat is the world’s leading provider of enterprise open source software solutions. Award-winning support, training, and consulting services make Red Hat a trusted adviser to the Fortune 500.
linkedin.com/company/red-hat
youtube.com/user/RedHatVideos
facebook.com/redhatinc
twitter.com/RedHat

 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 

Introducing Change Data Capture with Debezium

  • 1. Friends don’t let friends do dual-writes! Introducing Change Data Capture with Debezium. Cheng Kuan Gan, Senior Specialist Solution Architect, Red Hat APAC. CHANGE DATA CAPTURE
  • 2. Agenda. The Issue with Dual Writes: What's the problem? Change data capture to the rescue! CDC Use Cases & Patterns: Replication, Audit Logs, Microservices. Practical Matters: Deployment Topologies, Running on Kubernetes, Single Message Transforms.
  • 3. Common Problem: Updating multiple resources. Diagram: Order Service, Database.
  • 4. Common Problem: Updating multiple resources. Diagram: Order Service, Database, Cache.
  • 5. Common Problem: Updating multiple resources. Diagram: Order Service, Database, Cache, Search Index.
  • 6. Common Problem: Updating multiple resources. Diagram: Order Service, Database, Cache, Search Index.
  • 7. Friends Don't Let Friends Do Dual Writes!
  • 8. Better Solution: Stream change events from the database. Diagram: Order Service.
  • 9. Better Solution: Stream change events from the database. Diagram: Order Service; Change Data Capture event stream C | C | U | C | U | U | D (C - Create, U - Update, D - Delete).
  • 10. Better Solution: Stream change events from the database. Diagram: Order Service; Change Data Capture event stream C | C | U | C | U | U | D (C - Create, U - Update, D - Delete); Search Index, Cache.
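A minimal sketch of the idea on slides 8 to 10: instead of the Order Service writing to the cache and search index itself, each downstream view is rebuilt by replaying the stream of create/update/delete events. The event shape (`op`, `key`, `after`) and the helper names `apply_change` and `replay` are assumptions made for illustration; they are not Debezium's API, and plain dictionaries stand in for the real cache and search index.

```python
from typing import Any, Dict, Iterable

# Hypothetical, simplified change event: "op" is "c" (create), "u" (update) or "d" (delete).
ChangeEvent = Dict[str, Any]


def apply_change(view: Dict[Any, dict], event: ChangeEvent) -> None:
    """Apply a single change event to a derived key/value view."""
    op = event["op"]
    if op in ("c", "u"):
        # Creates and updates both store the new row state under the row key.
        view[event["key"]] = event["after"]
    elif op == "d":
        # Deletes remove the row; a missing key is ignored so replays stay idempotent.
        view.pop(event["key"], None)
    else:
        raise ValueError(f"unknown operation: {op!r}")


def replay(events: Iterable[ChangeEvent], *views: Dict[Any, dict]) -> None:
    """Feed the same ordered event stream to every derived view."""
    for event in events:
        for view in views:
            apply_change(view, event)


events = [
    {"op": "c", "key": 1, "after": {"id": 1, "status": "NEW"}},
    {"op": "c", "key": 2, "after": {"id": 2, "status": "NEW"}},
    {"op": "u", "key": 1, "after": {"id": 1, "status": "SHIPPED"}},
    {"op": "d", "key": 2, "after": None},
]

cache: Dict[Any, dict] = {}
search_index: Dict[Any, dict] = {}
replay(events, cache, search_index)
print(cache)  # {1: {'id': 1, 'status': 'SHIPPED'}}
```

Because both views consume the same ordered stream, they converge on the same state, which is the property dual writes cannot guarantee.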
  • 11. Change Data Capture with Debezium. Debezium is an open source distributed platform for change data capture.
  • 12. Debezium: Change Data Capture Platform. ● CDC for multiple databases ○ Based on transaction logs ○ Snapshotting, filtering etc. ● Fully open-source, very active community ● Latest version: 1.4 ● Production deployments at multiple companies (e.g. WePay, JW Player, Convoy, Trivago, OYO, BlaBlaCar etc.)
  • 13. Red Hat Integration CDC: Supported Databases. ● GA Connectors ○ MySQL ○ Postgres ○ SQL Server ○ MongoDB ○ DB2 (Linux only) ● Developer Preview: ○ Oracle 19 EE (LogMiner)
  • 14. Advantages of Log-based CDC: Tailing the Transaction Logs. ● All data changes are captured ● No polling delay or overhead ● Transparent to writing applications and models ● Can capture deletes ● Can capture old record state and further metadata
  • 15. Log vs Query-based CDC. Log-based CDC captures all data changes, has no polling delay or overhead, is transparent to writing applications and models, and can capture deletes and old record state; query-based CDC offers none of these. Query-based CDC, in turn, is simpler to install and configure.
  • 16. Debezium: Change Event Structure. ● Key: PK of table ● Value: Describing the change event ○ Before state ○ After state ○ Metadata info ● Serialization formats: ○ JSON ○ Avro ● CloudEvents could be used too
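To make the key/value structure on slide 16 concrete, here is a hedged sketch of what a create event for a `customers` row might look like once deserialized from JSON. The field names `before`, `after`, `source`, `op` and `ts_ms` follow the Debezium event envelope; the table, column values and timestamp are invented for illustration, the `source` block is abbreviated, and `summarize` is a hypothetical helper. With the JSON converter and schemas enabled, this value would additionally be wrapped in a `schema`/`payload` pair, which is omitted here.

```python
import json

# Key: the primary key of the changed row (illustrative values).
event_key = {"id": 1004}

# Value: before state, after state and metadata describing the change.
event_value = {
    "before": None,  # no previous state for a create
    "after": {"id": 1004, "first_name": "Anne", "email": "annek@noanswer.org"},
    "source": {"connector": "mysql", "db": "inventory", "table": "customers"},
    "op": "c",  # c = create, u = update, d = delete, r = snapshot read
    "ts_ms": 1612137600000,
}


def summarize(key: dict, value: dict) -> str:
    """Render a one-line, human-readable description of a change event."""
    ops = {"c": "create", "u": "update", "d": "delete", "r": "read (snapshot)"}
    source = value["source"]
    return f'{ops[value["op"]]} on {source["db"]}.{source["table"]} key={json.dumps(key)}'


print(summarize(event_key, event_value))
# create on inventory.customers key={"id": 1004}
```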
  • 17. Single Message Transformations: Modify events before storing in Kafka. ● Lightweight single message inline transformation ● Format conversions ○ Time/date fields ○ Extract new row state ● Aggregate sharded tables to single topic ● Keep compatibility with existing consumers ● Transformation does not interact with external systems. Image Source: “Penknife, Swiss Army Knife” by Emilian Robert Vicol, used under CC BY 2.0
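The "extract new row state" conversion on slide 17 can be sketched as a pure function over one message. Real single message transforms are Java classes plugged into Kafka Connect (Debezium ships one named `io.debezium.transforms.ExtractNewRecordState`); the Python function below is only an illustration of the idea, with the hypothetical name `extract_new_row_state`, and it does not reproduce that transform's options or tombstone handling.

```python
from typing import Any, Dict, Optional


def extract_new_row_state(value: Optional[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Flatten a change event envelope to just the row's new state.

    Works on one message at a time and touches no external system,
    in line with the constraints listed on the slide.
    """
    if value is None:
        # Nothing to unwrap (e.g. an empty message value).
        return None
    # For deletes the envelope carries no "after" state, so this yields None.
    return value.get("after")


update_event = {
    "before": {"id": 1, "status": "NEW"},
    "after": {"id": 1, "status": "SHIPPED"},
    "op": "u",
}
delete_event = {"before": {"id": 1, "status": "SHIPPED"}, "after": None, "op": "d"}

print(extract_new_row_state(update_event))  # {'id': 1, 'status': 'SHIPPED'}
print(extract_new_row_state(delete_event))  # None
```

Consumers that only expect flat rows can keep working against the transformed topic, which is the compatibility point the slide makes.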
  • 18. Change Data Capture Uses & Patterns
  • 19. Data Replication: Zero-Code Streaming Pipelines. Diagram: MySQL, PostgreSQL, Apache Kafka.
  • 20. Data Replication: Zero-Code Streaming Pipelines. Diagram: MySQL, PostgreSQL, Apache Kafka, Kafka Connect, Kafka Connect.
  • 21. Data Replication: Zero-Code Streaming Pipelines. Diagram: MySQL, PostgreSQL, Apache Kafka, Kafka Connect, Kafka Connect, DBZ PG, DBZ MySQL.
  • 22. Data Replication: Zero-Code Streaming Pipelines. Diagram: MySQL, PostgreSQL, Apache Kafka, Kafka Connect, Kafka Connect, DBZ PG, DBZ MySQL, ES Connector, ElasticSearch.
  • 23. Data Replication: Zero-Code Streaming Pipelines. Diagram: MySQL, PostgreSQL, Apache Kafka, Kafka Connect, Kafka Connect, DBZ PG, DBZ MySQL, ES Connector, ElasticSearch, SQL Connector, Data Warehouse.
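"Zero-code" here means the pipeline is assembled from connector configuration rather than application code. As a rough sketch, the snippet below builds the JSON body one might POST to the Kafka Connect REST endpoint `/connectors` to register a Debezium MySQL source connector. The property names are those of the Debezium 1.x MySQL connector as best recalled (later major versions renamed several of them), and every value (host names, credentials, server id, topic names) is a placeholder assumption. The helper `mysql_connector_request` is hypothetical, and nothing is sent over the network.

```python
import json
from typing import Dict, List


def mysql_connector_request(name: str, hostname: str, server_name: str,
                            databases: List[str]) -> Dict[str, object]:
    """Build a Kafka Connect registration payload for a Debezium MySQL connector."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.debezium.connector.mysql.MySqlConnector",
            "database.hostname": hostname,
            "database.port": "3306",
            "database.user": "debezium",          # placeholder credentials
            "database.password": "change-me",
            "database.server.id": "184054",       # any id unique in the MySQL cluster
            "database.server.name": server_name,  # logical name, used as topic prefix
            "database.include.list": ",".join(databases),
            "database.history.kafka.bootstrap.servers": "kafka:9092",
            "database.history.kafka.topic": f"schema-changes.{server_name}",
        },
    }


payload = mysql_connector_request("inventory-connector", "mysql", "dbserver1", ["inventory"])
request_body = json.dumps(payload, indent=2)
print(request_body)
```

A sink connector (for example for Elasticsearch or a data warehouse) would be registered the same way with its own connector class and settings.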
  • 24. A Trucking Company Improves ELT Performance with Debezium. Low latency, zero data loss and low maintenance are key to maintaining the user experience and data democratization. ● The ELT system could not scale once the company grew past 700 employees. ● Data that used to take 10-15 minutes to import now takes 1-2 hours. ● Some larger datasets see latency of 6+ hours. Modernized ETL improved significantly with Debezium. Source: Logs & Offsets: (Near) Real Time ELT with Apache Kafka + Snowflake
  • 25. Data Replication: Zero-Code Streaming Pipelines. Source: Logs & Offsets: (Near) Real Time ELT with Apache Kafka + Snowflake
  • 26. Auditing: CDC and a bit of Kafka Streams. Diagram: CRM Service, Source DB, DBZ, Kafka Connect, Apache Kafka. Source: http://bit.ly/debezium-auditlogs
  • 27. Auditing: CDC and a bit of Kafka Streams. Diagram: CRM Service, Source DB, DBZ, Kafka Connect, Apache Kafka. Table (Id, User, Use Case): tx-1, Bob, Create Customer; tx-2, Sarah, Delete Customer; tx-3, Rebecca, Update Customer. Source: http://bit.ly/debezium-auditlogs
  • 28. Auditing: CDC and a bit of Kafka Streams. Diagram: CRM Service, Source DB, DBZ, Kafka Connect, Apache Kafka. Table (Id, User, Use Case): tx-1, Bob, Create Customer; tx-2, Sarah, Delete Customer; tx-3, Rebecca, Update Customer. Topics: Customer Events, Transactions. Source: http://bit.ly/debezium-auditlogs
  • 29. Auditing: CDC and a bit of Kafka Streams. Diagram: CRM Service, Source DB, DBZ, Kafka Connect, Apache Kafka. Table (Id, User, Use Case): tx-1, Bob, Create Customer; tx-2, Sarah, Delete Customer; tx-3, Rebecca, Update Customer. Topics: Customer Events, Transactions. Kafka Streams. Source: http://bit.ly/debezium-auditlogs
  • 30. Auditing: CDC and a bit of Kafka Streams. Diagram: CRM Service, Source DB, DBZ, Kafka Connect, Apache Kafka. Table (Id, User, Use Case): tx-1, Bob, Create Customer; tx-2, Sarah, Delete Customer; tx-3, Rebecca, Update Customer. Topics: Customer Events, Transactions. Kafka Streams. Output topic: Enriched Customers. Source: http://bit.ly/debezium-auditlogs
  • 31. Auditing: CDC and a bit of Kafka Streams. Source: http://bit.ly/debezium-auditlogs
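The auditing slides join customer change events with a transaction metadata table (Id, User, Use Case) to produce an enriched stream. The sketch below imitates that join in plain Python using the three rows from the slide. It assumes each change event carries a transaction id in a field called `tx_id`, which is a made-up name, and `enrich` is a hypothetical helper; the approach on the slides uses Kafka Streams rather than an in-memory dictionary lookup.

```python
from typing import Any, Dict, Iterable, Iterator, List

# Transaction metadata as shown on the slide.
transactions: List[Dict[str, str]] = [
    {"id": "tx-1", "user": "Bob", "use_case": "Create Customer"},
    {"id": "tx-2", "user": "Sarah", "use_case": "Delete Customer"},
    {"id": "tx-3", "user": "Rebecca", "use_case": "Update Customer"},
]

# Illustrative customer change events; "tx_id" is an assumed field name.
customer_events: List[Dict[str, Any]] = [
    {"tx_id": "tx-1", "op": "c", "customer_id": 1},
    {"tx_id": "tx-2", "op": "d", "customer_id": 7},
    {"tx_id": "tx-3", "op": "u", "customer_id": 1},
]


def enrich(events: Iterable[Dict[str, Any]],
           txs: Iterable[Dict[str, str]]) -> Iterator[Dict[str, Any]]:
    """Attach user and use case to each change event by transaction id."""
    tx_by_id = {tx["id"]: tx for tx in txs}
    for event in events:
        tx = tx_by_id.get(event["tx_id"])
        # Events without matching metadata are passed through with audit=None.
        audit = {"user": tx["user"], "use_case": tx["use_case"]} if tx else None
        yield {**event, "audit": audit}


enriched = list(enrich(customer_events, transactions))
for record in enriched:
    print(record)
```

The application itself only has to record who did what per transaction; the change events are enriched downstream, so the audit log needs no extra write path.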
  • 32. Microservices: Microservices Data Exchange. ● Propagate data between different services without coupling ● Each service keeps optimised views locally
  • 33. Microservices: Outbox Pattern. Source: http://bit.ly/debezium-outbox-pattern
  • 34. Microservices: Mono to micro: Strangler Pattern. ● Extract microservice for single component(s) ● Keep write requests against running monolith ● Stream changes to extracted microservice ● Test new functionality ● Switch over, evolve schema only afterwards. Photo: “Strangler vines on trees, seen on the Mount Sorrow hike” by cynren, under CC BY SA 2.0
  • 35. Mono to micro: Strangler Pattern. Diagram: Customer. Photo: “Strangler vines on trees, seen on the Mount Sorrow hike” by cynren, under CC BY SA 2.0
  • 36. Mono to micro: Strangler Pattern. Diagram: Customer, Customer’, Router, CDC, Transformation, Reads / Writes, Reads. Photo: “Strangler vines on trees, seen on the Mount Sorrow hike” by cynren, under CC BY SA 2.0
  • 37. Mono to micro: Strangler Pattern. Diagram: Customer, Router, CDC, Reads / Writes, Reads / Writes, CDC. Photo: “Strangler vines on trees, seen on the Mount Sorrow hike” by cynren, under CC BY SA 2.0
  • 40. Running on OpenShift: Getting the best cloud-native Apache Kafka running on enterprise Kubernetes
  • 41. Running on OpenShift: Cloud-native Apache Kafka. ● Provides: ○ Container images for Apache Kafka, Connect, Zookeeper and MirrorMaker ○ Kubernetes Operators for managing/configuring Apache Kafka clusters, topics and users ○ Kafka Consumer, Producer and Admin clients, Kafka Streams ● Upstream Community: Strimzi
  • 42. Running on OpenShift: Deployment via Operators. ● YAML-based custom resource definitions for Kafka/Connect clusters, topics etc. ● Operator applies configuration ● Advantages ○ Automated deployment and scaling ○ Simplified upgrading ○ Portability across clouds
  • 43. Thank you. Red Hat is the world’s leading provider of enterprise open source software solutions. Award-winning support, training, and consulting services make Red Hat a trusted adviser to the Fortune 500. linkedin.com/company/red-hat youtube.com/user/RedHatVideos facebook.com/redhatinc twitter.com/RedHat