Testing SMTs? Testcontainers to the Rescue! with Fábio Sequeira & Mafalda Santos

Testing SMTs?
Testcontainers to the rescue!
Fábio Sequeira | Mafalda Santos
Marionete
© 2023 Marionete Limited
• Kafka Connect is a tool for scalable, reliable data streaming between Apache Kafka and other data systems.
• Kafka connectors are ready-to-use components for importing and exporting data between Kafka topics and external systems (e.g. databases).
Testing SMTs
Kafka Connect & Connectors
[Diagram: Data Source → Source Connector → Kafka → Sink Connector → Data Sink]
• Single Message Transforms (SMTs) are used to transform message values and keys.
• Just like the connectors, there are ready-made and easy-to-use SMTs available.
Testing SMTs
SMTs
[Diagram: Data Source → Source Connector → SMT → Kafka → SMT → Sink Connector → Data Sink]
There are plenty of SMTs available, but sometimes they are not enough.
Testing SMTs
We need to build CUSTOM SMTs!
Testing SMTs
SMTs are relatively easy to build…
Testing SMTs
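As a rough illustration of why the core of an SMT tends to be small, the sketch below models a transform in plain Java as a function from one record value to another. It deliberately does not use the Kafka Connect API (a real SMT implements the Transformation interface and works on Connect records); the class name MaskFieldsModel, the "****" mask, and the field names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Plain-Java model of a single-message transform; NOT the Kafka Connect API. */
final class MaskFieldsModel {

    private final Set<String> targetFields;

    MaskFieldsModel(Set<String> targetFields) {
        this.targetFields = Set.copyOf(targetFields);
    }

    /** Returns a copy of the value with every present target field replaced by a mask. */
    Map<String, Object> apply(Map<String, Object> value) {
        Map<String, Object> out = new LinkedHashMap<>(value);
        for (String field : targetFields) {
            // Fields that are absent (or null) are left alone.
            out.computeIfPresent(field, (name, old) -> "****");
        }
        return out;
    }

    public static void main(String[] args) {
        MaskFieldsModel smt = new MaskFieldsModel(Set.of("field1"));
        Map<String, Object> in = new LinkedHashMap<>();
        in.put("field1", "secret");
        in.put("field2", 42);
        System.out.println(smt.apply(in));
    }
}
```

The transform returns a new map rather than mutating its input, which mirrors how an SMT produces a new record instead of editing the one it received.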
Testing SMTs
Unit Tests Limitations
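[Diagram: Data Source → Kafka Connect (Connector Instance → SMTs → Converter) → Kafka Cluster, with data moving between the Kafka Connect stages as Connect Records. Unit tests cover only the SMTs; the other parts of the pipeline are marked "?".]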
Testing SMTs
Unit Tests Limitations
• Hard to understand how the Connector reads the value types coming from a DB:
• Often got data in different formats or patterns than we expected
• Different connectors could “read” data differently
• Certain DB-specific types are very difficult to define in unit tests:
• SQL Server types: Datetime2? Datetimeoffset?
• Difficult to map/identify specific topic schema fields:
• Protobuf schemas: optional/oneof fields
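The first limitation can be made concrete with a small sketch. It assumes, purely for illustration, that the same database timestamp may reach the SMT as a java.util.Date, as epoch milliseconds in a Long, or as an ISO-8601 string with an offset; which representation a given connector actually produces for a type such as datetime2 or datetimeoffset is exactly what a unit test cannot tell you. The class and method names are hypothetical.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.util.Date;

/** Normalizes several plausible runtime representations of a timestamp. */
final class TimestampNormalizer {

    private TimestampNormalizer() { }

    /** Converts a raw field value to an Instant, failing loudly on unknown types. */
    static Instant toInstant(Object raw) {
        if (raw instanceof Date) {
            return ((Date) raw).toInstant();
        }
        if (raw instanceof Long) {
            // Assumption: a bare long is milliseconds since the Unix epoch.
            return Instant.ofEpochMilli((Long) raw);
        }
        if (raw instanceof String) {
            // Assumption: strings are ISO-8601 with an explicit offset.
            return OffsetDateTime.parse((String) raw).toInstant();
        }
        throw new IllegalArgumentException(
            "Unsupported timestamp representation: "
                + (raw == null ? "null" : raw.getClass().getName()));
    }

    public static void main(String[] args) {
        // Three representations of the same moment.
        System.out.println(toInstant(0L));
        System.out.println(toInstant(new Date(0L)));
        System.out.println(toInstant("1970-01-01T01:00:00+01:00"));
    }
}
```

A unit test has to guess which of these branches matters; an integration test against the real connector shows which one is actually taken.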
How can we complement SMT unit tests with a more robust and effective kind of test?
Testing SMTs
Testcontainers
to the rescue!
Testcontainers to the rescue!
Testcontainers
• Set up, configure, and run Docker containers
• Simplify integration testing
• Available in multiple languages
• (We used Java)
Testcontainers to the rescue!
Containers for SMT Testing
• The Testcontainers library has a module for Kafka
• Example:
    KafkaContainer kafka = new KafkaContainer(
        DockerImageName.parse("confluentinc/cp-kafka:7.3.2")
    );
• There are also modules for various types of databases.
• For our use cases, we used a Container object based on the Testcontainers MS SQL Server module.
• But we needed more than that…
Testcontainers to the rescue!
Custom Testcontainer Library
KafkaContainer SchemaRegistryContainer ConnectContainer MsSqlServerContainer
Testcontainers to the rescue!
SchemaRegistryContainer
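import org.testcontainers.containers.GenericContainer;
import org.testcontainers.containers.KafkaContainer;
import org.testcontainers.utility.DockerImageName;

public final class SchemaRegistryContainer extends GenericContainer<SchemaRegistryContainer> {

    public SchemaRegistryContainer(DockerImageName image) {
        super(image);
        // Schema Registry's default REST port.
        this.addExposedPorts(8081);
        this.withEnv("SCHEMA_REGISTRY_HOST_NAME", this.getHost());
    }

    // Joins the Kafka container's network and points Schema Registry at the broker.
    public SchemaRegistryContainer withKafka(KafkaContainer kafkaContainer) {
        this.withNetwork(kafkaContainer.getNetwork());
        this.withEnv(
            "SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS",
            "PLAINTEXT://" + kafkaContainer.getNetworkAliases().get(0) + ":9092"
        );
        return this;
    }
}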
Testcontainers to the rescue!
SchemaRegistryContainer (cont)
public SchemaRegistryContainer setupSchemaRegistryContainer(
    KafkaContainer kafkaContainer,
    String alias,
    String confluentVersion
) {
    return new SchemaRegistryContainer(
        DockerImageName
            .parse("confluentinc/cp-schema-registry:" + confluentVersion))
        .withNetworkAliases(alias)
        .withKafka(kafkaContainer);
}
Testcontainers to the rescue!
ConnectContainer
import org.testcontainers.containers.GenericContainer;
import org.testcontainers.containers.KafkaContainer;
import org.testcontainers.images.builder.ImageFromDockerfile;

public final class ConnectContainer extends GenericContainer<ConnectContainer> {

    public ConnectContainer(ImageFromDockerfile image) {
        super(image);
        // Kafka Connect's default REST port.
        this.addExposedPorts(8083);
        this.withEnv("CONNECT_GROUP_ID", "testcontainer-connect-group");
        this.withEnv("CONNECT_CONFIG_STORAGE_TOPIC", "connect-config");
        this.withEnv("CONNECT_OFFSET_STORAGE_TOPIC", "connect-offsets");
        this.withEnv("CONNECT_STATUS_STORAGE_TOPIC", "connect-status");
        this.withEnv("CONNECT_REST_ADVERTISED_HOST_NAME", this.getHost());
        this.withEnv(
            "CONNECT_PLUGIN_PATH",
            "/usr/share/java, /usr/share/confluent-hub-components/"
        );
        // ...
    }

    // Joins the Kafka container's network and starts only after Kafka and Schema Registry.
    public ConnectContainer withKafka(KafkaContainer kafka, SchemaRegistryContainer registry) {
        this.withNetwork(kafka.getNetwork());
        this.withEnv(
            "CONNECT_BOOTSTRAP_SERVERS",
            "PLAINTEXT://" + kafka.getNetworkAliases().get(0) + ":9092"
        );
        this.dependsOn(kafka, registry);
        return this;
    }
}
Testcontainers to the rescue!
ConnectContainer (cont)
public ConnectContainer setupConnectContainer(
    KafkaContainer kafkaContainer,
    SchemaRegistryContainer registryContainer,
    String alias,
    String confluentVersion
) {
    return new ConnectContainer(
        new ImageFromDockerfile().withDockerfileFromBuilder(
            dockerfileBuilder -> dockerfileBuilder
                .from("confluentinc/cp-kafka-connect:" + confluentVersion)
                .run(
                    "/bin/bash", "-c",
                    "confluent-hub install --no-prompt confluentinc/kafka-connect-jdbc:10.6.0")
                .build()
        )
    )
        .withKafka(kafkaContainer, registryContainer)
        .withNetworkAliases(alias);
}
Testcontainers to the rescue!
Custom Library Example Helper Methods
• KafkaContainer: createTopic()
• SchemaRegistryContainer: registerSchema()
• ConnectContainer: installSMT(), registerConnector()
• MsSqlServerContainer: runSQLFile()
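The slides do not show how these helpers are implemented. As one possible shape for a runSQLFile()-style helper, the sketch below splits a script into batches and runs them over JDBC. It assumes the script uses GO lines as batch separators; GO is understood by SQL Server tools such as sqlcmd rather than by the server itself, so it has to be stripped before statements go through a JDBC driver. SqlScriptRunner and its methods are hypothetical names.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical runSQLFile-style helper: split a script into batches, then execute them. */
final class SqlScriptRunner {

    private SqlScriptRunner() { }

    /** Splits a script on lines that contain only GO (case-insensitive). */
    static List<String> splitBatches(String script) {
        List<String> batches = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : script.split("\\R")) {
            if (line.trim().equalsIgnoreCase("GO")) {
                addIfNotBlank(batches, current);
                current.setLength(0);
            } else {
                current.append(line).append('\n');
            }
        }
        addIfNotBlank(batches, current);
        return batches;
    }

    private static void addIfNotBlank(List<String> batches, StringBuilder sql) {
        String trimmed = sql.toString().trim();
        if (!trimmed.isEmpty()) {
            batches.add(trimmed);
        }
    }

    /** Executes each batch in order on an already opened JDBC connection. */
    static void run(Connection connection, String script) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            for (String batch : splitBatches(script)) {
                statement.execute(batch);
            }
        }
    }
}
```

Keeping the splitting separate from the execution means the parsing can be checked without a database, while run() is exercised against the SQL Server container.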
Testcontainers to the rescue!
Test Setup – Source Connectors
[Diagram: KafkaContainer, SchemaRegistryContainer, ConnectContainer (installed plugins: JDBC connector, custom SMT), and MsSqlServerContainer. The custom SMT is built into a jar and installed in ConnectContainer.]
[Diagram, built up step by step: an input table in MsSqlServerContainer; output-topic in KafkaContainer; the output-topic-value schema subject in SchemaRegistryContainer; a consumer reading output-topic; and test-source-connector in ConnectContainer, registered with the following configuration.]
{
  "name": "test-source-connector",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "tasks.max": "1",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://schema:8081",
    "value.converter.auto.register.schemas": "false",
    "topic.prefix": "output-topic",
    "connection.url": "jdbc:sqlserver://mssql:1433;databaseName=TestDB;(...)",
    "table.whitelist": "Input_Table",
    // ...
    "transforms": "myCustomSMT",
    "transforms.myCustomSMT.type": "org.example.MyCustomSMT$Value",
    "transforms.myCustomSMT.targetFields": "field1,field2"
  }
}
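A configuration like the one above is submitted to Kafka Connect through its REST API, which creates a connector on POST /connectors. The sketch below shows one way a registerConnector()-style helper might do that with the JDK's HTTP client; ConnectorRegistrar, its method names, and the error handling are assumptions, not the speakers' implementation.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Hypothetical registerConnector-style helper using the Kafka Connect REST API. */
final class ConnectorRegistrar {

    private ConnectorRegistrar() { }

    /** Builds the POST /connectors request; connectorJson is {"name": ..., "config": {...}}. */
    static HttpRequest buildRequest(String connectBaseUrl, String connectorJson) {
        return HttpRequest.newBuilder()
            .uri(URI.create(connectBaseUrl + "/connectors"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(connectorJson))
            .build();
    }

    /** Sends the request and fails if Connect does not answer with a 2xx status. */
    static void register(String connectBaseUrl, String connectorJson)
            throws IOException, InterruptedException {
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(buildRequest(connectBaseUrl, connectorJson),
                  HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() / 100 != 2) {
            throw new IllegalStateException(
                "Connector registration failed: "
                    + response.statusCode() + " " + response.body());
        }
    }
}
```

With Testcontainers, the base URL seen from the test JVM would typically be built from the Connect container's getHost() and getMappedPort(8083), since port 8083 is mapped to a random host port.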
Testcontainers to the rescue!
Test Setup – Source Connectors
[Diagram: input records are inserted into the input table in MsSqlServerContainer; test-source-connector, running in ConnectContainer with the JDBC connector and custom SMT installed, writes them to output-topic in KafkaContainer using the output-topic-value schema in SchemaRegistryContainer; a consumer reads output-topic, and the retrieved records are compared with the expected records.]
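The final comparison step could look something like the sketch below. It assumes each consumed record has already been deserialized into a Map<String, Object> and that arrival order is not part of what the test checks; RecordComparison and its methods are hypothetical names, not part of the speakers' library.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Order-insensitive comparison of expected and retrieved records. */
final class RecordComparison {

    private RecordComparison() { }

    /** Counts how many times each record occurs in the list. */
    static <T> Map<T, Integer> counts(List<T> records) {
        Map<T, Integer> counts = new HashMap<>();
        for (T record : records) {
            counts.merge(record, 1, Integer::sum);
        }
        return counts;
    }

    /** True when both lists hold the same records the same number of times, in any order. */
    static <T> boolean sameRecords(List<T> expected, List<T> actual) {
        return counts(expected).equals(counts(actual));
    }
}
```

Comparing occurrence counts rather than sets means an unexpected duplicate record still fails the test.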
Testcontainers to the rescue!
Test Setup – Sink Connectors
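[Diagram: a producer writes records to input-topic in KafkaContainer using the input-topic-value schema in SchemaRegistryContainer; test-sink-connector, running in ConnectContainer with the JDBC connector and custom SMT installed, writes them to the output table in MsSqlServerContainer; the records retrieved from the output table are compared with the expected records.]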
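Because the sink connector writes to the output table asynchronously, a test usually has to wait before comparing. The sketch below polls a hypothetical fetch supplier (for example, a query against the output table) until the expected number of rows is present or a timeout expires. The class name, the 200 ms interval, and the supplier are assumptions for illustration.

```java
import java.time.Duration;
import java.util.List;
import java.util.function.Supplier;

/** Polls a data source until enough records have arrived or a timeout expires. */
final class Polling {

    private static final long POLL_INTERVAL_MILLIS = 200;

    private Polling() { }

    /**
     * Returns the most recently fetched records. The caller still has to assert on
     * them, so a timeout shows up as a failed comparison rather than a hung test.
     */
    static <T> List<T> pollUntil(Supplier<List<T>> fetch, int expectedCount, Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        List<T> latest = fetch.get();
        while (latest.size() < expectedCount && System.nanoTime() - deadline < 0) {
            try {
                Thread.sleep(POLL_INTERVAL_MILLIS);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                break;
            }
            latest = fetch.get();
        }
        return latest;
    }
}
```

The same helper works for the source-connector test by passing a supplier that drains the consumer instead of querying a table.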
Thank you!
Contacts
www.marionete.co.uk
solutions@marionete.co.uk
@marionete_io
https://www.linkedin.com/company/marionete
FÁBIO SEQUEIRA, Core Technology Specialist (fabiosequeira)
MAFALDA SANTOS, Core Technology Specialist (mafaldajsantos)
Testing SMTs? Testcontainers to the Rescue! with Fábio Sequeira & Mafalda Santos
1 de 39

Recomendados

Kubernetes 101 for Beginners por
Kubernetes 101 for BeginnersKubernetes 101 for Beginners
Kubernetes 101 for BeginnersOktay Esgul
1.1K vistas76 diapositivas
Implementing Domain Events with Kafka por
Implementing Domain Events with KafkaImplementing Domain Events with Kafka
Implementing Domain Events with KafkaAndrei Rugina
1K vistas29 diapositivas
Build cloud native solution using open source por
Build cloud native solution using open source Build cloud native solution using open source
Build cloud native solution using open source Nitesh Jadhav
99 vistas29 diapositivas
Developing Realtime Data Pipelines With Apache Kafka por
Developing Realtime Data Pipelines With Apache KafkaDeveloping Realtime Data Pipelines With Apache Kafka
Developing Realtime Data Pipelines With Apache KafkaJoe Stein
4.9K vistas40 diapositivas
Laporan Praktikum Keamanan Siber - Tugas 8 -Kelas C - Kelompok 3.pdf por
Laporan Praktikum Keamanan Siber - Tugas 8 -Kelas C - Kelompok 3.pdfLaporan Praktikum Keamanan Siber - Tugas 8 -Kelas C - Kelompok 3.pdf
Laporan Praktikum Keamanan Siber - Tugas 8 -Kelas C - Kelompok 3.pdfIGedeArieYogantaraSu
38 vistas11 diapositivas
Changing landscapes in data integration - Kafka Connect for near real-time da... por
Changing landscapes in data integration - Kafka Connect for near real-time da...Changing landscapes in data integration - Kafka Connect for near real-time da...
Changing landscapes in data integration - Kafka Connect for near real-time da...HostedbyConfluent
675 vistas17 diapositivas

Más contenido relacionado

Similar a Testing SMTs? Testcontainers to the Rescue! with Fábio Sequeira & Mafalda Santos

Docker Swarm secrets for creating great FIWARE platforms por
Docker Swarm secrets for creating great FIWARE platformsDocker Swarm secrets for creating great FIWARE platforms
Docker Swarm secrets for creating great FIWARE platformsFederico Michele Facca
300 vistas35 diapositivas
OpenStack Magnum 2016-08-04 por
OpenStack Magnum 2016-08-04OpenStack Magnum 2016-08-04
OpenStack Magnum 2016-08-04Adrian Otto
1.5K vistas43 diapositivas
Virtual Flink Forward 2020: How Streaming Helps Your Staging Environment and ... por
Virtual Flink Forward 2020: How Streaming Helps Your Staging Environment and ...Virtual Flink Forward 2020: How Streaming Helps Your Staging Environment and ...
Virtual Flink Forward 2020: How Streaming Helps Your Staging Environment and ...Flink Forward
150 vistas28 diapositivas
FIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE Platforms por
FIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE PlatformsFIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE Platforms
FIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE PlatformsFIWARE
192 vistas37 diapositivas
Real time data pipline with kafka streams por
Real time data pipline with kafka streamsReal time data pipline with kafka streams
Real time data pipline with kafka streamsYoni Farin
86 vistas25 diapositivas
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning por
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & PartitioningApache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & PartitioningGuido Schmutz
1.6K vistas110 diapositivas

Similar a Testing SMTs? Testcontainers to the Rescue! with Fábio Sequeira & Mafalda Santos(20)

OpenStack Magnum 2016-08-04 por Adrian Otto
OpenStack Magnum 2016-08-04OpenStack Magnum 2016-08-04
OpenStack Magnum 2016-08-04
Adrian Otto1.5K vistas
Virtual Flink Forward 2020: How Streaming Helps Your Staging Environment and ... por Flink Forward
Virtual Flink Forward 2020: How Streaming Helps Your Staging Environment and ...Virtual Flink Forward 2020: How Streaming Helps Your Staging Environment and ...
Virtual Flink Forward 2020: How Streaming Helps Your Staging Environment and ...
Flink Forward150 vistas
FIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE Platforms por FIWARE
FIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE PlatformsFIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE Platforms
FIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE Platforms
FIWARE192 vistas
Real time data pipline with kafka streams por Yoni Farin
Real time data pipline with kafka streamsReal time data pipline with kafka streams
Real time data pipline with kafka streams
Yoni Farin86 vistas
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning por Guido Schmutz
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & PartitioningApache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Guido Schmutz1.6K vistas
Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka por Guido Schmutz
Kafka Connect & Kafka Streams/KSQL - the ecosystem around KafkaKafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Guido Schmutz1.2K vistas
Making Apache Kafka Elastic with Apache Mesos por Joe Stein
Making Apache Kafka Elastic with Apache MesosMaking Apache Kafka Elastic with Apache Mesos
Making Apache Kafka Elastic with Apache Mesos
Joe Stein7.9K vistas
ACRN Kata Container on ACRN por Project ACRN
ACRN Kata Container on ACRNACRN Kata Container on ACRN
ACRN Kata Container on ACRN
Project ACRN415 vistas
Three Degrees of Mediation: Challenges and Lessons in building Cloud-agnostic... por Alex Maclinovsky
Three Degrees of Mediation: Challenges and Lessons in building Cloud-agnostic...Three Degrees of Mediation: Challenges and Lessons in building Cloud-agnostic...
Three Degrees of Mediation: Challenges and Lessons in building Cloud-agnostic...
Alex Maclinovsky1.6K vistas
Apache Tomcat 7 by Filip Hanik por Edgar Espina
Apache Tomcat 7 by Filip HanikApache Tomcat 7 by Filip Hanik
Apache Tomcat 7 by Filip Hanik
Edgar Espina2.1K vistas
Walking through the Spring Stack for Apache Kafka with Soby Chacko | Kafka S... por HostedbyConfluent
 Walking through the Spring Stack for Apache Kafka with Soby Chacko | Kafka S... Walking through the Spring Stack for Apache Kafka with Soby Chacko | Kafka S...
Walking through the Spring Stack for Apache Kafka with Soby Chacko | Kafka S...
HostedbyConfluent620 vistas
Monitoring Akka with Kamon 1.0 por Steffen Gebert
Monitoring Akka with Kamon 1.0Monitoring Akka with Kamon 1.0
Monitoring Akka with Kamon 1.0
Steffen Gebert2.4K vistas
Partner Development Guide for Kafka Connect por confluent
Partner Development Guide for Kafka ConnectPartner Development Guide for Kafka Connect
Partner Development Guide for Kafka Connect
confluent1.8K vistas
OSS Japan 2019 service mesh bridging Kubernetes and legacy por Steve Wong
OSS Japan 2019 service mesh bridging Kubernetes and legacyOSS Japan 2019 service mesh bridging Kubernetes and legacy
OSS Japan 2019 service mesh bridging Kubernetes and legacy
Steve Wong274 vistas
Network Design patters with Docker por Daniel Finneran
Network Design patters with DockerNetwork Design patters with Docker
Network Design patters with Docker
Daniel Finneran74 vistas
Azure: Docker Container orchestration, PaaS ( Service Farbic ) and High avail... por Alexey Bokov
Azure: Docker Container orchestration, PaaS ( Service Farbic ) and High avail...Azure: Docker Container orchestration, PaaS ( Service Farbic ) and High avail...
Azure: Docker Container orchestration, PaaS ( Service Farbic ) and High avail...
Alexey Bokov1.6K vistas
9th docker meetup 2016.07.13 por Amrita Prasad
9th docker meetup 2016.07.139th docker meetup 2016.07.13
9th docker meetup 2016.07.13
Amrita Prasad285 vistas

Más de HostedbyConfluent

Build Real-time Machine Learning Apps on Generative AI with Kafka Streams por
Build Real-time Machine Learning Apps on Generative AI with Kafka StreamsBuild Real-time Machine Learning Apps on Generative AI with Kafka Streams
Build Real-time Machine Learning Apps on Generative AI with Kafka StreamsHostedbyConfluent
88 vistas26 diapositivas
When Only the Last Writer Wins We All Lose: Active-Active Geo-Replication in ... por
When Only the Last Writer Wins We All Lose: Active-Active Geo-Replication in ...When Only the Last Writer Wins We All Lose: Active-Active Geo-Replication in ...
When Only the Last Writer Wins We All Lose: Active-Active Geo-Replication in ...HostedbyConfluent
52 vistas84 diapositivas
Apache Kafka's Next-Gen Rebalance Protocol: Towards More Stable and Scalable ... por
Apache Kafka's Next-Gen Rebalance Protocol: Towards More Stable and Scalable ...Apache Kafka's Next-Gen Rebalance Protocol: Towards More Stable and Scalable ...
Apache Kafka's Next-Gen Rebalance Protocol: Towards More Stable and Scalable ...HostedbyConfluent
82 vistas97 diapositivas
Using Kafka at Scale - A Case Study of Micro Services Data Pipelines at Evern... por
Using Kafka at Scale - A Case Study of Micro Services Data Pipelines at Evern...Using Kafka at Scale - A Case Study of Micro Services Data Pipelines at Evern...
Using Kafka at Scale - A Case Study of Micro Services Data Pipelines at Evern...HostedbyConfluent
64 vistas15 diapositivas
Rule Based Asset Management Workflow Automation at Netflix por
Rule Based Asset Management Workflow Automation at NetflixRule Based Asset Management Workflow Automation at Netflix
Rule Based Asset Management Workflow Automation at NetflixHostedbyConfluent
41 vistas56 diapositivas
Scalable E-Commerce Data Pipelines with Kafka: Real-Time Analytics, Batch, ML... por
Scalable E-Commerce Data Pipelines with Kafka: Real-Time Analytics, Batch, ML...Scalable E-Commerce Data Pipelines with Kafka: Real-Time Analytics, Batch, ML...
Scalable E-Commerce Data Pipelines with Kafka: Real-Time Analytics, Batch, ML...HostedbyConfluent
71 vistas32 diapositivas

Más de HostedbyConfluent(20)

Build Real-time Machine Learning Apps on Generative AI with Kafka Streams por HostedbyConfluent
Build Real-time Machine Learning Apps on Generative AI with Kafka StreamsBuild Real-time Machine Learning Apps on Generative AI with Kafka Streams
Build Real-time Machine Learning Apps on Generative AI with Kafka Streams
HostedbyConfluent88 vistas
When Only the Last Writer Wins We All Lose: Active-Active Geo-Replication in ... por HostedbyConfluent
When Only the Last Writer Wins We All Lose: Active-Active Geo-Replication in ...When Only the Last Writer Wins We All Lose: Active-Active Geo-Replication in ...
When Only the Last Writer Wins We All Lose: Active-Active Geo-Replication in ...
HostedbyConfluent52 vistas
Apache Kafka's Next-Gen Rebalance Protocol: Towards More Stable and Scalable ... por HostedbyConfluent
Apache Kafka's Next-Gen Rebalance Protocol: Towards More Stable and Scalable ...Apache Kafka's Next-Gen Rebalance Protocol: Towards More Stable and Scalable ...
Apache Kafka's Next-Gen Rebalance Protocol: Towards More Stable and Scalable ...
HostedbyConfluent82 vistas
Using Kafka at Scale - A Case Study of Micro Services Data Pipelines at Evern... por HostedbyConfluent
Using Kafka at Scale - A Case Study of Micro Services Data Pipelines at Evern...Using Kafka at Scale - A Case Study of Micro Services Data Pipelines at Evern...
Using Kafka at Scale - A Case Study of Micro Services Data Pipelines at Evern...
HostedbyConfluent64 vistas
Rule Based Asset Management Workflow Automation at Netflix por HostedbyConfluent
Rule Based Asset Management Workflow Automation at NetflixRule Based Asset Management Workflow Automation at Netflix
Rule Based Asset Management Workflow Automation at Netflix
HostedbyConfluent41 vistas
Scalable E-Commerce Data Pipelines with Kafka: Real-Time Analytics, Batch, ML... por HostedbyConfluent
Scalable E-Commerce Data Pipelines with Kafka: Real-Time Analytics, Batch, ML...Scalable E-Commerce Data Pipelines with Kafka: Real-Time Analytics, Batch, ML...
Scalable E-Commerce Data Pipelines with Kafka: Real-Time Analytics, Batch, ML...
HostedbyConfluent71 vistas
Indeed Flex: The Story of a Revolutionary Recruitment Platform por HostedbyConfluent
Indeed Flex: The Story of a Revolutionary Recruitment PlatformIndeed Flex: The Story of a Revolutionary Recruitment Platform
Indeed Flex: The Story of a Revolutionary Recruitment Platform
HostedbyConfluent40 vistas
Forecasting Kafka Lag Issues with Machine Learning por HostedbyConfluent
Forecasting Kafka Lag Issues with Machine LearningForecasting Kafka Lag Issues with Machine Learning
Forecasting Kafka Lag Issues with Machine Learning
HostedbyConfluent31 vistas
Getting Under the Hood of Kafka Streams: Optimizing Storage Engines to Tune U... por HostedbyConfluent
Getting Under the Hood of Kafka Streams: Optimizing Storage Engines to Tune U...Getting Under the Hood of Kafka Streams: Optimizing Storage Engines to Tune U...
Getting Under the Hood of Kafka Streams: Optimizing Storage Engines to Tune U...
HostedbyConfluent42 vistas
Maximizing Real-Time Data Processing with Apache Kafka and InfluxDB: A Compre... por HostedbyConfluent
Maximizing Real-Time Data Processing with Apache Kafka and InfluxDB: A Compre...Maximizing Real-Time Data Processing with Apache Kafka and InfluxDB: A Compre...
Maximizing Real-Time Data Processing with Apache Kafka and InfluxDB: A Compre...
HostedbyConfluent45 vistas
Accelerating Path to Production for Generative AI-powered Applications por HostedbyConfluent
Accelerating Path to Production for Generative AI-powered ApplicationsAccelerating Path to Production for Generative AI-powered Applications
Accelerating Path to Production for Generative AI-powered Applications
HostedbyConfluent74 vistas
Optimize Costs and Scale Your Streaming Applications with Virtually Unlimited... por HostedbyConfluent
Optimize Costs and Scale Your Streaming Applications with Virtually Unlimited...Optimize Costs and Scale Your Streaming Applications with Virtually Unlimited...
Optimize Costs and Scale Your Streaming Applications with Virtually Unlimited...
HostedbyConfluent42 vistas
Don’t Let Degradation Bring You Down: Automatically Detect & Remediate Degrad... por HostedbyConfluent
Don’t Let Degradation Bring You Down: Automatically Detect & Remediate Degrad...Don’t Let Degradation Bring You Down: Automatically Detect & Remediate Degrad...
Don’t Let Degradation Bring You Down: Automatically Detect & Remediate Degrad...
HostedbyConfluent58 vistas
Go Big or Go Home: Approaching Kafka Replication at Scale por HostedbyConfluent
Go Big or Go Home: Approaching Kafka Replication at ScaleGo Big or Go Home: Approaching Kafka Replication at Scale
Go Big or Go Home: Approaching Kafka Replication at Scale
HostedbyConfluent39 vistas
What's in store? Part Deux; Creating Custom Queries with Kafka Streams IQv2 por HostedbyConfluent
What's in store? Part Deux; Creating Custom Queries with Kafka Streams IQv2What's in store? Part Deux; Creating Custom Queries with Kafka Streams IQv2
What's in store? Part Deux; Creating Custom Queries with Kafka Streams IQv2
HostedbyConfluent37 vistas
A Trifecta of Real-Time Applications: Apache Kafka, Flink, and Druid por HostedbyConfluent
A Trifecta of Real-Time Applications: Apache Kafka, Flink, and DruidA Trifecta of Real-Time Applications: Apache Kafka, Flink, and Druid
A Trifecta of Real-Time Applications: Apache Kafka, Flink, and Druid
HostedbyConfluent92 vistas
From Raw Data to an Interactive Data App in an Hour: Powered by Snowpark Python por HostedbyConfluent
From Raw Data to an Interactive Data App in an Hour: Powered by Snowpark PythonFrom Raw Data to an Interactive Data App in an Hour: Powered by Snowpark Python
From Raw Data to an Interactive Data App in an Hour: Powered by Snowpark Python
HostedbyConfluent86 vistas
Beyond Monoliths: Thrivent’s Lessons in Building a Modern Integration Archite... por HostedbyConfluent
Beyond Monoliths: Thrivent’s Lessons in Building a Modern Integration Archite...Beyond Monoliths: Thrivent’s Lessons in Building a Modern Integration Archite...
Beyond Monoliths: Thrivent’s Lessons in Building a Modern Integration Archite...
HostedbyConfluent66 vistas
Exactly-Once Semantics Revisited: Distributed Transactions across Flink and K... por HostedbyConfluent
Exactly-Once Semantics Revisited: Distributed Transactions across Flink and K...Exactly-Once Semantics Revisited: Distributed Transactions across Flink and K...
Exactly-Once Semantics Revisited: Distributed Transactions across Flink and K...
HostedbyConfluent82 vistas

Último

Telenity Solutions Brief por
Telenity Solutions BriefTelenity Solutions Brief
Telenity Solutions BriefMustafa Kuğu
14 vistas10 diapositivas
KubeConNA23 Recap.pdf por
KubeConNA23 Recap.pdfKubeConNA23 Recap.pdf
KubeConNA23 Recap.pdfMichaelOLeary82
24 vistas27 diapositivas
CryptoBotsAI por
CryptoBotsAICryptoBotsAI
CryptoBotsAIchandureddyvadala199
42 vistas5 diapositivas
The Power of Generative AI in Accelerating No Code Adoption.pdf por
The Power of Generative AI in Accelerating No Code Adoption.pdfThe Power of Generative AI in Accelerating No Code Adoption.pdf
The Power of Generative AI in Accelerating No Code Adoption.pdfSaeed Al Dhaheri
39 vistas18 diapositivas
Redefining the book supply chain: A glimpse into the future - Tech Forum 2023 por
Redefining the book supply chain: A glimpse into the future - Tech Forum 2023Redefining the book supply chain: A glimpse into the future - Tech Forum 2023
Redefining the book supply chain: A glimpse into the future - Tech Forum 2023BookNet Canada
44 vistas19 diapositivas
Netmera Presentation.pdf por
Netmera Presentation.pdfNetmera Presentation.pdf
Netmera Presentation.pdfMustafa Kuğu
22 vistas50 diapositivas

Último(20)

The Power of Generative AI in Accelerating No Code Adoption.pdf por Saeed Al Dhaheri
The Power of Generative AI in Accelerating No Code Adoption.pdfThe Power of Generative AI in Accelerating No Code Adoption.pdf
The Power of Generative AI in Accelerating No Code Adoption.pdf
Saeed Al Dhaheri39 vistas
Redefining the book supply chain: A glimpse into the future - Tech Forum 2023 por BookNet Canada
Redefining the book supply chain: A glimpse into the future - Tech Forum 2023Redefining the book supply chain: A glimpse into the future - Tech Forum 2023
Redefining the book supply chain: A glimpse into the future - Tech Forum 2023
BookNet Canada44 vistas
Cocktail of Environments. How to Mix Test and Development Environments and St... por Aleksandr Tarasov
Cocktail of Environments. How to Mix Test and Development Environments and St...Cocktail of Environments. How to Mix Test and Development Environments and St...
Cocktail of Environments. How to Mix Test and Development Environments and St...
Aleksandr Tarasov23 vistas
Business Analyst Series 2023 - Week 4 Session 8 por DianaGray10
Business Analyst Series 2023 -  Week 4 Session 8Business Analyst Series 2023 -  Week 4 Session 8
Business Analyst Series 2023 - Week 4 Session 8
DianaGray10145 vistas
Testing SMTs? Testcontainers to the Rescue! with Fábio Sequeira & Mafalda Santos

  • 1. Testing SMTs? Testcontainers to the rescue! Fábio Sequeira | Mafalda Santos Marionete
  • 2. Testing SMTs: Kafka Connect & Connectors. © 2023 Marionete Limited
      • Kafka Connect is a tool for scalable and reliable data transmission between Apache Kafka and other data systems.
      • Kafka Connectors are ready-to-use components for importing and exporting data between Kafka topics and external systems (e.g. databases).
      • Diagram: Data Source → Source Connector → Kafka → Sink Connector → Data Sink
  • 3. SMTs
      • Single Message Transforms (SMTs) are used to transform message values and keys.
      • Just like the connectors, there are ready-made and easy-to-use SMTs available.
      • Diagram: Data Source → Source Connector (SMT) → Kafka → Sink Connector (SMT) → Data Sink
  • 4. There are plenty of SMTs available but…
  • 5. …sometimes they are not enough.
  • 6. We need to build CUSTOM SMTs!
  • 7–8. SMTs are relatively easy to build…
  • 9. Unit Tests Limitations. Diagram components: Data Source, Kafka Cluster, Kafka Connect, Connector Instance, SMTs, Converter, Connect Record
  • 10. The same diagram, now annotated with "Unit Tests" and three question marks.
  • 11–13. Unit Tests Limitations
      • Hard to understand how the connector reads the value types coming from a DB:
          • We often got data in different formats or patterns than we expected
          • Different connectors could "read" data differently
      • Certain DB-specific types are very difficult to define in unit tests:
          • SQL Server types: datetime2? datetimeoffset?
      • Difficult to map/identify specific topic schema fields:
          • Protobuf schemas: optional/oneof fields
  • 14–15. How can we complement the SMT tests with a more robust and effective test?
  • 16. Testcontainers to the rescue!
  • 17. Testcontainers
      • Set up, configure, and run Docker containers
      • Simplify integration testing
      • Available in multiple languages (we used Java)
  • 18. Containers for SMT Testing
      • The Testcontainers library has a module for Kafka. Example:
            KafkaContainer kafka = new KafkaContainer(
                DockerImageName.parse("confluentinc/cp-kafka:7.3.2")
            );
      • There are also modules for various types of databases.
      • For our use cases, we used a container object based on the Testcontainers MS SQL Server module.
      • But we needed more than that…
  • 19. Custom Testcontainer Library: KafkaContainer, SchemaRegistryContainer, ConnectContainer, MsSqlServerContainer
  • 20. SchemaRegistryContainer
        import org.testcontainers.containers.GenericContainer;
        import org.testcontainers.containers.KafkaContainer;
        import org.testcontainers.utility.DockerImageName;

        public final class SchemaRegistryContainer
                extends GenericContainer<SchemaRegistryContainer> {

            public SchemaRegistryContainer(DockerImageName image) {
                super(image);
                // Schema Registry REST port
                this.addExposedPorts(8081);
                this.withEnv("SCHEMA_REGISTRY_HOST_NAME", this.getHost());
            }

            // Joins the Kafka container's network and points the registry at the broker
            public SchemaRegistryContainer withKafka(KafkaContainer kafkaContainer) {
                this.withNetwork(kafkaContainer.getNetwork());
                this.withEnv(
                    "SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS",
                    "PLAINTEXT://" + kafkaContainer.getNetworkAliases().get(0) + ":9092"
                );
                return this;
            }
        }
  • 21. SchemaRegistryContainer (cont)
        public SchemaRegistryContainer setupSchemaRegistryContainer(
            KafkaContainer kafkaContainer,
            String alias,
            String confluentVersion
        ) {
            return new SchemaRegistryContainer(
                    DockerImageName.parse("confluentinc/cp-schema-registry:" + confluentVersion))
                .withNetworkAliases(alias)
                .withKafka(kafkaContainer);
        }
  • 22. ConnectContainer
        import org.testcontainers.containers.GenericContainer;
        import org.testcontainers.containers.KafkaContainer;
        import org.testcontainers.images.builder.ImageFromDockerfile;

        public final class ConnectContainer extends GenericContainer<ConnectContainer> {

            public ConnectContainer(ImageFromDockerfile image) {
                super(image);
                // Kafka Connect REST port
                this.addExposedPorts(8083);
                this.withEnv("CONNECT_GROUP_ID", "testcontainer-connect-group");
                this.withEnv("CONNECT_CONFIG_STORAGE_TOPIC", "connect-config");
                this.withEnv("CONNECT_OFFSET_STORAGE_TOPIC", "connect-offsets");
                this.withEnv("CONNECT_STATUS_STORAGE_TOPIC", "connect-status");
                this.withEnv("CONNECT_REST_ADVERTISED_HOST_NAME", this.getHost());
                this.withEnv(
                    "CONNECT_PLUGIN_PATH",
                    "/usr/share/java, /usr/share/confluent-hub-components/"
                );
                // ...
            }

            // Joins the Kafka network and starts only after Kafka and the registry are up
            public ConnectContainer withKafka(KafkaContainer kafka, SchemaRegistryContainer registry) {
                this.withNetwork(kafka.getNetwork());
                this.withEnv(
                    "CONNECT_BOOTSTRAP_SERVERS",
                    "PLAINTEXT://" + kafka.getNetworkAliases().get(0) + ":9092"
                );
                this.dependsOn(kafka, registry);
                return this;
            }
        }
  • 23. ConnectContainer (cont)
        public ConnectContainer setupConnectContainer(
            KafkaContainer kafkaContainer,
            SchemaRegistryContainer registryContainer,
            String alias,
            String confluentVersion
        ) {
            return new ConnectContainer(
                    new ImageFromDockerfile().withDockerfileFromBuilder(
                        dockerfileBuilder -> dockerfileBuilder
                            .from("confluentinc/cp-kafka-connect:" + confluentVersion)
                            .run("/bin/bash", "-c",
                                 "confluent-hub install --no-prompt confluentinc/kafka-connect-jdbc:10.6.0")
                            .build()
                    )
                )
                .withKafka(kafkaContainer, registryContainer)
                .withNetworkAliases(alias);
        }
  • 24. Custom Library
      • Containers: KafkaContainer, SchemaRegistryContainer, ConnectContainer, MsSqlServerContainer
      • Example helper methods: createTopic(), registerSchema(), installSMT(), registerConnector(), runSQLFile()
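The slides name a registerConnector() helper but do not show its body. A minimal sketch of what such a helper might send, assuming it talks to the Kafka Connect REST API on the exposed port 8083 (`POST /connectors` to create a connector, `GET /connectors/{name}/status` to check it). The class and method names below are hypothetical and not part of the authors' library; only the JDK HTTP client types are used.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper: builds the HTTP requests a registerConnector() helper could send.
final class ConnectRestRequests {
    private ConnectRestRequests() {}

    // POST {base}/connectors with the connector definition ({"name": ..., "config": {...}}) as body.
    static HttpRequest registerConnector(String connectBaseUrl, String connectorJson) {
        return HttpRequest.newBuilder()
                .uri(URI.create(stripTrailingSlash(connectBaseUrl) + "/connectors"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(connectorJson))
                .build();
    }

    // GET {base}/connectors/{name}/status, useful for waiting until the connector is running.
    static HttpRequest connectorStatus(String connectBaseUrl, String connectorName) {
        return HttpRequest.newBuilder()
                .uri(URI.create(stripTrailingSlash(connectBaseUrl)
                        + "/connectors/" + connectorName + "/status"))
                .GET()
                .build();
    }

    private static String stripTrailingSlash(String url) {
        return url.endsWith("/") ? url.substring(0, url.length() - 1) : url;
    }
}
```

In a test, the base URL would typically be derived from the running container, for example `"http://" + connect.getHost() + ":" + connect.getMappedPort(8083)`, and the request sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`.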
  • 25. Test Setup – Source Connectors. Diagram: KafkaContainer, SchemaRegistryContainer, ConnectContainer (installed plugins: JDBC connector, Custom SMT), MsSqlServerContainer
  • 26. Added to the diagram: Custom SMT (generate jar)
  • 27. The Custom SMT stays in the diagram for the remaining steps
  • 28. Added to the diagram: Input table
  • 29. Added to the diagram: output-topic
  • 30. Added to the diagram: output-topic-value
  • 31. Added to the diagram: Consumer
  • 32. Added to the diagram: test-source-connector
  • 33. Test Setup – Source Connectors. Configuration of test-source-connector:
        {
          "name": "test-source-connector",
          "config": {
            "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
            "tasks.max": "1",
            "key.converter": "org.apache.kafka.connect.storage.StringConverter",
            "value.converter": "io.confluent.connect.avro.AvroConverter",
            "value.converter.schema.registry.url": "http://schema:8081",
            "value.converter.auto.register.schemas": "false",
            "topic.prefix": "output-topic",
            "connection.url": "jdbc:sqlserver://mssql:1433;databaseName=TestDB;(...)",
            "table.whitelist": "Input_Table",
            // ...
            "transforms": "myCustomSMT",
            "transforms.myCustomSMT.type": "org.example.MyCustomSMT$Value",
            "transforms.myCustomSMT.targetFields": "field1,field2"
          }
        }
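The configuration sets value.converter.auto.register.schemas to false, which suggests the value schema is registered up front (the deck lists a registerSchema() helper and an output-topic-value subject). A sketch of the request such a helper could build, assuming the Schema Registry REST endpoint `POST /subjects/{subject}/versions` with the schema passed as an escaped JSON string. The class and method names are hypothetical, and the subject is assumed to be URL-safe.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper: builds the request a registerSchema() helper could send.
final class SchemaRegistryRequests {
    private SchemaRegistryRequests() {}

    // Escapes a string so it can be embedded inside a JSON string literal.
    static String escapeJson(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) out.append(String.format("\\u%04x", (int) c));
                    else out.append(c);
            }
        }
        return out.toString();
    }

    // The registry expects the schema itself as a string field named "schema".
    static String registerSchemaBody(String schema) {
        return "{\"schema\": \"" + escapeJson(schema) + "\"}";
    }

    // POST {base}/subjects/{subject}/versions
    static HttpRequest registerSchema(String registryBaseUrl, String subject, String schema) {
        String base = registryBaseUrl.endsWith("/")
                ? registryBaseUrl.substring(0, registryBaseUrl.length() - 1)
                : registryBaseUrl;
        return HttpRequest.newBuilder()
                .uri(URI.create(base + "/subjects/" + subject + "/versions"))
                .header("Content-Type", "application/vnd.schemaregistry.v1+json")
                .POST(HttpRequest.BodyPublishers.ofString(registerSchemaBody(schema)))
                .build();
    }
}
```

For the source test in the slides, the subject would be output-topic-value and the base URL would come from the mapped port 8081 of the SchemaRegistryContainer.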
  • 34. Added to the diagram: Input records
  • 35. Final step: compare retrieved records with expected records
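The slides do not show how the consumer gathers records before the comparison. One possible shape, sketched with the JDK only: keep polling until the expected number of records has arrived or a timeout expires, then compare the result with the expected list. `RecordCollector` and `collect` are hypothetical names; in a real test the supplier would wrap a Kafka consumer's poll call, which blocks for its own timeout, so the loop would not spin.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper: gathers polled batches until enough records arrive or time runs out.
final class RecordCollector {
    private RecordCollector() {}

    static <T> List<T> collect(Supplier<List<T>> pollOnce, int expectedCount, Duration timeout) {
        long start = System.nanoTime();
        List<T> collected = new ArrayList<>();
        // Stop as soon as we have enough records, or when the timeout has elapsed.
        while (collected.size() < expectedCount
                && System.nanoTime() - start < timeout.toNanos()) {
            collected.addAll(pollOnce.get());
        }
        return collected;
    }
}
```

The returned list can then be checked against the expected records with the test framework's equality assertion. Returning whatever was collected on timeout, instead of throwing, lets the assertion message show which records were missing.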
  • 36. Test Setup – Sink Connectors. Diagram: KafkaContainer, SchemaRegistryContainer, ConnectContainer (installed plugins: JDBC connector, Custom SMT), MsSqlServerContainer, Output table, input-topic, input-topic-value, Producer, test-sink-connector. Final step: compare retrieved records with expected records.
  • 37. Thank you!
  • 38. Contacts
      • Fábio Sequeira (fabiosequeira), Core Technology Specialist
      • Mafalda Santos (mafaldajsantos), Core Technology Specialist
      • www.marionete.co.uk | solutions@marionete.co.uk | @marionete_io | https://www.linkedin.com/company/marionete