SVA discusses the opportunities and challenges they have encountered during their journey with customers, using mainframe offloading projects as an example.
What are the best practices to debug client applications (producers/consumers in general, but also Kafka Streams applications)?
3. Our Partner Technical Sales Enablement offering

Scheduled sessions
Join us for these live sessions, where our experts will guide you through sessions of different levels and will be available to answer your questions. Some examples of sessions are below:
● Confluent 101: for new starters
● Workshops
● Path to production series

On-demand
Learn the basics with a guided experience, at your own pace, with our on-demand learning paths. You will also find an ever-growing repository of more advanced presentations to dig deeper. Some examples are below:
● Confluent 101
● Confluent Use Cases
● Positioning Confluent Value
● Confluent Cloud Networking
● … and many more

Ask the Expert / Workshops
For selected partners, we'll offer additional support:
● Technical Sales workshops
● JIT coaching on spotlight opportunities
● Building a CoE inside partners by bringing people with similar interests together
● Solution discovery
● Tech Talks
● Q&A
5. Goal
Partner Tech Talks are webinars where subject matter experts from a partner present a specific use case or project. The goal of Tech Talks is to share best practices and application insights, along with inspiration, and to help you stay up to date on innovations in the Confluent ecosystem.
7. Mainframes continue to power business-critical applications

● 92% of the world's top 10 insurers
● 100% of the top 25 retailers
● 72% of the Fortune 500
● 70% of the world's top 100 banks

*Skillsoft report, October 2019
8. But they present a number of challenges

1. High, unpredictable costs
Mainframe data is expensive to access for modern, real-time applications via traditional methods (e.g. polling directly from an MQ). More requests to the mainframe lead to higher costs.

2. Legacy code
Much mainframe code is written in COBOL, a now-rare programming language. This makes updating or changing mainframe applications expensive and time-consuming.

3. Complex business logic
Many business-critical mainframe apps embed complex business logic developed over decades. Making changes to these apps is complicated and risky.

[Diagram: batch jobs and APIs move data from the mainframe to an on-prem ETL app, cloud applications, a cloud data warehouse, and databases.]
9. Get the most from your mainframes with Confluent

● Bring real-time access to mainframes: capture and continuously stream mainframe data in real time to power new applications with minimal latency.
● Accelerate application development times: equip your developers to build state-of-the-art, cloud-native applications with instant access to ready-to-use mainframe data.
● Increase the ROI of your IBM zSystem: redirect requests away from mainframes and achieve a significant reduction in MIPS and CHINIT consumption costs.
● Future-proof your architecture: pave an incremental, risk-free path towards mainframe migration, and avoid disrupting existing mission-critical applications.
10. Bring real-time access to mainframes

Capture and continuously stream mainframe data in real time. Break down data silos and enable the use of mainframe data for real-time applications, without disrupting existing workloads.

[Diagram: the mainframe, on-premises databases, and a cloud data warehouse feed use cases such as a fraud prevention engine, in-session web or app personalization, real-time analytics, customer service enablement, and inventory management.]
12. Mainframe “Crash” Course

zIIP (IBM z Integrated Information Processor):
● zIIP engines always run at the full speed of the processor and "do not count" in software pricing calculations for eligible workloads (specifically Java).
● MQ/CDC workloads are zIIP eligible.
● Move qualified workloads via the Confluent MQ connector, run locally in zIIP space.
13. Unlocking Mainframe Data via MQ

[Diagram: on z/OS, CICS, IMS, VSAM, and legacy apps publish to MQ; the MQ connector runs in zIIP space.]

● Publish to Confluent to improve data reliability, accessibility, and access to cloud services
● No changes to the existing mainframe applications
● Greatly reduce the MQ-related Channel Initiator (CHINIT) usage needed to move data between the mainframe and cloud
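As a sketch, a Connect worker could be pointed at a mainframe queue with a configuration along these lines. The property names follow the Confluent IBM MQ Source Connector; all hostnames, queue names, and topic names below are placeholders, not values from this project:

```json
{
  "name": "mq-source-mainframe",
  "config": {
    "connector.class": "io.confluent.connect.ibm.mq.IbmMQSourceConnector",
    "tasks.max": "1",
    "mq.hostname": "zos.example.internal",
    "mq.port": "1414",
    "mq.queue.manager": "QM1",
    "mq.channel": "DEV.APP.SVRCONN",
    "jms.destination.name": "ORDERS.QUEUE",
    "jms.destination.type": "queue",
    "kafka.topic": "mainframe.orders"
  }
}
```

Running the connector on z/OS itself (rather than pulling from off-platform) is what keeps the CHINIT work zIIP-eligible.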
14. IBM MQ Source / Sink on z/OS Premium Connectors

Allow customers to cost-effectively, quickly, and reliably move data between mainframes and Confluent:
● Reduce the compute and networking requirements that add cost and complexity, so customers can cost-effectively run their Connect workloads on z/OS
● Reduce data infrastructure TCO by significantly bringing down compute (MIPS) and networking costs on mainframes
● Enhance data accessibility, portability, and interoperability by integrating mainframes with Confluent and unlocking mainframe data for other apps and data systems
● Improve speed, latency, and concurrency by moving from network transfer to in-memory transfer
15. Unlocking Mainframe Data via DB2 & CDC

[Diagram: on z/OS, CICS, IMS, VSAM, and legacy apps write to DB2; the CDC connector runs in zIIP space.]

● Publish to Confluent to improve data reliability, accessibility, and access to cloud services
● No changes to the existing mainframe applications
● Many different CDC tools are available: IBM IIDR, Oracle GoldenGate, Informatica, Qlik, tcVision, etc.
17. Customers trust Confluent to connect to IBM mainframes

Saved on costs, stayed compliant, and reimagined customer experiences (case study):
“… rescue data off of the mainframe, … saving RBC fixed infrastructure costs (OPEX). RBC stayed compliant with bank regulations and business logic, and is now able to create new applications using the same event-based architecture.”

Reduced demand on mainframes and accelerated the delivery of new solutions (case study):
“… our mainframe systems represent a significant component of our budget... The UDP platform [built with Confluent] enabled us to lower costs by offloading work to the forward cache and reducing demand on our mainframe systems.”

Built a foundation for next-gen applications to drive digital transformation (online webinar):
“… plays a critical role to transform from monolithic, mainframe-based ecosystem to microservices based ecosystem and enables a low-latency data pipeline to drive digital transformation.”
24. Problem & Requirements

Problem
• Mainframe:
  • Large data server, computing up to billions of transactions per day
  • Has dominated data centers over the last decades
  • "Never change a running system"
  • Long survival time in companies
  • A lot of legacy jobs and patterns
  • Very different from modern databases
• Customer situation:
  • Logistics
  • Uneven workloads (e.g. Christmas, Black Friday)
  • Sudden workload increases (e.g. unplanned sales events)
  • Additional hardware needed to match peak workload

06.06.2023 — Confluent Partner Tech Talk Q2
25. Problem & Requirements

Requirements
• Cloud based
• Flexible scaling
• Ability to handle an increasing number of use cases
• Enterprise support
• Mainframe:
  • No replacement possible
  • No adaptation of legacy jobs possible
  • Reduce SQL interaction
• Stream processing:
  • Enablement of real-time use cases
  • Kafka Streams
  • ksqlDB
29. CDC & Kafka Streams

Kafka Streams
• Higher level of abstraction than plain producers/consumers
• Read-process-write pattern
• A Java library, not a framework
• Applications can run in VMs, on bare metal, or on Kubernetes
• Handling state:
  • State for stateful operations is stored locally in RocksDB
  • Backed by a changelog topic in Kafka for fault tolerance
• Topology:
  • A directed acyclic graph (DAG) representing the data flow
  • Records flow through the topology one after the other
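The state-handling bullets can be illustrated with a stdlib-only sketch (not the real Kafka Streams API): a per-key count lives in a local map standing in for RocksDB, every update is appended to a changelog list standing in for the changelog topic, and the store can be rebuilt by replaying that changelog, which is how Kafka Streams recovers state when a task moves to another instance:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of Kafka Streams-style state handling: a local store (RocksDB
// stand-in) plus a changelog (changelog-topic stand-in) for fault tolerance.
class StateSketch {
    final Map<String, Long> localStore = new HashMap<>();              // local state
    final List<Map.Entry<String, Long>> changelog = new ArrayList<>(); // recovery log

    // Process one record: update the count and log the new value.
    void process(String key) {
        long updated = localStore.merge(key, 1L, Long::sum);
        changelog.add(Map.entry(key, updated));
    }

    // Recovery: rebuild the store by replaying the changelog (last write wins).
    static Map<String, Long> restore(List<Map.Entry<String, Long>> log) {
        Map<String, Long> rebuilt = new HashMap<>();
        for (Map.Entry<String, Long> e : log) {
            rebuilt.put(e.getKey(), e.getValue());
        }
        return rebuilt;
    }
}
```

In the real library the same guarantee comes for free: any state store declared in the topology is changelogged to Kafka unless logging is explicitly disabled.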
31. Solution

Problems to be solved
• Legacy jobs cannot be touched
  • Legacy patterns such as deleting all data from a table and re-inserting it from scratch for just a few changes, often done for masterdata updates
  • Instead of a few updates, we see a lot of deletes and inserts in the transaction log
  🡪 A flood of unnecessary events to be processed by all clients
  🡪 What has actually changed?
• CDC / IIDR problems
  • Difference between deleting data and truncating/dropping tables
  • Usage of CHAR columns instead of VARCHAR ("abc123   ")
  • Correct timestamp formatting
  • Custom transformations of single messages are not possible (they would mean losing enterprise support)
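The "what has actually changed?" question can be answered by diffing table snapshots instead of forwarding every delete/insert from a full reload. A stdlib-only sketch (the record type and method names are made up for illustration): compare the rows before the reload with the re-inserted rows and emit only the real changes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Sketch: collapse a "delete everything, re-insert everything" masterdata
// reload into the actual per-key changes. All names are illustrative.
class ReloadDiff {
    record Change(String op, String key, String value) {}

    static List<Change> diff(Map<String, String> before, Map<String, String> after) {
        List<Change> changes = new ArrayList<>();
        for (var e : after.entrySet()) {
            String old = before.get(e.getKey());
            if (old == null) {
                changes.add(new Change("insert", e.getKey(), e.getValue()));
            } else if (!Objects.equals(old, e.getValue())) {
                changes.add(new Change("update", e.getKey(), e.getValue()));
            } // unchanged rows produce no event at all
        }
        for (String key : before.keySet()) {
            if (!after.containsKey(key)) {
                changes.add(new Change("delete", key, null)); // truly removed rows
            }
        }
        return changes;
    }
}
```

The trade-off is that the diffing step itself needs state: it has to hold the previous snapshot of the table, which foreshadows the state-size problems discussed in the lessons learned.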
32. Solution

Solution approach
• Extract data from the mainframe: IBM InfoSphere Data Replication (IIDR) is used as the CDC tool; IIDR clients read from DB2, with the tables to extract and the topics to write into defined in the tool.
• Landing zone: a Kafka Streams application, necessary to clean and deduplicate the data.
• Business logic: depending on the business logic, stream processing is needed; data is consumed via custom consumers or sink connectors.
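A stdlib-only sketch of what such a landing-zone step might do (all names are illustrative, not a real API): strip the trailing blanks that fixed-width CHAR columns carry, and drop any record whose cleaned value is identical to the last one seen for its key:

```java
import java.util.HashMap;
import java.util.Map;

// Landing-zone cleaning sketch: CHAR columns arrive blank-padded ("abc123   "),
// and a record is only forwarded when its cleaned value actually changed.
class LandingZone {
    private final Map<String, String> lastSeen = new HashMap<>();

    // Strip the trailing blanks a fixed-width CHAR column carries.
    static String trimChar(String raw) {
        return raw == null ? null : raw.stripTrailing();
    }

    // Returns the cleaned value if it is new for this key, else null (drop).
    String dedupe(String key, String rawValue) {
        String cleaned = trimChar(rawValue);
        String previous = lastSeen.put(key, cleaned);
        return cleaned.equals(previous) ? null : cleaned;
    }
}
```

In the real application the `lastSeen` map would be a Kafka Streams state store, so the deduplication survives restarts and rebalances.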
36. Lessons Learned

The good, the bad and the ugly

CDC
✔ Enables us to significantly speed up use cases
✔ React to changes instead of batches
− Depending on the business logic, state is needed; row-by-row processing doesn't always work
− Foreign keys don't match Kafka's ordering guarantees
− Nested data structures
− Data cleansing

Kafka Streams
✔ Real-time data processing
✔ Handles complex business logic
✔ Scales to handle large volumes of data
✔ Multiple joins available, including foreign-key joins between KTables
− Foreign-key joins need a "full-table scan"

CDC + Kafka Streams
− Anti-patterns like the previously described masterdata updates
− Really depends on the data model: heavily normalized data models need complex topologies to handle the business logic
− Worst case: foreign-key joins with a lot of state
− Handling TTL
− Custom transformers are often needed to meet business-logic requirements
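Why foreign-key joins accumulate state can be shown with a stdlib-only sketch (illustrative, not the Kafka Streams implementation): both sides of the join have to be materialized, and an update on the referenced side forces a pass over every referencing row, which is the "full-table scan" noted above:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of foreign-key join state: orders reference a customer by FK.
// Both sides are held in memory, like the materialized KTables in a
// Kafka Streams foreign-key join.
class FkJoinSketch {
    final Map<String, String> orders = new HashMap<>();    // orderId -> customerId (FK)
    final Map<String, String> customers = new HashMap<>(); // customerId -> name

    // A customer update must re-emit every order that references it:
    // the more orders per customer, the more work and state per update.
    List<String> updateCustomer(String customerId, String name) {
        customers.put(customerId, name);
        List<String> joined = new ArrayList<>();
        for (var e : orders.entrySet()) {
            if (customerId.equals(e.getValue())) {
                joined.add(e.getKey() + ":" + name);
            }
        }
        return joined;
    }
}
```

This is exactly the worst case called out above: a highly normalized model with hot foreign keys turns every masterdata update into a fan-out over large materialized state.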
37. Lessons Learned

What would we do differently? Get rid of anti-patterns first:
• Adapt legacy jobs first
• Create business objects instead of normalized data (e.g. with the "outbox pattern")
• Get rid of foreign keys
• Move complex and often-reused join logic to the database, especially foreign-key joins
• Define application requirements as early as possible to detect anti-patterns
• Use Kafka Connect instead of IIDR
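The outbox idea above can be sketched as follows (stdlib-only, with illustrative names): in the same transaction that updates the normalized tables, the application also writes one denormalized business object to an outbox table; a relay publishes from there, so downstream consumers never have to re-join the normalized rows:

```java
import java.util.ArrayList;
import java.util.List;

// Outbox-pattern sketch: the normalized writes and the denormalized outbox
// event succeed or fail together, so the event stream matches the database.
class OutboxSketch {
    final List<String> customerTable = new ArrayList<>();
    final List<String> orderTable = new ArrayList<>();
    final List<String> outbox = new ArrayList<>(); // a relay publishes from here

    // One "transaction": write the normalized rows plus one business object.
    void placeOrder(String customerId, String orderId, String item) {
        customerTable.add(customerId);
        orderTable.add(orderId);
        // One self-contained event instead of several normalized change rows.
        outbox.add("{\"order\":\"" + orderId + "\",\"customer\":\"" + customerId
                + "\",\"item\":\"" + item + "\"}");
    }
}
```

With events shaped like this at the source, the complex join topologies and foreign-key state from the previous slides largely disappear from the streaming layer.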