SlideShare a Scribd company logo
1 of 29
Download to read offline
How much can you connect?
Agenda
● Introduction
● Why we love Connect?
● Paradigms of Connect Clusters
● Learnings
● Wishlist
Disney + Hotstar
● Leading OTT Platform
● Growing SEA presence
○ ID, SG, MY, TH (Coming Soon!)
○ Localised experience
● VoD, Live TV, Live Sports
● 25.3 Mn → Highest # of Concurrent viewers
Kafka @ D+H
● Central Nervous System
○ Analytics
○ Inter-Service Communication
○ Distributed Workflow Management
○ Alerting, Logging, Monitoring
○ User Engagement
Kafka Connect @ D+H
> 2000
Connectors
● Higher # of Sink ( ~2k)
● Few Source (20-30)
Kafka Connect @ D+H
> 2000
Connectors
● Higher # of Sink ( ~2k)
● Few Source (20-30)
10-15
Connect Clusters
● Auto-scaling based on
traffic
● Traffic + Use case
sharding
Kafka Connect @ D+H
> 2000
Connectors
● Higher # of Sink ( ~2k)
● Few Source (20-30)
10-15
Connect Clusters
● Auto-scaling based on
traffic
● Traffic + Use case
sharding
We ❤ Connect
Simple, flexible, extensible
Configuration not code
● Add/Remove pipelines
● Scale-up & down
{
"connector.class": "io.confluent.connect.s3.S3SinkConnector",
"s3.region": "us-east-1",
"topics.dir": "fanned-raw-events/WATCHED_VIDEO_EVENTS",
"flush.size": "1000000000",
"timezone": "GMT",
"tasks.max": "200",
"s3.part.size": "5242880",
"transforms": "Sampler,Decrypter",
"locale": "en",
"format.class": "io.confluent.connect.s3.format.json.JsonFormat",
"value.converter": "org.apache.kafka.connect.json.JsonConverter",
"key.converter": "org.apache.kafka.connect.json.JsonConverter",
"s3.bucket.name": "xxxxxxxxxxx",
"partition.duration.ms": "60000",
"schema.compatibility": "NONE",
"topics.regex": "prod.knol.watched_video",
"key.converter.schemas.enable": "false",
"s3.compression.type": "gzip",
"name": "warehouse-sink-watched_video",
"value.converter.schemas.enable": "false",
"storage.class": "io.confluent.connect.s3.storage.S3Storage",
"path.format": "'cd'=YYYY-MM-dd/'hr'=HH",
"timestamp.extractor": "Wallclock",
"rotate.schedule.interval.ms": "300000"
}
No more tuning
● Leave client configuration headache out
● Upgrade faster
● Get failure handling out of the way
Fan out quickly
Fan out quickly
Flexible Building Blocks
● Single Message Transform(SMT)
○ Masking/Filtering of PII
○ Sampling of data
○ Schema Validation & Enforcement
● Chaining of different transforms
● Lambda style architecture
○ Reusable
○ Easy to implement
Extensible
● Code/deploy custom connectors quickly
● Don’t look under the hood
Connect Deployment Paradigms
Terminology
1 Connector per cluster
All Connectors in 1 cluster
Connect Paradigms: Summary
✅ Isolation
✅ No rebalance impact
❌ Not efficient
❌ Too many Kafka topics for
connect management
✅ Efficient
✅ Lesser Topics
❌ Rebalance Storms
Connect Paradigms: Summary
✅ Isolation
✅ No rebalance impact
❌ Not efficient
❌ Too many Kafka topics for
connect management
✅ Efficient
✅ Lesser Topics
❌ Rebalance Storms
Learnings
Separate clusters
Traffic Patterns
● Optimize for scaling of the cluster
○ Uniform byte rate across tasks
● Metadata vs Data
○ Separate S2S/CDC data from other
flows
All tasks are not equal
Separate Clusters
Use-case
● # of configuration changes
● Resiliency to restarts & rebalances
Restart those connects
● Transient failures occur all the time
● Monitor JMX
● Focus on outcome, not status
○ Consumer-lag vs task-count
● Periodically restart
Don’t scale too slow
● More steps of scaling == More rebalances
● Ladder based scaling
○ Scale up/down slow in large increments
Don’t scale too slow
Wishlist
● YARN like support
○ Priority lanes
○ Elastic Capacity
● Non-Kafka based config & offset sources
● Task/Worker ratio
Thank you
Bhavesh Raheja
Twitter: @bhaveshraheja
tech.hotstar.com

More Related Content

What's hot

Building Retry Architectures in Kafka with Compacted Topics | Matthew Zhou, V...
Building Retry Architectures in Kafka with Compacted Topics | Matthew Zhou, V...Building Retry Architectures in Kafka with Compacted Topics | Matthew Zhou, V...
Building Retry Architectures in Kafka with Compacted Topics | Matthew Zhou, V...HostedbyConfluent
 
Fan-out, fan-in & the multiplexer: Replication recipes for global platform di...
Fan-out, fan-in & the multiplexer: Replication recipes for global platform di...Fan-out, fan-in & the multiplexer: Replication recipes for global platform di...
Fan-out, fan-in & the multiplexer: Replication recipes for global platform di...HostedbyConfluent
 
How a Data Mesh is Driving our Platform | Trey Hicks, Gloo
How a Data Mesh is Driving our Platform | Trey Hicks, GlooHow a Data Mesh is Driving our Platform | Trey Hicks, Gloo
How a Data Mesh is Driving our Platform | Trey Hicks, GlooHostedbyConfluent
 
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...HostedbyConfluent
 
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...HostedbyConfluent
 
Understanding Kafka Produce and Fetch api calls for high throughtput applicat...
Understanding Kafka Produce and Fetch api calls for high throughtput applicat...Understanding Kafka Produce and Fetch api calls for high throughtput applicat...
Understanding Kafka Produce and Fetch api calls for high throughtput applicat...HostedbyConfluent
 
Kafka in Context, Cloud, & Community (Simon Elliston Ball, Cloudera) Kafka Su...
Kafka in Context, Cloud, & Community (Simon Elliston Ball, Cloudera) Kafka Su...Kafka in Context, Cloud, & Community (Simon Elliston Ball, Cloudera) Kafka Su...
Kafka in Context, Cloud, & Community (Simon Elliston Ball, Cloudera) Kafka Su...HostedbyConfluent
 
How to mutate your immutable log | Andrey Falko, Stripe
How to mutate your immutable log | Andrey Falko, StripeHow to mutate your immutable log | Andrey Falko, Stripe
How to mutate your immutable log | Andrey Falko, StripeHostedbyConfluent
 
Apache Kafka from 0.7 to 1.0, History and Lesson Learned
Apache Kafka from 0.7 to 1.0, History and Lesson LearnedApache Kafka from 0.7 to 1.0, History and Lesson Learned
Apache Kafka from 0.7 to 1.0, History and Lesson LearnedGuozhang Wang
 
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...HostedbyConfluent
 
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...HostedbyConfluent
 
How did we move the mountain? - Migrating 1 trillion+ messages per day across...
How did we move the mountain? - Migrating 1 trillion+ messages per day across...How did we move the mountain? - Migrating 1 trillion+ messages per day across...
How did we move the mountain? - Migrating 1 trillion+ messages per day across...HostedbyConfluent
 
Making your Life Easier with MongoDB and Kafka (Robert Walters, MongoDB) Kafk...
Making your Life Easier with MongoDB and Kafka (Robert Walters, MongoDB) Kafk...Making your Life Easier with MongoDB and Kafka (Robert Walters, MongoDB) Kafk...
Making your Life Easier with MongoDB and Kafka (Robert Walters, MongoDB) Kafk...HostedbyConfluent
 
Lessons from the field: Catalog of Kafka Deployments | Joseph Niemiec, Cloudera
Lessons from the field: Catalog of Kafka Deployments | Joseph Niemiec, ClouderaLessons from the field: Catalog of Kafka Deployments | Joseph Niemiec, Cloudera
Lessons from the field: Catalog of Kafka Deployments | Joseph Niemiec, ClouderaHostedbyConfluent
 
How a distributed graph analytics platform uses Apache Kafka for data ingesti...
How a distributed graph analytics platform uses Apache Kafka for data ingesti...How a distributed graph analytics platform uses Apache Kafka for data ingesti...
How a distributed graph analytics platform uses Apache Kafka for data ingesti...HostedbyConfluent
 
Stream Processing with Kafka and KSQL in Jupiter | Namit Mahuvakar, Jupiter
Stream Processing with Kafka and KSQL in Jupiter | Namit Mahuvakar, JupiterStream Processing with Kafka and KSQL in Jupiter | Namit Mahuvakar, Jupiter
Stream Processing with Kafka and KSQL in Jupiter | Namit Mahuvakar, JupiterHostedbyConfluent
 
Uber: Kafka Consumer Proxy
Uber: Kafka Consumer ProxyUber: Kafka Consumer Proxy
Uber: Kafka Consumer Proxyconfluent
 
Availability of Kafka - Beyond the Brokers | Andrew Borley and Emma Humber, IBM
Availability of Kafka - Beyond the Brokers | Andrew Borley and Emma Humber, IBMAvailability of Kafka - Beyond the Brokers | Andrew Borley and Emma Humber, IBM
Availability of Kafka - Beyond the Brokers | Andrew Borley and Emma Humber, IBMHostedbyConfluent
 
How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...
How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...
How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...HostedbyConfluent
 
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...confluent
 

What's hot (20)

Building Retry Architectures in Kafka with Compacted Topics | Matthew Zhou, V...
Building Retry Architectures in Kafka with Compacted Topics | Matthew Zhou, V...Building Retry Architectures in Kafka with Compacted Topics | Matthew Zhou, V...
Building Retry Architectures in Kafka with Compacted Topics | Matthew Zhou, V...
 
Fan-out, fan-in & the multiplexer: Replication recipes for global platform di...
Fan-out, fan-in & the multiplexer: Replication recipes for global platform di...Fan-out, fan-in & the multiplexer: Replication recipes for global platform di...
Fan-out, fan-in & the multiplexer: Replication recipes for global platform di...
 
How a Data Mesh is Driving our Platform | Trey Hicks, Gloo
How a Data Mesh is Driving our Platform | Trey Hicks, GlooHow a Data Mesh is Driving our Platform | Trey Hicks, Gloo
How a Data Mesh is Driving our Platform | Trey Hicks, Gloo
 
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
 
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
 
Understanding Kafka Produce and Fetch api calls for high throughtput applicat...
Understanding Kafka Produce and Fetch api calls for high throughtput applicat...Understanding Kafka Produce and Fetch api calls for high throughtput applicat...
Understanding Kafka Produce and Fetch api calls for high throughtput applicat...
 
Kafka in Context, Cloud, & Community (Simon Elliston Ball, Cloudera) Kafka Su...
Kafka in Context, Cloud, & Community (Simon Elliston Ball, Cloudera) Kafka Su...Kafka in Context, Cloud, & Community (Simon Elliston Ball, Cloudera) Kafka Su...
Kafka in Context, Cloud, & Community (Simon Elliston Ball, Cloudera) Kafka Su...
 
How to mutate your immutable log | Andrey Falko, Stripe
How to mutate your immutable log | Andrey Falko, StripeHow to mutate your immutable log | Andrey Falko, Stripe
How to mutate your immutable log | Andrey Falko, Stripe
 
Apache Kafka from 0.7 to 1.0, History and Lesson Learned
Apache Kafka from 0.7 to 1.0, History and Lesson LearnedApache Kafka from 0.7 to 1.0, History and Lesson Learned
Apache Kafka from 0.7 to 1.0, History and Lesson Learned
 
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
 
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
 
How did we move the mountain? - Migrating 1 trillion+ messages per day across...
How did we move the mountain? - Migrating 1 trillion+ messages per day across...How did we move the mountain? - Migrating 1 trillion+ messages per day across...
How did we move the mountain? - Migrating 1 trillion+ messages per day across...
 
Making your Life Easier with MongoDB and Kafka (Robert Walters, MongoDB) Kafk...
Making your Life Easier with MongoDB and Kafka (Robert Walters, MongoDB) Kafk...Making your Life Easier with MongoDB and Kafka (Robert Walters, MongoDB) Kafk...
Making your Life Easier with MongoDB and Kafka (Robert Walters, MongoDB) Kafk...
 
Lessons from the field: Catalog of Kafka Deployments | Joseph Niemiec, Cloudera
Lessons from the field: Catalog of Kafka Deployments | Joseph Niemiec, ClouderaLessons from the field: Catalog of Kafka Deployments | Joseph Niemiec, Cloudera
Lessons from the field: Catalog of Kafka Deployments | Joseph Niemiec, Cloudera
 
How a distributed graph analytics platform uses Apache Kafka for data ingesti...
How a distributed graph analytics platform uses Apache Kafka for data ingesti...How a distributed graph analytics platform uses Apache Kafka for data ingesti...
How a distributed graph analytics platform uses Apache Kafka for data ingesti...
 
Stream Processing with Kafka and KSQL in Jupiter | Namit Mahuvakar, Jupiter
Stream Processing with Kafka and KSQL in Jupiter | Namit Mahuvakar, JupiterStream Processing with Kafka and KSQL in Jupiter | Namit Mahuvakar, Jupiter
Stream Processing with Kafka and KSQL in Jupiter | Namit Mahuvakar, Jupiter
 
Uber: Kafka Consumer Proxy
Uber: Kafka Consumer ProxyUber: Kafka Consumer Proxy
Uber: Kafka Consumer Proxy
 
Availability of Kafka - Beyond the Brokers | Andrew Borley and Emma Humber, IBM
Availability of Kafka - Beyond the Brokers | Andrew Borley and Emma Humber, IBMAvailability of Kafka - Beyond the Brokers | Andrew Borley and Emma Humber, IBM
Availability of Kafka - Beyond the Brokers | Andrew Borley and Emma Humber, IBM
 
How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...
How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...
How to Discover, Visualize, Catalog, Share and Reuse your Kafka Streams (Jona...
 
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
 

Similar to How Much Can You Connect? | Bhavesh Raheja, Disney + Hotstar

IEEE Standards Impact in IoT and 5G, Day 1, Session 2 - Communication & Opera...
IEEE Standards Impact in IoT and 5G, Day 1, Session 2 - Communication & Opera...IEEE Standards Impact in IoT and 5G, Day 1, Session 2 - Communication & Opera...
IEEE Standards Impact in IoT and 5G, Day 1, Session 2 - Communication & Opera...Peter Waher
 
Gluster dev session #6 understanding gluster's network communication layer
Gluster dev session #6  understanding gluster's network   communication layerGluster dev session #6  understanding gluster's network   communication layer
Gluster dev session #6 understanding gluster's network communication layerPranith Karampuri
 
KubeCon + CloudNative Con NA 2021 | A New Generation of NATS
KubeCon + CloudNative Con NA 2021 | A New Generation of NATSKubeCon + CloudNative Con NA 2021 | A New Generation of NATS
KubeCon + CloudNative Con NA 2021 | A New Generation of NATSNATS
 
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...DataStax
 
Data platform architecture principles - ieee infrastructure 2020
Data platform architecture principles - ieee infrastructure 2020Data platform architecture principles - ieee infrastructure 2020
Data platform architecture principles - ieee infrastructure 2020Julien Le Dem
 
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a MonthUSENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a MonthNicolas Brousse
 
Argus Production Monitoring at Salesforce
Argus Production Monitoring at SalesforceArgus Production Monitoring at Salesforce
Argus Production Monitoring at SalesforceHBaseCon
 
Argus Production Monitoring at Salesforce
Argus Production Monitoring at Salesforce Argus Production Monitoring at Salesforce
Argus Production Monitoring at Salesforce HBaseCon
 
RethinkConn 2022!
RethinkConn 2022!RethinkConn 2022!
RethinkConn 2022!NATS
 
How Netflix Uses Amazon Kinesis Streams to Monitor and Optimize Large-scale N...
How Netflix Uses Amazon Kinesis Streams to Monitor and Optimize Large-scale N...How Netflix Uses Amazon Kinesis Streams to Monitor and Optimize Large-scale N...
How Netflix Uses Amazon Kinesis Streams to Monitor and Optimize Large-scale N...Amazon Web Services
 
MongoDB 4.0 새로운 기능 소개
MongoDB 4.0 새로운 기능 소개MongoDB 4.0 새로운 기능 소개
MongoDB 4.0 새로운 기능 소개Ha-Yang(White) Moon
 
Easy, Secure, and Fast: Using NATS.io for Streams and Services
Easy, Secure, and Fast: Using NATS.io for Streams and ServicesEasy, Secure, and Fast: Using NATS.io for Streams and Services
Easy, Secure, and Fast: Using NATS.io for Streams and ServicesNATS
 
How Optimizely (Safely) Maximizes Database Concurrency.pdf
How Optimizely (Safely) Maximizes Database Concurrency.pdfHow Optimizely (Safely) Maximizes Database Concurrency.pdf
How Optimizely (Safely) Maximizes Database Concurrency.pdfScyllaDB
 
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1Ruslan Meshenberg
 
Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3  Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3 Omid Vahdaty
 
From Three Nines to Five Nines - A Kafka Journey
From Three Nines to Five Nines - A Kafka JourneyFrom Three Nines to Five Nines - A Kafka Journey
From Three Nines to Five Nines - A Kafka JourneyAllen (Xiaozhong) Wang
 
MACHBASE_NEO
MACHBASE_NEOMACHBASE_NEO
MACHBASE_NEOMACHBASE
 
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned Omid Vahdaty
 
Cassandra background-and-architecture
Cassandra background-and-architectureCassandra background-and-architecture
Cassandra background-and-architectureMarkus Klems
 

Similar to How Much Can You Connect? | Bhavesh Raheja, Disney + Hotstar (20)

IEEE Standards Impact in IoT and 5G, Day 1, Session 2 - Communication & Opera...
IEEE Standards Impact in IoT and 5G, Day 1, Session 2 - Communication & Opera...IEEE Standards Impact in IoT and 5G, Day 1, Session 2 - Communication & Opera...
IEEE Standards Impact in IoT and 5G, Day 1, Session 2 - Communication & Opera...
 
Gluster dev session #6 understanding gluster's network communication layer
Gluster dev session #6  understanding gluster's network   communication layerGluster dev session #6  understanding gluster's network   communication layer
Gluster dev session #6 understanding gluster's network communication layer
 
KubeCon + CloudNative Con NA 2021 | A New Generation of NATS
KubeCon + CloudNative Con NA 2021 | A New Generation of NATSKubeCon + CloudNative Con NA 2021 | A New Generation of NATS
KubeCon + CloudNative Con NA 2021 | A New Generation of NATS
 
Towards Data Operations
Towards Data OperationsTowards Data Operations
Towards Data Operations
 
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
 
Data platform architecture principles - ieee infrastructure 2020
Data platform architecture principles - ieee infrastructure 2020Data platform architecture principles - ieee infrastructure 2020
Data platform architecture principles - ieee infrastructure 2020
 
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a MonthUSENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
 
Argus Production Monitoring at Salesforce
Argus Production Monitoring at SalesforceArgus Production Monitoring at Salesforce
Argus Production Monitoring at Salesforce
 
Argus Production Monitoring at Salesforce
Argus Production Monitoring at Salesforce Argus Production Monitoring at Salesforce
Argus Production Monitoring at Salesforce
 
RethinkConn 2022!
RethinkConn 2022!RethinkConn 2022!
RethinkConn 2022!
 
How Netflix Uses Amazon Kinesis Streams to Monitor and Optimize Large-scale N...
How Netflix Uses Amazon Kinesis Streams to Monitor and Optimize Large-scale N...How Netflix Uses Amazon Kinesis Streams to Monitor and Optimize Large-scale N...
How Netflix Uses Amazon Kinesis Streams to Monitor and Optimize Large-scale N...
 
MongoDB 4.0 새로운 기능 소개
MongoDB 4.0 새로운 기능 소개MongoDB 4.0 새로운 기능 소개
MongoDB 4.0 새로운 기능 소개
 
Easy, Secure, and Fast: Using NATS.io for Streams and Services
Easy, Secure, and Fast: Using NATS.io for Streams and ServicesEasy, Secure, and Fast: Using NATS.io for Streams and Services
Easy, Secure, and Fast: Using NATS.io for Streams and Services
 
How Optimizely (Safely) Maximizes Database Concurrency.pdf
How Optimizely (Safely) Maximizes Database Concurrency.pdfHow Optimizely (Safely) Maximizes Database Concurrency.pdf
How Optimizely (Safely) Maximizes Database Concurrency.pdf
 
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1
 
Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3  Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3
 
From Three Nines to Five Nines - A Kafka Journey
From Three Nines to Five Nines - A Kafka JourneyFrom Three Nines to Five Nines - A Kafka Journey
From Three Nines to Five Nines - A Kafka Journey
 
MACHBASE_NEO
MACHBASE_NEOMACHBASE_NEO
MACHBASE_NEO
 
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
 
Cassandra background-and-architecture
Cassandra background-and-architectureCassandra background-and-architecture
Cassandra background-and-architecture
 

More from HostedbyConfluent

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Renaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonRenaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonHostedbyConfluent
 
Evolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolEvolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolHostedbyConfluent
 
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesEnsuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesHostedbyConfluent
 
Exactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaExactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaHostedbyConfluent
 
Fish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonFish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonHostedbyConfluent
 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonHostedbyConfluent
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyHostedbyConfluent
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...HostedbyConfluent
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...HostedbyConfluent
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersHostedbyConfluent
 
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformApache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformHostedbyConfluent
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubHostedbyConfluent
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonHostedbyConfluent
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLHostedbyConfluent
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceHostedbyConfluent
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondHostedbyConfluent
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsHostedbyConfluent
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemHostedbyConfluent
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksHostedbyConfluent
 

More from HostedbyConfluent (20)

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Renaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonRenaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit London
 
Evolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolEvolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at Trendyol
 
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesEnsuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
 
Exactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaExactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and Kafka
 
Fish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonFish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit London
 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit London
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And Why
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka Clusters
 
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformApache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy Pub
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit London
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSL
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and Beyond
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink Apps
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC Ecosystem
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local Disks
 

Recently uploaded

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
Evaluating the top large language models.pdfChristopherTHyatt
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 

Recently uploaded (20)

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
Evaluating the top large language models.pdf
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 

How Much Can You Connect? | Bhavesh Raheja, Disney + Hotstar

  • 1. How much can you connect?
  • 2. Agenda ● Introduction ● Why we love Connect? ● Paradigms of Connect Clusters ● Learnings ● Wishlist
  • 3. Disney + Hotstar ● Leading OTT Platform ● Growing SEA presence ○ ID, SG, MY, TH (Coming Soon!) ○ Localised experience ● VoD, Live TV, Live Sports ● 25.3 Mn → Highest # of Concurrent viewers
  • 4. Kafka @ D+H ● Central Nervous System ○ Analytics ○ Inter-Service Communication ○ Distributed Workflow Management ○ Alerting, Logging, Monitoring ○ User Engagement
  • 5. Kafka Connect @ D+H > 2000 Connectors ● Higher # of Sink ( ~2k) ● Few Source (20-30)
  • 6. Kafka Connect @ D+H > 2000 Connectors ● Higher # of Sink ( ~2k) ● Few Source (20-30) 10-15 Connect Clusters ● Auto-scaling based on traffic ● Traffic + Use case sharding
  • 7. Kafka Connect @ D+H > 2000 Connectors ● Higher # of Sink ( ~2k) ● Few Source (20-30) 10-15 Connect Clusters ● Auto-scaling based on traffic ● Traffic + Use case sharding
  • 8. We ❤ Connect Simple, flexible, extensible
  • 9. Configuration not code ● Add/Remove pipelines ● Scale-up & down { "connector.class": "io.confluent.connect.s3.S3SinkConnector", "s3.region": "us-east-1", "topics.dir": "fanned-raw-events/WATCHED_VIDEO_EVENTS", "flush.size": "1000000000", "timezone": "GMT", "tasks.max": "200", "s3.part.size": "5242880", "transforms": "Sampler,Decrypter", "locale": "en", "format.class": "io.confluent.connect.s3.format.json.JsonFormat", "value.converter": "org.apache.kafka.connect.json.JsonConverter", "key.converter": "org.apache.kafka.connect.json.JsonConverter", "s3.bucket.name": "xxxxxxxxxxx", "partition.duration.ms": "60000", "schema.compatibility": "NONE", "topics.regex": "prod.knol.watched_video", "key.converter.schemas.enable": "false", "s3.compression.type": "gzip", "name": "warehouse-sink-watched_video", "value.converter.schemas.enable": "false", "storage.class": "io.confluent.connect.s3.storage.S3Storage", "path.format": "'cd'=YYYY-MM-dd/'hr'=HH", "timestamp.extractor": "Wallclock", "rotate.schedule.interval.ms": "300000" }
  • 10. No more tuning ● Leave client configuration headache out ● Upgrade faster ● Get failure handling out of the way
  • 13. Flexible Building Blocks ● Single Message Transform(SMT) ○ Masking/Filtering of PII ○ Sampling of data ○ Schema Validation & Enforcement ● Chaining of different transforms ● Lambda style architecture ○ Reusable ○ Easy to implement
  • 14. Extensible ● Code/deploy custom connectors quickly ● Don’t look under the hood
  • 17. 1 Connector per cluster
  • 18. All Connectors in 1 cluster
  • 19. Connect Paradigms: Summary ✅ Isolation ✅ No rebalance impact ❌ Not efficient ❌ Too many Kafka topics for connect management ✅ Efficient ✅ Lesser Topics ❌ Rebalance Storms
  • 20. Connect Paradigms: Summary ✅ Isolation ✅ No rebalance impact ❌ Not efficient ❌ Too many Kafka topics for connect management ✅ Efficient ✅ Lesser Topics ❌ Rebalance Storms
  • 22. Separate clusters Traffic Patterns ● Optimize for scaling of the cluster ○ Uniform byte rate across tasks ● Metadata vs Data ○ Separate S2S/CDC data from other flows
  • 23. All tasks are not equal
  • 24. Separate Clusters Use-case ● # of configuration changes ● Resiliency to restarts & rebalances
  • 25. Restart those connects ● Transient failures occur all the time ● Monitor JMX ● Focus on outcome, not status ○ Consumer-lag vs task-count ● Periodically restart
  • 26. Don’t scale too slow ● More steps of scaling == More rebalances ● Ladder based scaling ○ Scale up/down slow in large increments
  • 28. Wishlist ● YARN like support ○ Priority lanes ○ Elastic Capacity ● Non-Kafka based config & offset sources ● Task/Worker ratio
  • 29. Thank you Bhavesh Raheja Twitter: @bhaveshraheja tech.hotstar.com