Deploying Kafka on DC/OS

Kaufman Ng, Solutions Architect, Confluent

Who is this for?
• How to effectively deploy Kafka on DC/OS?
• Things to watch out for
• Target audience:
– Administrators
– Architects

Agenda
• About Me
• About Confluent
• What is Apache Kafka?
• Why Kafka on DC/OS?
• Gotchas
• Questions

About Me
• Solutions Architect, Confluent
• Previously Senior Solutions Architect, Cloudera
• Contributor to Kafka, Parquet
• kaufman@confluent.io
• Twitter: @kaufmanng

About Confluent
• Founded by the creators of Apache Kafka
• Company founded September 2014
• Technology developed while at LinkedIn
• About 70% of active Kafka committers

Confluent Platform and Apache Kafka
Apache Kafka
• Open source
• Publish and Subscribe
• Processing
• Storage
Confluent Open Source
• Open source
• Data management
• Connectors
• Clients
Confluent Enterprise
• Administration features
• Operations features
• Monitoring features
• 30-day free evaluation

What is Apache Kafka?
• A distributed streaming platform
• Pub-sub paradigm, similar to a message queue
• Fault-tolerant
• Allows stream processing of events as they occur
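
To make the pub-sub bullet concrete, here is a toy, in-memory sketch of how a topic differs from a classic queue: messages stay in the log, and each consumer group tracks its own read position. The `Topic` class and its method names are hypothetical and exist only for illustration; this is not the Kafka client API.

```python
class Topic:
    """Toy append-only log with one read offset per consumer group."""

    def __init__(self):
        self.log = []        # messages are appended, never removed on read
        self.offsets = {}    # consumer group name -> next offset to read

    def publish(self, message):
        self.log.append(message)

    def poll(self, group):
        """Return messages this group has not seen yet and advance its offset."""
        start = self.offsets.get(group, 0)
        batch = self.log[start:]
        self.offsets[group] = len(self.log)
        return batch


topic = Topic()
topic.publish("a")
topic.publish("b")
first = topic.poll("billing")    # billing reads everything published so far
topic.publish("c")
second = topic.poll("billing")   # billing only gets the new message
audit = topic.poll("audit")      # a new group still sees the full log
print(first, second, audit)
```

Because reading does not delete anything, a second group can consume the same events independently of the first.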

What does Kafka do?
[Diagram: Apache Kafka 101. Producers, consumers, and Kafka Connect around a topic. Captions: "Your interfaces to the world", "Connected to your systems in real time"]

Before: Many Ad Hoc Pipelines
[Diagram: point-to-point pipelines among Search, Security, Fraud Detection, Application, User Tracking, Operational Logs, Operational Metrics, Hadoop, Monitoring, Data Warehouse, Espresso, Cassandra, and Oracle]

After: Central Hub with Kafka
[Diagram: Search, Security, Fraud Detection, Application, User Tracking, Operational Logs, Operational Metrics, Espresso, Cassandra, Oracle, Hadoop, Log Search, Monitoring, and Data Warehouse, all connected through a central Kafka hub]

What’s in a Kafka Cluster?
• A typical Kafka cluster consists of these:
– Broker
– Zookeeper
– Kafka Connect
– Kafka Streams
• Confluent Platform add-ons:
– Schema Registry
– REST Proxy
– And others (Replicator, Auto Data Balancer, etc.)

Why Kafka on DC/OS?
• Ease of management
• Container support
• Stateful and stateless services
• Service discovery and routing

How do you manage all these?
• A cluster consists of a mix of stateful and stateless services
• Plus other external systems
• Service discovery and load balancing concerns
• DC/OS comes to the rescue

Where to place these services?
• Brokers should NOT be co-located with each other
• Broker nodes should have dedicated disks
• Same for Zookeeper servers
• Brokers should not be co-located with Zookeepers
• And which containerizer should I use? Mesos or Docker
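
The two co-location rules above can be checked mechanically. The following is a minimal sketch, assuming you already have a mapping from task name to role and host (for example, collected by hand from the DC/OS UI); the function name, task names, and IP addresses are hypothetical.

```python
from collections import defaultdict


def placement_violations(placement):
    """Check a {task: (role, host)} mapping against the two co-location rules."""
    by_host = defaultdict(list)
    for task, (role, host) in placement.items():
        by_host[host].append((role, task))

    problems = []
    for host, tasks in sorted(by_host.items()):
        brokers = sorted(t for r, t in tasks if r == "broker")
        zookeepers = sorted(t for r, t in tasks if r == "zookeeper")
        if len(brokers) > 1:
            problems.append(f"{host}: more than one broker {brokers}")
        if brokers and zookeepers:
            problems.append(f"{host}: broker co-located with Zookeeper")
    return problems


# Hypothetical layouts: one clean, one that breaks both rules.
good = {
    "kafka-0": ("broker", "10.0.10.1"),
    "kafka-1": ("broker", "10.0.10.2"),
    "zk-0": ("zookeeper", "10.0.10.4"),
}
bad = {
    "kafka-0": ("broker", "10.0.10.1"),
    "kafka-1": ("broker", "10.0.10.1"),
    "zk-0": ("zookeeper", "10.0.10.1"),
}
print(placement_violations(good))
print(placement_violations(bad))
```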

DC/OS benefits for Kafka
• Easy to configure cluster size
• Node and placement constraints
• Handling of stateful services like brokers

How many brokers?
• Recommended: 3 brokers to start
• At least 1-2 GB memory each
• A few CPU cores are OK (~4)
• Don’t place them together!
• In Marathon’s config.json:
"placement_constraint": "hostname:MAX_PER:1"
• Don’t change this: "PLACEMENT_STRATEGY": "NODE"

Service config sample
"service": {
  "name": "confluent-kafka",
  "mesos_api_version": "V0",
  "user": "nobody",
  "placement_constraint": "hostname:MAX_PER:1",
  "deploy_strategy": "serial",
  "virtual_network_enabled": false,
  "virtual_network_name": "dcos",
  "log_level": "INFO"
}

Broker config sample
"brokers": {
  "cpus": 1,
  "mem": 4096,
  "disk": 5000,
  "disk_type": "MOUNT",
  "disk_path": "kafka-broker-data",
  "count": 3,
  "port": 0,
  "heap": {
    "size": 2048
  }
}
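
Before submitting options like these, it can help to sanity-check them. The sketch below wraps a subset of the two fragments in a single JSON object and totals the resources the brokers will request. It assumes that `mem`, `disk`, and `heap.size` are all in MB and that the two fragments sit side by side in one options file; `summarize` is a hypothetical helper, not part of any DC/OS tooling.

```python
import json

# Subset of the service and broker samples, wrapped in one object (assumption).
options_text = """
{
  "service": {
    "name": "confluent-kafka",
    "placement_constraint": "hostname:MAX_PER:1",
    "deploy_strategy": "serial"
  },
  "brokers": {
    "cpus": 1,
    "mem": 4096,
    "disk": 5000,
    "disk_type": "MOUNT",
    "disk_path": "kafka-broker-data",
    "count": 3,
    "port": 0,
    "heap": {"size": 2048}
  }
}
"""


def summarize(options):
    """Total the broker resource requests and check heap against container memory."""
    brokers = options["brokers"]
    heap = brokers["heap"]["size"]
    if heap >= brokers["mem"]:
        raise ValueError("JVM heap must be smaller than the container memory")
    count = brokers["count"]
    return {
        "total_cpus": brokers["cpus"] * count,
        "total_mem_mb": brokers["mem"] * count,
        "total_disk_mb": brokers["disk"] * count,
        # Memory left outside the JVM heap, e.g. for the OS page cache.
        "non_heap_mb_per_broker": brokers["mem"] - heap,
    }


options = json.loads(options_text)
summary = summarize(options)
print(summary)
```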

Kafka Storage Volumes
• Messages are flushed to disks
• HDDs are better than SSDs
• RAID is better than JBOD
• MOUNT volumes better than ROOT

Pinning the brokers
• You could tie brokers to nodes with bigger
storage by using placement constraints
• "placement_constraint": "10.0.10.1|10.0.10.2|10.0.10.3"

What about Zookeeper?
• DC/OS comes with Zookeeper, should I use it?
• Depends on how many services need ZK

Gotchas
• Restart brokers with DC/OS UI vs CLI
• Kafka Streams
• Failures

Kafka Broker Restarts
• Upon restart, a broker has to catch up with the others (because of replication, etc.)
• No rolling restarts in DC/OS UI
• Better to do rolling restart via CLI:
• dcos broker restart <broker_id>
• Check broker logs!
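
A rolling restart from the CLI can be scripted one broker at a time. This is a rough sketch only: the command form is copied from the bullet above and may differ between package versions, and the fixed pause is a stand-in for actually confirming in the broker logs that the restarted broker has caught up. `rolling_restart` and its parameters are hypothetical.

```python
import subprocess
import time


def rolling_restart(broker_ids, run=subprocess.check_call,
                    pause_seconds=60, sleep=time.sleep):
    """Restart brokers one at a time, pausing between restarts.

    `run` and `sleep` are injectable so the loop can be dry-run without a cluster.
    """
    issued = []
    for index, broker_id in enumerate(broker_ids):
        # Command form taken from the slide; adjust for your package version.
        command = ["dcos", "broker", "restart", str(broker_id)]
        run(command)
        issued.append(command)
        if index < len(broker_ids) - 1:
            sleep(pause_seconds)  # placeholder for a real catch-up check
    return issued


# Dry run: print the commands instead of executing them, and skip the pauses.
rolling_restart([0, 1, 2], run=print, sleep=lambda seconds: None)
```

Replacing the pause with a real health check before moving to the next broker would make this safer on a busy cluster.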

Kafka Streams with fault-tolerance
State stores

Kafka Streams with states
• Each Kafka Streams application instance has its own embedded state store
• State stores have to be synced/assigned upon restart, which can take time
• When containers restart, they may be spawned on a different slave node, so they start with an empty state store -> longer startup time
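
Why an empty state store means a longer startup: Kafka Streams backs its state stores with changelog topics, and a fresh instance rebuilds local state by replaying that changelog. The toy sketch below models the replay with a plain list of key/value records, where a `None` value is treated as a delete; `restore_state` and the sample data are hypothetical and do not use the Kafka Streams API.

```python
def restore_state(changelog):
    """Rebuild a key-value store by replaying changelog records in order."""
    state = {}
    for key, value in changelog:
        if value is None:
            state.pop(key, None)   # a None value marks the key as deleted
        else:
            state[key] = value     # later records overwrite earlier ones
    return state


changelog = [
    ("alice", 1),
    ("bob", 1),
    ("alice", 2),
    ("bob", None),
    ("carol", 5),
]
restored = restore_state(changelog)
print(restored)
```

The replay cost grows with the size of the changelog, which is why landing on a node without the previous local state slows startup.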

Running other Kafka Components
• They are stateless and lighter on resources
• Kafka Connect workers

Failures
• DC/OS will restart services when they fail
• But admins should look into why
• Use alerting tools, e.g. Nagios

Useful Links
• Service docs:
– Apache Kafka: https://docs.mesosphere.com/service-docs/kafka/
– Confluent Platform: https://docs.mesosphere.com/service-docs/confluent-kafka/v2.0.0.1-3.3.0e/
• Package source: https://github.com/mesosphere/universe/tree/version-3.x/repo/packages/C/confluent-kafka/

Other useful CLI tools
• Kafka-client Docker image: https://hub.docker.com/r/mesosphere/kafka-client/
• Confluent Platform
