This document discusses securing Apache Kafka deployments with Vormetric and Confluent Platform. It begins with an introduction to Apache Kafka and Confluent Platform. It then provides an overview of Vormetric's policy-driven security solution and how it can be used to encrypt Kafka data at rest. The document outlines the typical Confluent Platform deployment architecture and various security considerations, such as authentication, authorization, and data encryption. Finally, it provides steps for implementing secure deployments using SSL, Kerberos, and Vormetric encryption policies.
2. 2Confidential
Agenda
• Introduction to Apache Kafka and Confluent
• Overview of Vormetric and its policy-driven security solution
• Confluent Platform deployment architecture
• Security considerations and solutions
• Q&A
3. 3Confidential
About Confluent and Apache Kafka
• Founded by the creators of Apache Kafka
• Founded September 2014
• Technology developed while at LinkedIn
• 73% of active Kafka committers
Leadership
• Jay Kreps, CEO
• Neha Narkhede, CTO, VP Engineering
• Cheryl Dalrymple, CFO
• Luanne Dauber, CMO
• Todd Barnett, VP WW Sales
• Jabari Norton, VP Business Dev
5. 5Confidential
After: Stream Data Platform with Kafka
• Kafka: distributed, fault tolerant, stores messages, processes streams
• Connected systems and applications: Search, Security, Fraud Detection, Application, User Tracking, Operational Logs, Operational Metrics, MySQL, Cassandra, Oracle, Hadoop, Elastic Search, Splunk, Data Warehouse
6. 6Confidential
What is a Stream Data Platform?
• Kafka Stream Data Platform connects: Search, NoSQL, RDBMS, Monitoring, Stream Processing, Real-time Analytics, Data Warehouse, Apps, Hadoop
• Latency tiers
o Synchronous Req/Response: 0 – 100s ms
o Near Real Time: > 100s ms
o Offline Batch: > 1 hour
• Build streaming applications
• Deploy streaming applications at scale
• Monitor and manage streaming applications
Common Kafka Use Cases
• Log data
• Database changes
• Sensors and device data
• Monitoring streams
• Call data records
• Real-time Monitoring
• Asynchronous applications
• Fraud and security
• Bridge to Cloud
7. 7Confidential
People Using Kafka Today
• Industries: Financial Services, Entertainment & Media, Consumer Tech, Travel & Leisure, Enterprise Tech, Telecom, Retail
• 8 of the top 10 insurance companies and 7 of the top 10 banks in the Fortune 500
• 9 of the top 10 telcos in the Fortune 500
• 6 of the top 10 travel companies in the Fortune 500
8. 8Confidential
Confluent Platform: It’s Kafka ++
Feature: Benefit
• Apache Kafka: High throughput, low latency, high availability, secure distributed message system
• Kafka Connect: Advanced framework for connecting external sources and destinations into Kafka
• Java Client: Provides easy integration into Java applications
• Kafka Streams: Simple library that enables streaming application development within the Kafka framework
• Additional Clients: Supports non-Java clients; C, C++, Python, Go, etc.
• REST Proxy: Provides universal access to Kafka from any network connected device via HTTP
• Schema Registry: Central registry for the format of Kafka data – guarantees all data is always consumable
• Pre-Built Connectors: HDFS, JDBC, Elastic and other connectors fully certified and fully supported by Confluent
• Confluent Control Center (includes Connector Management and Stream Monitoring): Connection and Monitoring command center provides advanced functionality and control
Editions
• Apache Kafka: Community support, Free
• Confluent Platform 3.0: Community support, Free
• Confluent Enterprise 3.0: 24x7x365 support, Subscription
10. 13
Vormetric Company Overview
Global Customers
• Over 1,500 customers
• 17 of the Fortune 30
Most Security Conscious Brands
• Largest financial institutions
• Largest retail companies
• Major manufacturers
• Third party business service providers
• Government agencies
Cloud Service Providers Trust Vormetric
• Smart Cloud Enterprise Plus
Business Drivers
• Executive mandates
o Data breach, insider threat
• Compliance
• SLAs
“With Vormetric, people have no idea it’s even running. Vormetric Encryption also saved us at least nine months of application rewrite effort, and its installation was one of the easiest we’ve ever experienced.”
- Karl Mudra, CIO, Delta Dental of Missouri
11. 15
Vormetric Data Security Platform
• Vormetric Data Security Manager
• Transparent Encryption
• Application Encryption
• Tokenization
• Data Masking
• Key Management
• Security Intelligence
• Encryption Gateway
• KMaaS
12. 16
How do we Encrypt?
Sensitive Data Protection Technologies
• Data in Motion (between devices): SSL, SSH, HTTPS, IPSEC
• Data at Rest: encryption, tokenization, data masking
• Layers on each device: Application/Database, File System, Disk
13. 17
Vormetric Transparent Encryption
• Policy is used to restrict access to sensitive data by user and process information provided by the Operating System.
• Stack: Users, Application, Database, Operating System, FS Agent, File Systems, Volume Managers
• SSL/TLS (*communication is only required at system boot)
14. 18
Policy Example: Kafka
Policy Summary:
1. Only the specified Kafka user, using only the verified Java process, has full read/write and automatic encrypt/decrypt access to the protected topic data.
2. Privileged admins and root accounts are allowed to manage the protected data without seeing the sensitive contents.
3. All other data requests are denied and audited.
Policy Rules:
# | Resource | User | Process | Action | Effects
1 | any | Kafka User | Java | Read / Write | Permit, Encrypt / Decrypt (audit optional)
2 | any | Root | Whitelisted management processes | Metadata Only | Permit, Audit
3 | any | * | * | * | Deny & Audit
Policy Benefits
• Data-at-rest encryption without changing configs or application code.
• Remove custodial risk of privileged root users
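The three-rule policy on this slide can be read as an ordered rule list with a catch-all at the end. The following is a minimal Python sketch of that reading, assuming rules are evaluated top to bottom and the first match wins, which the numbering on the slide suggests but does not state. The `Rule` class, the identifiers such as `"kafka"` and `"mgmt"`, and the effect names are hypothetical and are not Vormetric's policy syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    user: str       # "*" matches any user
    process: str    # "*" matches any process
    action: str     # "*" matches any action
    effects: tuple  # effects applied when the rule matches


# Hypothetical encoding of the three rules from the slide.
RULES = [
    Rule("kafka", "java", "read_write", ("permit", "apply_key")),
    Rule("root", "mgmt", "metadata", ("permit", "audit")),
    Rule("*", "*", "*", ("deny", "audit")),
]


def _matches(pattern, value):
    """A pattern matches when it is the wildcard or equals the value."""
    return pattern == "*" or pattern == value


def evaluate(user, process, action, rules=RULES):
    """Return the effects of the first rule that matches the request."""
    for rule in rules:
        if (_matches(rule.user, user)
                and _matches(rule.process, process)
                and _matches(rule.action, action)):
            return rule.effects
    # Assumption: with no catch-all rule present, fall back to deny and audit.
    return ("deny", "audit")
```

Under this sketch, `evaluate("root", "cat", "read_write")` falls through to rule 3, which mirrors the slide's point that root can manage protected data but cannot read its contents.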
15. 19
Vormetric Security Intelligence
• Log all access and attempted access to what matters – the data
• Reveals unauthorized access attempts to protected data
• Find unusual access patterns
• Identify compromised users, administrators and applications
• Identify attacks on data such as APTs or malicious insiders
• Prebuilt integrations: Splunk, ArcSight, QRadar, LogRhythm
19. 23Confidential
Kafka Deployment Architecture (simplified)
• Diagram: three Zookeeper nodes, five Brokers, four Producer / Consumer clients
• Zookeeper quorum manages metadata
• Broker nodes manage (and store) topic data
• Brokers and Clients access ZK nodes
• Brokers communicate directly for replication (many-to-many)
• Broker and Zookeeper nodes utilize local storage.
21. 25Confidential
Security Options
• Authentication
o SSL certificate support for 1-way (broker-only) or 2-way (broker and client) authentication
o SASL challenge/response support via Kerberos
o Mix and match: SSL for wire-level encryption, SASL for authentication
• Authorization
o Access Control Lists
o Operations: Read, Write, Create, Describe, ClusterAction, ALL
o Resources: Topic, Cluster, ConsumerGroup
o NOTE: ACLs are stored in Zookeeper (along with all topic metadata)
• Data Encryption
o Vormetric policy management
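The authorization model listed above (a principal is granted an operation on a resource) can be illustrated with a small Python sketch. It assumes a deny-by-default check in which `ALL` stands for every operation; the tuple layout, the `is_allowed` function, and the example principals are hypothetical and do not reproduce Kafka's authorizer implementation.

```python
# Operation and resource names as listed on the slide.
OPERATIONS = {"Read", "Write", "Create", "Describe", "ClusterAction", "ALL"}
RESOURCE_TYPES = {"Topic", "Cluster", "ConsumerGroup"}


def is_allowed(acls, principal, operation, resource_type, resource_name):
    """Return True if any ACL entry grants the operation; deny otherwise."""
    if operation not in OPERATIONS or resource_type not in RESOURCE_TYPES:
        raise ValueError("unknown operation or resource type")
    for acl_principal, acl_operation, acl_type, acl_name in acls:
        if (acl_principal == principal
                and acl_type == resource_type
                and acl_name == resource_name
                and acl_operation in (operation, "ALL")):
            return True
    return False


# Hypothetical ACL entries: (principal, operation, resource type, resource name).
ACLS = [
    ("User:alice", "Read", "Topic", "payments"),
    ("User:admin", "ALL", "Cluster", "kafka-cluster"),
]
```

With these entries, `is_allowed(ACLS, "User:alice", "Write", "Topic", "payments")` is `False`, because only `Read` was granted on that topic.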
22. 26Confidential
Secure Deployments: Step by Step
• SSL Configuration
o Identify / deploy Certificate Authority
o Generate certificates (brokers, clients, or both)
o Share / Install certificates on brokers and/or clients
o Set Kafka broker properties to restrict communication to SSL channels
• Kerberos Configuration (SASL)
o Identify / deploy Kerberos principal
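The step "set Kafka broker properties to restrict communication to SSL channels" might look like the following Python sketch, which builds a broker property set for SSL-only or SASL_SSL (Kerberos) listeners. The property keys are standard Kafka broker settings; the host name, port, file paths, password placeholders, and the `kafka` service name are assumptions chosen for illustration.

```python
def broker_security_properties(host, port=9093, two_way=True, kerberos=False):
    """Build broker security settings as a dict of property name to value."""
    # SASL_SSL keeps SSL wire encryption and adds SASL authentication.
    protocol = "SASL_SSL" if kerberos else "SSL"
    props = {
        # Exposing only a secured listener keeps plaintext clients out.
        "listeners": f"{protocol}://{host}:{port}",
        "security.inter.broker.protocol": protocol,
        # Placeholder paths and passwords; replace with real values.
        "ssl.keystore.location": "/path/to/broker.keystore.jks",
        "ssl.keystore.password": "<keystore-password>",
        "ssl.key.password": "<key-password>",
        "ssl.truststore.location": "/path/to/broker.truststore.jks",
        "ssl.truststore.password": "<truststore-password>",
        # "required" enables 2-way (broker and client) authentication.
        "ssl.client.auth": "required" if two_way else "none",
    }
    if kerberos:
        props["sasl.enabled.mechanisms"] = "GSSAPI"
        props["sasl.mechanism.inter.broker.protocol"] = "GSSAPI"
        # Assumed service name; it must match the broker's Kerberos principal.
        props["sasl.kerberos.service.name"] = "kafka"
    return props


def render(props):
    """Render the settings in Java properties file format."""
    return "\n".join(f"{key}={value}" for key, value in props.items())
```

Calling `render(broker_security_properties("broker1.example.com"))` yields text that could be merged into a broker's `server.properties` after the placeholders are filled in.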
23. 27Confidential
Secure Deployments: Step by Step (continued)
• Data Encryption
o Identify / Deploy Vormetric DSM
o Configure cluster brokers and ZK nodes into DSM domain
o Create and distribute keys (could be coordinated with keys used by brokers and clients)
o Define encryption policy and apply policy to the storage directories
o (test/dev best-practice: exclude metadata operations from policy enforcement)
• References:
• http://docs.confluent.io/3.0.0/kafka/security.html
• <vormetric>
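Applying the encryption policy to "the storage directories" first requires knowing which directories the brokers and Zookeeper nodes write to. The sketch below, a rough Python illustration, collects them from the broker's `log.dirs` (or `log.dir`) setting and Zookeeper's `dataDir` and `dataLogDir` settings. The function names and the example paths are hypothetical, and the step of registering the directories with the DSM is not shown.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def storage_directories(broker_text, zookeeper_text):
    """List the local directories that hold Kafka and Zookeeper data."""
    broker = parse_properties(broker_text)
    zookeeper = parse_properties(zookeeper_text)
    # log.dirs is a comma-separated list; log.dir is the single-path fallback.
    log_dirs = broker.get("log.dirs") or broker.get("log.dir", "")
    dirs = [d.strip() for d in log_dirs.split(",") if d.strip()]
    # Zookeeper keeps snapshots in dataDir and, optionally, logs in dataLogDir.
    for key in ("dataDir", "dataLogDir"):
        if zookeeper.get(key):
            dirs.append(zookeeper[key])
    return dirs
```

For a broker file containing `log.dirs=/data/kafka1,/data/kafka2` and a Zookeeper file containing `dataDir=/var/lib/zookeeper`, the function returns those three paths in that order.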
24. 28Confidential
Solution Benefits
• End-to-end security management … from Kafka topic to storage layer
• Robust access controls across all layers
• Fine grained access control
• Logical constraints on privileged users
• Alerting regarding in-band and out-of-band access attempts
35% of the Fortune 500
7 of the top 10 Fortune 500 global banks
8 of the top 10 insurance companies
9 of the 10 top telecom companies
6 of the top 10 travel companies
Talking Points:
1. Companies are faced with very complex environments with difficult-to-manage parts. They want to organize large amounts of data into a well-managed, unified stream data platform.
2. Customers use Confluent Platform for real-time, batch operational and analytical purposes. Take away the costly and labor-intensive process of developing proprietary data replication practices and allow the Confluent Platform to make data available in real-time streams.
3. Our platform has Kafka at the core (same build as open source Kafka but with additional bug fixes applied) with components and tools that allow you to deploy successfully to production, including:
• Kafka
• Schema management layer (ensures data compatibility across applications)
• Java and REST clients that integrate with our schema management layer
• Kafka Connect
• Kafka Streams
• Authentication and Authorization
• Confluent Control Center
1,300 customers, including many of the largest enterprises, trust us.
IBM and Symantec also deliver our products through OEM arrangements.
Separate Application and Database
TDE/Columnar
Vormetric Encryption Expert Agents are software agents that insert above the file system logical volume layers. The agents evaluate any attempt to access the protected data and apply predetermined policies to either grant or deny such attempts. This is a proven high-performance solution that transparently integrates into Linux, UNIX, and Windows operating systems to protect data in physical, virtual, and cloud environments across all leading applications, databases, operating systems, and storage devices.