Worldwide Scalable and Resilient Messaging Services by CQRS and Event Sourcing using Akka, Kafka Streams and HBase

Shingo Omura
ChatWork Co., Ltd.
Masaru Dobashi
NTT DATA Corporation
© ChatWork and NTT DATA Corporation.

Agenda
• Introduction of Us and Our service “ChatWork”
• Technical Debts Blocked Our Growth
• Our Approach: CQRS + Event Sourcing with Akka, Kafka, HBase
• Technical Consideration to Build the Architecture
• Several Technical Tips

Who am I?
Shingo Omura
• Senior Software Engineer at ChatWork Co., Ltd.
• Specializes in distributed and concurrent computing
About ChatWork Co., Ltd.
• Founded in 2004, in Japan
• 79 employees in total
• Raised $15M in funding so far
• 3 office locations: Japan, Taipei, U.S. (California)

Who am I ?
Masaru Dobashi
• Senior Software Engineer and Architect of IT Platform
• Specializes in distributed computing, open source software and infrastructure
About NTT DATA Corporation
Common Stock
• ¥142,520 million
(as of March 31, 2016)
Business Area
• System integration
• Networking system services
• Other business activities related to the above

ChatWork and NTT DATA
• ChatWork is the project owner.
• In this project, NTT DATA provides technical support for messaging systems and data stores.

ChatWork (http://chatwork.com)
We Change World Works
• ChatWork is an enterprise-grade global team collaboration platform
• Group chat, file sharing, task management and video conferencing, all in one place
• Supports all devices (PC, Android, iOS)
• Supports 6 languages

Demo

ChatWork (http://chatwork.com)
Easy for Cross-Organizational communication
• Chat Room, User namespace is shared by the whole
• You don’t need to sign in to multiple organizations
• You can add anyone to chat rooms, even people in other organizations
• Stats
• 60% of users use for Internal/External communication
• 10% of users use only for External communication
• Typical Usecases = Business collaboration with their partners
• Publishers and Writers
• Franchise/Branch Operations
• Consulting firm and Clients (Accounting, Law, etc.)

ChatWork Grows Rapidly
138,000 companies in 205 countries and regions

ChatWork Grows Rapidly
2 Billion Messages sent globally!
The number of messages sent on ChatWork has increased along with user growth.

Characteristics of Our Workload
• 95% of message requests are "read"
• A large portion of reads are for "recent" messages
• But users sometimes jump to very old messages via message links
• Every task and file has its associated message links

Technical Debts That Blocked Our Growth
Cannot Scale-Up Anymore
• Already using the biggest instance type (db.r3.8xlarge)
• Should be able to Scale-Out
ACID doesn’t scale
• Performance is hard to tune under ACID guarantees
• We decided to accept a weaker consistency model
Monolith is hard
• to deploy, to maintain, to optimize

What We Want To Get
on New Messaging Backend
Different Scalability and Resiliency Level
• Stateless Servers (API Servers)
• Can be fully elastic automatically
with high throughput and low latency
• Fault-tolerant and self-healing
• Stateful Servers (Storage)
• No need to be automatically elastic, just scalable when we need it
• Expected to be fault-tolerant and somewhat resilient
• Durability and predictability are important
Acceptable consistency level
• Eventual consistency is acceptable with a reasonable/tunable delay
• Every member in a chatroom should see
message events in the same order

Our Approach:
CQRS(Command and Query Responsibility Segregation)
Build read side and write side independently
Pros: easy to optimize and be flexible
• Data Structure
• De-normalized data model can be used for read models
• Database Middleware
• Focus on either read-heavy or write-heavy
• System Capacity
• Can control system capacity independently
Cons:
• Confined Complexity in data transformation
• Operation overhead

Our Approach:
CQRS + Event Sourcing
Event Source
• History of every change in application state
• It is stored in sequence
Write model database can be append only
• An event is a fact. It is already validated and authorized
• By nature, a fact is never updated

Our Approach:
CQRS + Event Sourcing
Easy to build/rebuild the read model eventually
• We can apply each event to the read model iteratively
• This can be seen as pre-computing query results incrementally
• This process can be replayed to rebuild the read model when an incident requires it.
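The idea of applying events one by one, and replaying them to rebuild, can be sketched in plain Java. This is only an illustration under assumptions: the `MessagePosted` event type, its fields and the in-memory map are hypothetical stand-ins, not the actual ChatWork event schema or the HBase read model.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ReadModelSketch {
    // Hypothetical event: a message was posted to a chat room.
    static final class MessagePosted {
        final long roomId;
        final long messageId;
        final String body;

        MessagePosted(long roomId, long messageId, String body) {
            this.roomId = roomId;
            this.messageId = messageId;
            this.body = body;
        }
    }

    // De-normalized read model: room id -> message bodies in event order.
    private final Map<Long, List<String>> messagesByRoom = new LinkedHashMap<>();

    // Apply a single event incrementally (pre-computing the query result).
    void apply(MessagePosted event) {
        messagesByRoom.computeIfAbsent(event.roomId, k -> new ArrayList<>()).add(event.body);
    }

    // Rebuild the whole read model by replaying the event log from the start.
    static ReadModelSketch replay(List<MessagePosted> eventLog) {
        ReadModelSketch model = new ReadModelSketch();
        for (MessagePosted event : eventLog) {
            model.apply(event);
        }
        return model;
    }

    List<String> messagesIn(long roomId) {
        return messagesByRoom.getOrDefault(roomId, Collections.emptyList());
    }
}
```

Because the log is append-only, calling `replay` on the same events always produces the same read model, which is what makes rebuilding after an incident possible.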

Overall Architecture

What is Akka?
• Toolkit to build powerful, concurrent and distributed applications easily, based on actor-model programming
• Asynchronous and Distributed by Design:
• Akka’s actors make non-blocking, message-driven processing easy
• High Performance:
• 50 million msg/sec on a single machine.
• Small memory footprint; ~2.5 million actors per GB of heap
• Resilient by Design:
• Error Kernel Patterns and Let It Crash Pattern with Actor hierarchy
ref: http://akka.io/

Akka’s Wonderful Features for Us
Resilient by design
• Customizable and flexible resiliency with Kafka Consumer/Stream
• With a supervisor, we can restart KStream safely and cleanly without stopping the JVM process, and implement flexible and graceful restart policies.

Akka’s Wonderful Features for Us
Non-Blocking
• Large blocking iterations can be converted to an Akka Stream (e.g. HBase’s scan operation)
• We implemented HBaseScanStage, which transforms a scanner (iterator) into a stream that emits scanned HBase rows asynchronously
• This achieves higher throughput and shares threads fairly among multiple scan requests
• Attach an isolated thread pool for blocking calls
• We attach an isolated thread pool to the scan stage to avoid starving Akka’s threads
Source.fromGraph(new HBaseScanStage(connection, "message", scan))
  .withAttributes(
    ActorAttributes.dispatcher("hbase-blocking-dispatcher")
  )
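The isolation idea itself does not depend on Akka. As a rough JDK-only analogue (not the HBaseScanStage from the slides, whose implementation is not shown here), a blocking iterator can be drained on a dedicated executor so that the caller's threads are never blocked. The class and method names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

class BlockingScanSketch {
    // Drains up to `limit` rows from a blocking iterator on the given pool.
    // `blockingPool` plays the role of the isolated dispatcher: only this
    // pool's threads ever wait on the scanner.
    static <T> CompletableFuture<List<T>> scanAsync(Iterator<T> blockingScanner,
                                                    int limit,
                                                    ExecutorService blockingPool) {
        return CompletableFuture.supplyAsync(() -> {
            List<T> rows = new ArrayList<>();
            while (rows.size() < limit && blockingScanner.hasNext()) {
                rows.add(blockingScanner.next()); // may block on I/O
            }
            return rows;
        }, blockingPool);
    }
}
```

A caller would create the pool once (for example with `Executors.newFixedThreadPool(4)`), pass it to every scan, and compose on the returned future instead of waiting for it.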

What is Kafka?
• Apache Kafka is a messaging system used to build data processing pipelines
• In our use case, Kafka is used as a central log system for Event Sourcing. Kafka stores events generated by the write-api servers.
• Idea similar to our platform
• https://www.confluent.io/blog/event-sourcing-cqrs-stream-processing-apache-kafka-whats-connection/

Kafka’s Wonderful Features for Us
• Kafka’s pub/sub model gives us flexibility in developing services. We can add and improve functions step by step.
• Kafka offers both scalability and a reasonable guarantee of message order, which fits our service design.

What is HBase?
• Apache HBase is a database for massive read/write operations
and effectively leverages Hadoop HDFS.
• In our use case, HBase stores the data of the read model, which is the master data of the communication service.

HBase’s Wonderful Features for Us
• HBase is built on a stable architecture that leverages Hadoop HDFS (we are used to Hadoop)
• HBase processes our read/write request workload effectively.
• The write requests of this service are random access. Fortunately, HBase converts them to sequential access before writing data to disk.
• The read requests of this service tend to be heavy on recent data. Fortunately, HBase's read cache mechanism handles such requests efficiently.

What is Kafka Streams?
• Kafka Streams is a part of Apache Kafka that provides stream processing in a simple way.
• Wonderful features for us
• In our architecture, Kafka is the hub of the data pipeline, so Kafka Streams is already included in the environment.
• Even though Kafka Streams was a young component, its basic design seemed simple and reasonable to us. We first tried it for a stateless application.

Summary of Actual Performance
• Write API
• Throughput(in stress test): 40x of current peak with only 2 write-api pods
(4 core&5G mem/pod on m4.2xlarge instance)
• Latency(in production): 200ms → 80ms
• produce time to Kafka Brokers = 20ms (in production)

• Read API
• Throughput(in stress test): current peak with 4 read-api pods
(4 core&5G mem/pod on m4.2xlarge instance)
• Latency(in production): 70ms → 70ms
• HBase’s block cache hit rate = 99%!!! (in production)

• Read Model Updater
• Time lag until read model being updated: 80ms (in production)
• Resilient enough
• The Akka supervisor safely restarts Kafka Streams without stopping pods
• The Kafka consumer group itself is also resilient enough
• Partition reassignment happens automatically even when some consumer pods are down (e.g. during a rolling update), so event processing continues

Technical Consideration Topics
• Guaranteeing the Order of Events in Kafka
• Integrating Message Events to Other Microservices
• Architecture Design To Realize Reasonable Fault Tolerance and Durability
• Kafka as Cushioning Layer
• Heterogeneous design of data store
• Error handling in each layer

Guaranteeing the Order of Events in Kafka
• The partition is the unit of event ordering guarantees in Kafka
• We use the “chatroom id” as the partition key to confine each chat room's events to a specific partition

Guaranteeing the Order of Events in Kafka
• Things you should care about
• Keep the partitioner simple
• The partitioner’s computational cost directly affects producer throughput
• The default is recommended
• Changing the key→partition mapping is dangerous
• If you use the default partitioner, the number of partitions must not be changed
• This creates operational difficulties…
• We operate 1000 partitions for the message event topic to get high concurrency in the read model updater
• Kafka doesn’t support automated partition rebalancing…
• We have to edit a huge JSON object to move partitions to new brokers...
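Why the partition count must stay fixed can be seen from how a key is mapped to a partition. Kafka's default partitioner hashes the serialized key with murmur2 and takes the result modulo the number of partitions. The sketch below assumes `String.hashCode` as a dependency-free stand-in for murmur2, so the concrete partition numbers differ from real Kafka, but the modulo behavior is the same.

```java
class PartitionMappingSketch {
    // Stand-in for the default partitioner: positive hash of the key,
    // modulo the partition count. String.hashCode replaces murmur2 here
    // purely to keep the example self-contained (an assumption).
    static int partitionFor(String chatRoomId, int numPartitions) {
        return (chatRoomId.hashCode() & 0x7fffffff) % numPartitions;
    }
}
```

With this stand-in, `partitionFor("1", 4)` returns 1 while `partitionFor("1", 5)` returns 4. The same chat room would start writing to a different partition after the change, so events produced before and after are no longer ordered relative to each other.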

Integrating Message Events to Other
Microservices
• Kafka is very useful for integrating with other services
• We currently run one event forwarder, which integrates with multiple existing services
• We are now adding an event forwarder for the outgoing webhook service
• Important: integrated services should be “idempotent”
• The event forwarder guarantees only “at-least-once” delivery
• An integrated service might receive the same event multiple times
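One common way for a receiving service to be idempotent under at-least-once delivery is to remember which event IDs it has already handled. The sketch below assumes every event carries a unique ID and keeps the seen set in memory; a real service would need a durable, bounded store. The names are hypothetical and not taken from the ChatWork services.

```java
import java.util.HashSet;
import java.util.Set;

class IdempotentReceiverSketch {
    private final Set<String> seenEventIds = new HashSet<>();
    private int sideEffects = 0;

    // Returns true if the event was processed, false if it was a duplicate.
    boolean handle(String eventId) {
        if (!seenEventIds.add(eventId)) {
            return false; // already handled: redelivery is ignored
        }
        sideEffects++; // stands in for the real work (e.g. sending a webhook)
        return true;
    }

    int sideEffects() {
        return sideEffects;
    }
}
```

Delivering the same event twice then changes nothing: the second `handle` call returns false and the side effect runs only once.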

Reasonable Fault Tolerance and Durability
• Important: Define your own level of fault tolerance and durability
• Unnecessarily high levels of fault tolerance and durability tend to bring terrible complexity into the internal architecture.

Key Demands and Constraints for Us
Durability against faults of each node
Because the distributed system operates a large number of nodes, the probability of finding errors on some node is not small.
Best-effort durability against faults of a whole data center
We aimed to keep historical data readable as much as possible, even if a data center goes down.
Scalability for EACH layer
The heterogeneous write/read workload forces us to plan scale-out for each layer individually.

Architecture design to realize the reasonable fault tolerance and durability
Heterogeneous Design of Data Store
• The write-side
• Data type: Only recent data
• Requirements other than FT and durability: Small footprint and efficiency
• The read-side
• Data type: Long-term master data
• Requirements other than FT and durability: Stability and predictability

Architecture design to realize the reasonable fault tolerance
and durability
Cushioning Layer
• In our design, Kafka has a role of “cushioning layer” as well as the hub of the
pipeline.
• We can reprocess old messages both automatically and manually when we find
errors. This is achieved by storing several generations of offsets in the output data
store and controlling offsets in the applications.

Architecture design to realize the reasonable fault tolerance and durability
Error Handling in Each Layer (1/2)
• In reality, it is difficult to handle errors perfectly within a single layer.
• For example, Kafka Streams didn’t provide fine-grained error handling at the time we started this project. Some errors while processing records may cause application failures.

Architecture design to realize the reasonable fault tolerance and durability
Error Handling in Each Layer (2/2)
• Fortunately, since Kafka Streams can be run as a single application, you can wrap it in an Akka supervisor. This lets us handle errors simply by using an UncaughtExceptionHandler.
KafkaStreams API (Java):
public void setUncaughtExceptionHandler(final Thread.UncaughtExceptionHandler eh)
Ex. (Scala, inside the actor that owns the streams instance)
// Send a message to the actor itself to trigger the back-off function of the supervisor.
streams.setUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler {
  override def uncaughtException(t: Thread, e: Throwable): Unit = {
    self ! UncaughtExceptionInStream(e)
  }
})
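The back-off restart behavior can be illustrated without Akka. The following is a JDK-only sketch of the general pattern (restart a failing task, waiting longer after each failure), not Akka's supervisor implementation; the class name, method names and parameters are hypothetical.

```java
import java.util.function.Supplier;

class BackoffRestartSketch {
    // Delay before the next restart: doubles per restart, capped at maxMillis.
    static long backoffMillis(int restartCount, long minMillis, long maxMillis) {
        long delay = minMillis;
        for (int i = 0; i < restartCount && delay < maxMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMillis);
    }

    // Runs the task, restarting it after a failure up to maxRestarts times.
    static <T> T runWithRestarts(Supplier<T> task, int maxRestarts,
                                 long minMillis, long maxMillis) {
        int restarts = 0;
        while (true) {
            try {
                return task.get();
            } catch (RuntimeException failure) {
                if (restarts >= maxRestarts) {
                    throw failure; // give up and escalate
                }
                try {
                    Thread.sleep(backoffMillis(restarts, minMillis, maxMillis));
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted during back-off", interrupted);
                }
                restarts++;
            }
        }
    }
}
```

With `minMillis = 100` and `maxMillis = 1000`, the successive delays are 100, 200, 400, 800 and then 1000 ms for every later restart.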

Tips

Manual Offset Management of Kafka Streams
• Question
• How can we write the result and update the offset information in one sequence (or transaction)?
• Answer
• Manage "offset information" in the output data stores
• Way
• In the case of Kafka Streams, we’ve implemented our own consumer, which writes the result and updates the offset information in the output data store.
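A minimal sketch of keeping the offset next to the result, assuming an in-memory map as a stand-in for the output data store and a `synchronized` method as a stand-in for whatever atomicity that store provides. The class and its methods are hypothetical and do not reproduce the custom consumer described above.

```java
import java.util.HashMap;
import java.util.Map;

class OffsetInStoreSketch {
    // In-memory stand-ins for the output store's data rows and offset rows.
    private final Map<String, String> rows = new HashMap<>();
    private final Map<Integer, Long> lastAppliedOffset = new HashMap<>();

    // Writes the result and records the offset in one step.
    // Records at or below the stored offset are skipped, so replays are safe.
    synchronized boolean applyRecord(int partition, long offset, String rowKey, String value) {
        long last = lastAppliedOffset.getOrDefault(partition, -1L);
        if (offset <= last) {
            return false; // already applied before a restart or rewind
        }
        rows.put(rowKey, value);
        lastAppliedOffset.put(partition, offset);
        return true;
    }

    // Offset the consumer should seek to after a restart.
    synchronized long resumeFrom(int partition) {
        return lastAppliedOffset.getOrDefault(partition, -1L) + 1;
    }

    synchronized String get(String rowKey) {
        return rows.get(rowKey);
    }
}
```

On restart, the application reads `resumeFrom(partition)` from the store and seeks the consumer there, rather than relying on offsets committed to Kafka.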

KafkaClientSupplier
• You can use a KafkaClientSupplier implementation to provide a custom consumer and producer to a KafkaStreams instance.
• However, Kafka Streams is not really designed for such use cases, so it may be painful. You also need to be careful about updating the offset information too frequently.
public KafkaStreams(final TopologyBuilder builder,
                    final StreamsConfig config,
                    final KafkaClientSupplier clientSupplier)
public interface KafkaClientSupplier {
    Producer<byte[], byte[]> getProducer(final Map<String, Object> config);
    Consumer<byte[], byte[]> getConsumer(final Map<String, Object> config);
    Consumer<byte[], byte[]> getRestoreConsumer(final Map<String, Object> config);
}

Parallelism and Ordering
• As mentioned on the "Guaranteeing the Order of Events in Kafka" slides, guaranteeing the order of events in each chat room is important for us.
• The parallelism configuration of each component is important to achieve both high throughput and the ordering guarantee.
• For example, the Kafka Producer can resend messages after some errors, and this may reorder them when you send data in parallel. To prevent it, you can set max.in.flight.requests.per.connection to 1.

max.in.flight.requests.per.connection
• This parameter sets the maximum number of unacknowledged requests the client will send on a single connection before blocking.
Condition checked before sending another request:
queue == null || queue.isEmpty() ||
(queue.peekFirst().send.completed() && queue.size() < this.maxInFlightRequestsPerConnection);
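How a retry can reorder batches is easier to see in a toy simulation. The sketch below is a simplified model, not the Kafka client: batches are sent within a window of `maxInFlight`, one chosen batch fails once and is retried, and the method returns the order in which batches end up in the log. All names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class InFlightOrderingSketch {
    // Returns the order in which batches are appended to the log when
    // `failsOnce` is rejected on its first attempt and then retried.
    static List<String> send(List<String> batches, String failsOnce, int maxInFlight) {
        Deque<String> pending = new ArrayDeque<>(batches);
        Deque<String> inFlight = new ArrayDeque<>();
        List<String> log = new ArrayList<>();
        boolean alreadyFailed = false;
        while (!pending.isEmpty() || !inFlight.isEmpty()) {
            // Fill the in-flight window up to the configured limit.
            while (!pending.isEmpty() && inFlight.size() < maxInFlight) {
                inFlight.addLast(pending.pollFirst());
            }
            // The broker answers the oldest in-flight request.
            String batch = inFlight.pollFirst();
            if (batch.equals(failsOnce) && !alreadyFailed) {
                alreadyFailed = true;
                pending.addFirst(batch); // retry: goes back to be re-sent
            } else {
                log.add(batch);
            }
        }
        return log;
    }
}
```

For batches A, B, C where A fails once, a window of 2 yields the log order B, A, C, because B was already in flight when A was retried. A window of 1 yields A, B, C.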

Summary
• Why and how we built our messaging backend with CQRS + Event Sourcing by Akka, Kafka, HBase
• How Akka, Kafka, HBase fit with the architecture and our use case
• Technical consideration topics to build the architecture
• Several Technical Tips

Thank you!
Any Questions?
