Building Stateful Systems with
Akka Cluster Sharding
Presented By:
Hugh Mckee
Himanshu Gupta
Anjali Sharma
Before we start…
1. Please use the Q&A section to post your questions, and raise your hand after
the webinar to discuss them with the experts.
2. The session is being recorded, and we will share the recording with you afterwards.
3. We will send a follow-up email with all links and downloads soon.
About the Speakers
Hugh Mckee
Developer Advocate at Lightbend
● Speaker and advocate for development of Reactive Cloud Native Systems
Himanshu Gupta
Akka Expert and Sr. Lead Consultant at Knoldus Inc.
● Speaker for Fast Data Systems and Reactive Application Engineer.
Anjali Sharma
Software Consultant at Knoldus Inc.
● Developer Engineer specialised in Scala, Akka and Spark.
Agenda
What is Cluster Sharding?
Understanding Entity/Shard Ids
Sharding Example
What are Stateful Systems?
How to use Stateful Actors?
No more Blocking
Passivation
About Knoldus
Product Engineering for Innovative Organizations
Keeping your business competitive & future-ready with extremely well-engineered systems
through the unwavering pursuit of emerging technology, high-quality engineers,
processes, and practices
Knoldus Practice Areas
● Reactive Products: Microservices & API
● Enterprise Data Program: Data Lake
● Artificial Intelligence: Machine Learning, Data Science, Deep Learning
● Blockchain
● Fast Data
● Agile Transformation
● Reactive UI/UX
● Test Automation Practice
● Reactive DevOps
● Product Engineering
Knoldus Global Presence
10+ Years
Years of Profitable Growth
175+ Engineers
Reactive products, Fast Data strategy, AI
04 Offices
Toronto, Chicago, Singapore, India
20+ Customers
Multi-year Global Customers
About Lightbend
Lightbend empowers organizations to quickly implement any digitally transformative business
strategy—no matter how ambitious, challenging or innovative.
We take care of the architectural hurdles and back-end complexity of building globally distributed,
cloud-native application environments. Lightbend enables development teams with the technology
and expertise required to build applications that support business critical decisions. That’s why Global
2000 enterprises turn to us.
Unleash the full power of the cloud with Lightbend.
What is Cluster Sharding?
Sharding:
● The term sharding means partitioning.
● It is a technique used mostly by databases to improve
their elasticity and resiliency.
What is Cluster Sharding?
Database Sharding:
● Records are distributed across nodes using a shard key
or a partition key.
● A router directs requests to the appropriate
shard or partition.
● Even after sharding, the database may still become a bottleneck.
What is Cluster Sharding?
Akka Cluster Sharding:
● The Akka toolkit provides cluster sharding as a way to
introduce sharding into your application.
● Instead of distributing database records across a
cluster, we distribute actors across the
cluster.
● Each actor then acts as a consistency
boundary for the data that it manages.
Components of Cluster Sharding
Entities:
● The basic unit in Akka Cluster Sharding is an actor
called an entity.
● There is only one entity per entity ID in the cluster.
● Messages are addressed to the entity ID and processed
by the entity. This allows the entity to act as a single
source of truth and as a consistency boundary for
the data that it manages.
Components of Cluster Sharding
Shards:
● Entities are distributed across shards.
● Each shard manages a number of entities and creates
entity actors on demand.
● Each shard has a unique ID. Mapping entities to a
shard ID is how we control the distribution.
Components of Cluster Sharding
Shard Region:
● Shards are distributed across shard regions. Each
shard region contains a number of shards.
● For a type of entity, there is usually one shard region per
JVM.
● A shard region looks up the location of an entity's shard
the first time it needs it, when it does not already know the
location, and then forwards messages to the
appropriate node, region, and entity.
Components of Cluster Sharding
Shard Coordinator:
● The shard coordinator is responsible for managing shards;
it is a cluster singleton.
● It is responsible for ensuring that the system knows
where to send messages addressed to a specific entity.
● It decides which shard lives in which region,
and therefore on which node.
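The lookup flow between a shard region and the coordinator can be sketched with a toy model. This is only an illustration under stated assumptions: `ToyCoordinator` and `ToyRegion` are hypothetical names, not Akka classes, and the "fewest shards" allocation rule is a simplification in the spirit of a least-loaded strategy rather than Akka's actual implementation.

```scala
import scala.collection.mutable

// Toy model only: these classes are illustrative and are not part of Akka.
final class ToyCoordinator(regions: Vector[String]) {
  private val allocations = mutable.Map.empty[String, String]
  private val shardsPerRegion = mutable.Map.empty[String, Int].withDefaultValue(0)

  // Returns the region hosting the shard, allocating it to the region
  // that currently hosts the fewest shards if it is not placed yet.
  def locate(shardId: String): String =
    allocations.getOrElseUpdate(shardId, {
      val target = regions.minBy(r => shardsPerRegion(r))
      shardsPerRegion(target) += 1
      target
    })
}

final class ToyRegion(val name: String, coordinator: ToyCoordinator) {
  private val knownShards = mutable.Map.empty[String, String]
  var coordinatorLookups = 0

  // Asks the coordinator only the first time a shard is seen,
  // then answers from the local cache.
  def route(shardId: String): String =
    knownShards.getOrElseUpdate(shardId, {
      coordinatorLookups += 1
      coordinator.locate(shardId)
    })
}
```

Routing the same shard twice through one `ToyRegion` should contact the coordinator once, which mirrors the "first time when it doesn't already know its location" behaviour described above.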
Understanding Entity ID
● To uniquely identify each entity, entity IDs are used.
● They are used to create the name of the actor and hence must be unique across the entire
cluster.
● Entity ID extractors process each incoming message and separate it into an
entity ID and a message to be passed to the entity actor.
import akka.cluster.sharding.ShardRegion
case class MyMessage(entityID: String, message: String)
// ExtractEntityId is a partial function from a message to (entity ID, payload).
val idExtractor: ShardRegion.ExtractEntityId = {
  case MyMessage(id, message) => (id, message)
}
Understanding Shard ID
val shardIdExtractor: ShardRegion.ExtractShardId = {
case MyMessage(id, _) =>
(Math.abs(id.hashCode % totalShards)).toString
}
● To identify shards, shard IDs are used.
● Each entity is mapped to a shard ID.
● An extractor function processes each incoming message and produces a shard ID.
● Best practice is to aim for roughly 10 shards per node.
● When selecting a shard ID and writing an extractor, it is important to consider how the
shards will be balanced.
● A poor sharding strategy produces hotspots, which result in an uneven workload.
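To check how evenly a hashing scheme spreads entities, the shard ID calculation can be tried out without a cluster. The sketch below reuses the modulo approach from the extractor above; `totalShards = 30` is an assumption (roughly 10 shards for each of 3 nodes), and `ShardIdSketch` and `distribution` are illustrative names.

```scala
object ShardIdSketch {
  // Assumption: a 3-node cluster with about 10 shards per node.
  val totalShards = 30

  // Same idea as the shard ID extractor: hash the entity ID into a fixed range.
  // Taking the remainder first keeps Math.abs away from Int.MinValue.
  def shardIdFor(entityId: String): String =
    Math.abs(entityId.hashCode % totalShards).toString

  // Counts how many entity IDs land in each shard, to spot hotspots.
  def distribution(entityIds: Seq[String]): Map[String, Int] =
    entityIds.groupBy(shardIdFor).map { case (shard, ids) => shard -> ids.size }
}
```

Running `distribution` over a sample of realistic entity IDs gives a quick view of whether some shards receive far more entities than others.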
Sharding Example
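import akka.cluster.sharding.{ClusterSharding, ClusterShardingSettings}
// Starts the shard region for this entity type on the local node.
val shards = ClusterSharding(myActorSystem).start(
  "shardedActors",
  MyShardedActor.props(),
  ClusterShardingSettings(myActorSystem),
  idExtractor,
  shardIdExtractor
)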
● ClusterSharding.start is called on each node that will host shards.
● The block of code above returns an actor ref, which is the reference to
the local shard region.
● To send a message, take the shard region actor ref and send it a message
that the extractors understand.
● Messages always reach the entities through the local shard region.
shards ! MyMessage(entityId, someMessage)
What are Stateful Systems?
First, stateless systems
[Diagram: requests for CART-1234 reach a service that holds temporary hot state; the cold state lives in the database]
1. Retrieve state
2. Change state
3. Save state
4. Forget state
[Diagram: several service instances work on CART-1234 against the same cold state]
Contention handled by the database
Stateful systems
[Diagram: CART-1234 hot state is kept in the service; the cold state lives in the database]
● Retrieve state on 1st access
● Save incremental state changes
[Diagram: a load balancer in front of the nodes, with CART-1234 held on one of them]
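The difference between the stateless and stateful flows above can be contrasted in a small sketch. Everything here is hypothetical: `ColdStore` stands in for the database, and `StatefulCart` for an in-memory entity; it is not Akka code, just an illustration of "retrieve on first access, save incremental changes".

```scala
import scala.collection.mutable

object CartStateSketch {
  // Stand-in for the database holding the cold state; it counts reads for comparison.
  final class ColdStore {
    private val carts = mutable.Map.empty[String, Vector[String]]
    var reads = 0

    def load(cartId: String): Vector[String] = {
      reads += 1
      carts.getOrElse(cartId, Vector.empty)
    }

    def save(cartId: String, items: Vector[String]): Unit =
      carts(cartId) = items

    // Persists only the change instead of the whole cart.
    def append(cartId: String, item: String): Unit =
      carts(cartId) = carts.getOrElse(cartId, Vector.empty) :+ item
  }

  // Stateless: retrieve, change, save, and forget on every request.
  def statelessAddItem(store: ColdStore, cartId: String, item: String): Vector[String] = {
    val items = store.load(cartId) :+ item
    store.save(cartId, items)
    items
  }

  // Stateful: retrieve on first access, then keep the hot state in memory.
  final class StatefulCart(store: ColdStore, cartId: String) {
    private var hot: Option[Vector[String]] = None

    def addItem(item: String): Vector[String] = {
      val updated = hot.getOrElse(store.load(cartId)) :+ item
      store.append(cartId, item)
      hot = Some(updated)
      updated
    }
  }
}
```

With two additions to the same cart, the stateless path reads the store twice, while the stateful path reads it once and afterwards only writes the increments.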
How to use Stateful Actors
Akka Cluster Sharding
https://github.com/mckeeh3/akka-typed-java-cluster-sharding.git
No More Blocking
Why is blocking inside an actor bad?
Blocking inside an actor ties up a thread, which
cannot be used by other actors while it is blocked.
Hence it creates resource contention.
Note: DB operations in an application are generally blocking.
Non-Blocking Requires Extra Care
● The next message can't be processed until the previous message is complete.
● What should we do if a non-blocking operation fails?
Handling Non-Blocking Failures
● We need to be careful when updating data asynchronously in the DB,
because if the update fails, the in-memory state becomes inconsistent with it.
Handling Non-Blocking Failures
● We should fail the actor so that it is restarted.
● On restart, the actor reloads its state from the DB, so the state is
consistent again.
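The recovery rule above can be sketched without any actor machinery. This is a simplified stand-in, not Akka code: `Balance` and `onWriteResult` are hypothetical names, and reloading from the DB here plays the role of the actor restart. In a real actor, one common approach is to pipe the `Future` result back to the actor as a message (`akka.pattern.pipe`) and throw on failure so that supervision restarts it.

```scala
import scala.util.{Failure, Success, Try}

object AsyncUpdateSketch {
  // Hypothetical in-memory state whose source of truth is the DB.
  final class Balance(loadFromDb: () => Int) {
    private var value: Int = loadFromDb()

    def current: Int = value

    // Called with the outcome of the asynchronous DB write.
    def onWriteResult(attempted: Int, result: Try[Unit]): Unit = result match {
      case Success(_) => value = attempted    // DB and memory agree
      case Failure(_) => value = loadFromDb() // stand-in for a restart: reload from the DB
    }
  }
}
```

The in-memory value only moves forward when the write is known to have succeeded; on failure it falls back to whatever the DB holds.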
Passivation
● Keeping the state of all the actors in memory is a huge risk,
as it can fill the memory quickly and cause an OOM (OutOfMemoryError).
● Hence Akka Cluster Sharding provides a way to remove idle actors from
memory, known as passivation.
How does Passivation work?
● Passivation works on a configurable time span.
● For every actor, the time at which it last processed a message is tracked.
● If an actor has not processed a message for the configured time span,
it is removed from the actor system.
How does Passivation work? (contd.)
● Since the actor was removed from memory, all its in-memory state was lost.
● As soon as the actor starts receiving messages again, its state is loaded back
from the DB.
Note: While the actor is not present in memory, its messages are stored in a
buffer. Hence they remain safe.
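The idle-tracking idea behind passivation can be modelled in a few lines. This is a toy model under stated assumptions: `ToyPassivator` is a hypothetical class, time is passed in explicitly as milliseconds, and it does not reflect how Akka implements passivation internally.

```scala
import scala.collection.mutable

// Toy model only: tracks when each entity last processed a message.
final class ToyPassivator(idleTimeoutMillis: Long) {
  private val lastProcessed = mutable.Map.empty[String, Long]

  // Records that the entity processed a message at the given time.
  def recordMessage(entityId: String, nowMillis: Long): Unit =
    lastProcessed(entityId) = nowMillis

  // Removes and returns every entity that has been idle for at least the timeout.
  def passivateIdle(nowMillis: Long): Set[String] = {
    val idle = lastProcessed.collect {
      case (id, last) if nowMillis - last >= idleTimeoutMillis => id
    }.toSet
    lastProcessed --= idle
    idle
  }

  def isActive(entityId: String): Boolean = lastProcessed.contains(entityId)
}
```

An entity that receives a message shortly before the check stays active, while one that has been silent for the whole time span is removed.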
Configuring Passivation
● The passivate-idle-entity-after setting configures how long an entity may stay idle
before it is passivated.
● By default, its value is 120 seconds.
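As a configuration sketch, the setting can be overridden in `application.conf`. The 5-minute value is only an example, and the path shown assumes the classic `akka.cluster.sharding` configuration section; newer Akka versions may organise passivation settings differently, so check the reference configuration of the version in use.

```hocon
akka.cluster.sharding {
  # Example value: passivate entities that have been idle for 5 minutes.
  passivate-idle-entity-after = 5 minutes
}
```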
Demo
Questions?
References
● Reactive Banking Sample Code
● Akka Cluster Sharding - Scala
www.knoldus.com
+(1) 647-467-4396
linkedin.com/company/knoldus
@knolspeak
Thank You!
Stay in Touch
