Maximizing Scale and Throughput using Akka and Aerospike
Tal Obsfeld, Development Team Leader
12/9/2017
Agenda
• A little about Clicktale Core
• Challenges
• Guidelines
• Akka
  • When to Act(or)
  • Pool it together?
  • Throughput vs. Latency
  • Blocking I/O Execution Context
• Aerospike
  • Data
  • Server
  • Client
Confidential
Challenges
• High Availability
• Low Latency
• Peaks
• Data Variance
• Cost
Guidelines
• Data from Client
• Protecting Browser Client
• IO context
• Error Handling
Akka Actors
What works best?
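When to Act(or)
Flow: Read User State -> Decide What to Do -> Write New User State and Write Session Metadata -> Send Response to Client

import scala.concurrent.Future // an implicit ExecutionContext must be in scope

for {
  state <- db.readUserState(userId)
  // Decide before writing: a guard placed after the writes would filter only once they had run
  _ <- if (state == NotRecording)
         Future.sequence(Seq(
           db.writeSessionMetadata(metadata),
           db.writeUserState(Recording)))
       else Future.successful(Nil)
} yield Response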
When to Act(or)
Startup steps (from the slide diagram):
• Read Configuration from Zookeeper
• Connect to Rabbit
• Connect to Aerospike
• Initialize Data Structures
• Bind Web Service
• Read Rules from DB
Candidate approaches:
• Futures
• Actors (FSM)
• Akka Streams (Graph)
When to Act(or)
• Mutable State with concurrent access
• State Machines
• Connections (Reconnections)
• Queues (Rabbit/Kafka)
• Reading and Spreading Configuration
• Throttling
Pool it together?
Option 1: Web Request Handler -> Router Actor -> pool of Business Logic Actors
Option 2: Web Request Handler -> single Business Logic Actor
Which option is better?
Throughput vs. Latency
default-dispatcher {
  fork-join-executor {
    parallelism-min = 2
    parallelism-factor = 1.0
    parallelism-max = 10
  }
  throughput = 300
  throughput-deadline-time = 100
}

throughput   throughput-deadline-time   Requests Per Second   Latency
5            -1                         9,000                 -
300          100                        7,000                 30% Lower
Blocking I/O Execution Context
application.conf:
contexts {
  long-blocking-io {
    executor = "thread-pool-executor"
    thread-pool-executor {
      # 2 for JDBC connections, 1 for config updates, 2 for the number of cores
      fixed-pool-size = 5
    }
  }
}
app.scala:
import scala.concurrent.ExecutionContext

// Look up the dedicated dispatcher and use it for blocking calls only
implicit val longBlockingIoContext: ExecutionContext =
  system.dispatchers.lookup("contexts.long-blocking-io")
Aerospike
Optimizing your DB
Know Your Data
Max message size -> Max Aerospike record size
• Multiple messages per record
• Record updates -> Fragmentation
• Message per record -> Batch read
• Is it possible to reduce data size?
  • Compression
  • Protobuf/Avro
Know Your Server
• write-block-size
  • Larger -> more fragmentation if write size is small
• max-write-cache (pending write blocks)
• post-write-queue
• Defragmentation
  • Update of record causes fragmentation
  • One defrag thread per device
• Hardware - Are all instances created equal?
Know Your Client
• Threading Model
  • Does it use Async I/O?
  • Completion handlers – Thread Pool? Event Loop?
• Settings
  • Retries
  • Timeout
  • Max Async Commands
  • Max/Min thread pool size
• Batch Support
• Keep track of New Features
Questions?

Scale and Throughput @ Clicktale with Akka

Editor's notes

  1. Challenges:
     • High Availability: any HTTP 4xx or 5xx is visible to our customers' monitoring tools running in the browser.
     • Low Latency: a response from the server is required to start the recording process without delay. In addition, any latency on a browser request to the server is perceived by our customers' IT teams as affecting user experience.
     • Peaks: when a customer launches a web campaign, traffic can spike suddenly, sometimes doubling the current rate.
     • Data Variance: different web pages, users, events, and page view durations.
     • Cost: reduce cost where possible, for both machines and storage.
  2. Guidelines we adhere to in order to meet the challenges:
     • Data: aggregate data before sending it to the server (but not too much); reduce data size in the client as much as possible, e.g. with compression; chunk data to respect the max request size.
     • Protect the client: release the client as soon as possible, even before saving data to the DB; set client timeouts at both the HTTP level (Spray) and the actor level (ask pattern timeout); use try/catch and actor supervision all the way.
     • IO: IO must never interfere with client requests; do everything async and as much as possible in parallel; use a different execution context than the actor system's.
     • Error handling: supervision on all actors; try/catch around the request handling flow.
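The "release the client as soon as possible, even before saving data to the DB" guideline can be sketched with plain Scala Futures. This is a minimal illustration, not the service's actual code: `RespondFirstSketch`, `handle`, the `save` function and the `"accepted"` response are all hypothetical names.

```scala
import scala.concurrent.{ExecutionContext, Future}
import scala.util.{Failure, Success}

object RespondFirstSketch {
  // Hypothetical handler: starts the async save, then answers the client
  // without waiting for the save to finish.
  def handle(payload: String, save: String => Future[Unit])
            (implicit ec: ExecutionContext): String = {
    save(payload).onComplete {
      case Success(_)  => ()
      // Stand-in for real error handling (logging, metrics, retry).
      case Failure(ex) => println(s"persist failed: ${ex.getMessage}")
    }
    "accepted"
  }
}
```

The trade-off is that the client is told "accepted" before the write is durable, so failures have to be handled on the server side rather than reported back.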
  3. Actors are very good at handling mutable state with concurrent access, but what about the flow shown here? It is a very common flow consisting of several async I/O operations (marked in green). Is an actor-based solution suitable? Actors are not ideal for a sequence of I/O operations with decision logic in between: passing messages to and from the actor and handling them requires a lot of boilerplate code. In this case, Futures give a simpler, more elegant solution; the slide shows a possible simplified version.
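For readers who want to run the read/decide/write flow, here is a self-contained sketch using only the Scala standard library. The in-memory `InMemoryDb`, the `handle` function and the response strings are assumptions made for illustration; the talk does not show the real database client.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object RecordingFlowSketch {
  sealed trait UserState
  case object NotRecording extends UserState
  case object Recording extends UserState

  // Hypothetical in-memory stand-in for the async database client.
  final class InMemoryDb()(implicit ec: ExecutionContext) {
    private val states = scala.collection.concurrent.TrieMap.empty[String, UserState]
    private val sessions = scala.collection.concurrent.TrieMap.empty[String, String]

    def readUserState(userId: String): Future[UserState] =
      Future(states.getOrElse(userId, NotRecording))
    def writeUserState(userId: String, state: UserState): Future[Unit] =
      Future { states.update(userId, state) }
    def writeSessionMetadata(userId: String, metadata: String): Future[Unit] =
      Future { sessions.update(userId, metadata) }
    def sessionCount: Int = sessions.size
  }

  // Read, decide, then run both writes concurrently only when needed.
  def handle(db: InMemoryDb, userId: String, metadata: String)
            (implicit ec: ExecutionContext): Future[String] =
    db.readUserState(userId).flatMap {
      case NotRecording =>
        val writes = Seq(
          db.writeSessionMetadata(userId, metadata),
          db.writeUserState(userId, Recording))
        Future.sequence(writes).map(_ => "start-recording")
      case Recording =>
        Future.successful("already-recording")
    }
}
```

Both writes are started before `Future.sequence` is called, so they run concurrently, and the decision is made before any write is issued.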
  4. Another example. We first read the configuration from Zookeeper, which stores the connection strings for the SQL DB, Rabbit and Aerospike. Only then can we initialize everything and bind the service to listen for incoming requests. Again, the I/O operations are async and marked in green. What is the most suitable solution here? This time the answer is less clear. It can be done with Futures, but Actors with Finite State Machines might also work, since this is a state machine. Another interesting approach is Akka Streams with the Graph API: we could see this flow as a stream that handles a single "init" message.
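Of the three options, the Futures variant is the easiest to sketch without extra libraries. The version below is only an illustration under stated assumptions: `Config`, `Service`, the stub functions and all connection strings are hypothetical placeholders, not the service's real startup code.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object StartupSketch {
  // Hypothetical shapes; the real configuration and clients are not shown in the talk.
  final case class Config(rabbitUri: String, aerospikeHost: String, sqlUrl: String)
  final case class Service(rabbit: String, aerospike: String, rules: List[String])

  def readConfig()(implicit ec: ExecutionContext): Future[Config] =
    Future(Config("amqp://example", "aerospike.example", "jdbc:example"))
  def connectRabbit(c: Config)(implicit ec: ExecutionContext): Future[String] =
    Future(s"rabbit@${c.rabbitUri}")
  def connectAerospike(c: Config)(implicit ec: ExecutionContext): Future[String] =
    Future(s"aerospike@${c.aerospikeHost}")
  def readRules(c: Config)(implicit ec: ExecutionContext): Future[List[String]] =
    Future(List(s"rule-1 from ${c.sqlUrl}", s"rule-2 from ${c.sqlUrl}"))

  def start()(implicit ec: ExecutionContext): Future[Service] =
    for {
      config <- readConfig()
      // Create the futures before awaiting any of them so they run concurrently.
      rabbitF = connectRabbit(config)
      aerospikeF = connectAerospike(config)
      rulesF = readRules(config)
      rabbit <- rabbitF
      aerospike <- aerospikeF
      rules <- rulesF
    } yield Service(rabbit, aerospike, rules) // binding the web service would follow here
}
```

Everything that depends on the configuration starts only after `readConfig()` completes, while the three dependent steps overlap with each other.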
  5. So when is an Actor clearly our go-to solution?
     • State machines, especially with FSM.
     • Wrapping a connection: the actor can manage reconnection (again, a state machine) and buffer requests during disconnection (using stash).
     • Queues: an actor's mailbox is itself a queue, and the connection handling described above is easy to implement.
     • Process configuration and online configuration updates: the actor encapsulates the mutable state (the configuration) and uses the actor system's event stream to notify others of changes, with the current configuration published as an immutable object.
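The connection case can be modeled as a small state machine. The sketch below is plain Scala rather than Akka code: it only models the behavior that an actor using stash would encapsulate, and `Conn`, `send`, `onConnected` and `onDisconnected` are hypothetical names.

```scala
object ReconnectBufferSketch {
  sealed trait State
  case object Disconnected extends State
  case object Connected extends State

  // Immutable model: current state, requests buffered while offline, requests delivered.
  final case class Conn(state: State, buffered: Vector[String], delivered: Vector[String]) {
    // While disconnected, requests are buffered instead of being dropped.
    def send(msg: String): Conn = state match {
      case Connected    => copy(delivered = delivered :+ msg)
      case Disconnected => copy(buffered = buffered :+ msg)
    }
    // On reconnect, replay the buffered requests in arrival order.
    def onConnected: Conn = Conn(Connected, Vector.empty, delivered ++ buffered)
    def onDisconnected: Conn = copy(state = Disconnected)
  }

  val initial: Conn = Conn(Disconnected, Vector.empty, Vector.empty)
}
```

In an actor, the `Conn` value would be private state and each method a message handler, so concurrent senders never touch the state directly.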
  6. Does using a pool of actors actually improve throughput? As always, it depends on your process and what you are doing. Ours is an I/O-intensive service, but profiling it under load showed a direct relation between I/O and CPU consumption. CPU time was mostly spent on:
     • Processing and parsing HTTP requests (Spray)
     • Aerospike client selector threads
     • IO completion
     So if the CPU is already working hard and we are running on a quad-core machine, is there any benefit to a pool of actors? Our performance tests show no significant difference between the two options.
  7. The higher the throughput value (default 5) and the throughput-deadline-time (default is negative, meaning no deadline), the better the throughput, since the dispatcher threads switch between actors less often. The cost is possible starvation of other actors.
     It depends on your actors: if one main actor handles all the traffic and the others only run periodically, higher values increase throughput at the expense of those other actors.
     Too high a throughput value can add latency to responses, so if we need to cap the maximum response time we balance it with throughput-deadline-time.
     Thread pool settings: there is no right or wrong here. Take into account the number of cores, but also the threads outside the fork-join-executor that use CPU. In our service, for example, the Aerospike thread pool handles async command completion, and there are the async selector threads (usually as many as there are cores). These threads compete with the dispatcher, so you must test.
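As a worked example of the thread pool settings, Akka's reference configuration describes the fork-join pool size as ceil(available processors * parallelism-factor), bounded by parallelism-min and parallelism-max. The helper below simply restates that formula; `forkJoinParallelism` is a hypothetical name, not an Akka API.

```scala
object DispatcherSizingSketch {
  // ceil(cores * factor), clamped to the range [min, max].
  def forkJoinParallelism(cores: Int, factor: Double, min: Int, max: Int): Int =
    math.min(math.max(math.ceil(cores * factor).toInt, min), max)
}
```

With the slide's settings (min 2, factor 1.0, max 10) on a quad-core machine this gives 4 dispatcher threads, which then share the CPU with the Aerospike completion and selector threads mentioned above.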
  8. A dedicated thread pool execution context for blocking IO is always a good idea. It also neutralizes the effect of blocking IO on the CPU-bound threads, which helps when testing for the best CPU thread pool settings.
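The same idea can be tried without an actor system. This sketch swaps the Akka dispatcher lookup for a plain `java.util.concurrent` fixed pool of 5 threads (mirroring `fixed-pool-size = 5`); `BlockingIoSketch` and `loadRules` are hypothetical names, and the sleep stands in for a blocking call such as JDBC.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object BlockingIoSketch {
  // Dedicated pool, kept separate from whatever runs the request handling.
  private val pool = Executors.newFixedThreadPool(5)
  val blockingIo: ExecutionContext = ExecutionContext.fromExecutorService(pool)

  // Hypothetical blocking call; it reports which thread ran it.
  def loadRules(): Future[String] =
    Future {
      Thread.sleep(50) // simulate blocking I/O
      Thread.currentThread().getName
    }(blockingIo)

  def shutdown(): Unit = pool.shutdown()
}
```

Passing the context explicitly, rather than importing it as an implicit, makes it harder to run blocking work on the wrong pool by accident.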
  9. The max size of your data determines the configuration Aerospike requires. It also affects how you write data to Aerospike: do you keep multiple messages in a list/map bin, or do you need to split a message across several records? There is a tradeoff: spreading a message over multiple records means multiple writes and multiple reads, while writing to the same record several times may cause fragmentation. Reduce data size as much as possible for higher throughput, both in the server handling requests and in Aerospike. If data might be resent, you need duplicate detection: you could rely on a key-already-exists check or, if using CDTs, use a map instead of a list.
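Why a map helps with resent data can be shown with ordinary Scala collections. This is a model of list versus map semantics only, not Aerospike client code; `Message`, `appendToList` and `putInMap` are hypothetical names, and it assumes each message carries a stable id.

```scala
object DuplicateDetectionSketch {
  final case class Message(id: String, body: String)

  // List semantics: a resent message is appended a second time.
  def appendToList(bin: Vector[Message], m: Message): Vector[Message] = bin :+ m

  // Map semantics: a resent message overwrites its own entry, so the write is idempotent.
  def putInMap(bin: Map[String, Message], m: Message): Map[String, Message] =
    bin.updated(m.id, m)
}
```

Sending the same message twice leaves two entries in the list but only one in the map, so the map needs no separate read-before-write check.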
  10. Things you find out by stress-testing the DB:
     • Block size affects write size. It depends on the data: for small data, a smaller block size is better. Tenfold scale difference.
     • Write queues: do you need to increase the default threshold?
     • Defragmentation: updating existing records increases fragmentation (a new block is allocated and the record is copied on each update) and overloads the defrag thread. How can the defrag thread be helped? A cluster with more, smaller instances is better than a cluster with fewer, stronger ones.
     • Machine scope: partitions (thread per partition); direct-attached SSD disks. Not all instances are created equal: run the Aerospike Certification Tool (ACT), which measures SSD write and read times, and select the instances with the best SSDs.