PERFORMANCE
IMPROVEMENTS REPORT
Goal was:
Deliver short term tactical performance
improvements
● Fix common performance bottlenecks
● Introduce incremental architectural improvements
based on Proof-of-Concept
● => No business logic or radical architecture changes
● => Update software to incorporate modern
programming and architectural standards
Where did we start?
Version 1.9.5 running 8 PG and 8 L
This version cannot run 20 PG and 15 L
Mean latency: 70+ms (with long tail)
% Messages over 100ms: 49%
20 Currency pairs meant 40 processes running
JVM overload
Inefficient processor utilization – only 40%
Message latency characteristics
● Sub 10ms: 9.64%
● Sub 20ms: 31.69%
● Sub 40ms: 64.17%
● Sub 80ms: 85.34%
Long tail on distribution
Where are we now?
Version PERF running 20 PG and 15 L
Mean latency: 16ms
% Messages over 100ms: 0.029%
20 Currency pairs means 20 processes running
JVM not taxed
All processors utilized: 80%
Message latency characteristics
● Sub 10ms: 15.75%
● Sub 20ms: 77.53%
● Sub 40ms: 96.48%
● Sub 80ms: 99.89%
Compare to http://www.lmax.com/execution-performance (can’t
guarantee latency 100%)
How did we test?
Instrumented code with FIX Protocol Inter-Party Latency
LMPs, and recorded timing info
Ran a simulated price feed with constant and live-like rates
19 currency pairs
20 price groups (PG) and 15 layers (L) per pair for PERF
branch
8 PG and 8 L for 1.9.5.56 branch
100 spot updates/sec
20 fwd updates/sec
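The "Sub 10ms / Sub 20ms / ..." figures reported earlier can be derived from the recorded timing info with a simple cumulative count. The following is a minimal sketch, not the code used for this report; the class name LatencyBuckets and the method percentBelow are hypothetical, and latencies are assumed to be recorded in whole milliseconds.

```java
final class LatencyBuckets {
    private LatencyBuckets() {}

    /**
     * For each threshold, returns the percentage of samples whose latency
     * is strictly below that threshold (both in milliseconds).
     */
    static double[] percentBelow(long[] latenciesMs, long[] thresholdsMs) {
        double[] result = new double[thresholdsMs.length];
        if (latenciesMs.length == 0) {
            return result; // no samples: report 0% everywhere
        }
        for (int i = 0; i < thresholdsMs.length; i++) {
            int below = 0;
            for (long latency : latenciesMs) {
                if (latency < thresholdsMs[i]) {
                    below++;
                }
            }
            result[i] = 100.0 * below / latenciesMs.length;
        }
        return result;
    }
}
```

Calling percentBelow with thresholds {10, 20, 40, 80} gives the four cumulative buckets used in this report; adding a threshold of 100 and subtracting the result from 100 gives the "% Messages over 100ms" figure.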
[Chart: Latency distribution (ms)]
[Chart: Distribution of throughput (msgs/s)]
Performance Improvements
Common Improvements
● Eliminate sources of latency common to many
applications
● While some may have seemed trivial, they had
significant impact
Improvements based on the PoC
● Apply PoC architecture principles in key areas where
latency was measured
● Only tactical changes, not strategic
● Required careful measurements: bottlenecks turned out
to be in different places than previously thought
Common Performance Bottlenecks:
Price Object Marshalling
Replaced object marshalling
● Significant source of latency, large message sizes, and
garbage (object) creation
● Serialize-Deserialize cycle was performed at least three
times for every price
● Previously based on JDK serialization, replaced with
custom code
● Removed one cycle (more on that later)
Optimized the Price Object data structure
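To illustrate what replacing JDK serialization with custom code can look like, here is a minimal sketch of a fixed-layout binary codec. It is not the actual Price Object format: the Price fields (pair, bidMicros, askMicros, timestampNanos) and the class name PriceCodec are assumptions made for the example. Writing primitives directly into a ByteBuffer avoids the class metadata that ObjectOutputStream adds to each stream, which is one reason such encodings tend to be smaller and create less garbage.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class PriceCodec {
    private PriceCodec() {}

    /** Hypothetical price record; the real Price Object is not shown in the slides. */
    static final class Price {
        final String pair;         // e.g. "EUR/USD", assumed ASCII and under 128 bytes
        final long bidMicros;      // bid scaled to an integer (assumption)
        final long askMicros;      // ask scaled to an integer (assumption)
        final long timestampNanos;

        Price(String pair, long bidMicros, long askMicros, long timestampNanos) {
            this.pair = pair;
            this.bidMicros = bidMicros;
            this.askMicros = askMicros;
            this.timestampNanos = timestampNanos;
        }
    }

    /** Layout: 1 length byte, pair bytes, then three 8-byte longs. */
    static void encode(Price p, ByteBuffer out) {
        byte[] pair = p.pair.getBytes(StandardCharsets.US_ASCII);
        out.put((byte) pair.length);
        out.put(pair);
        out.putLong(p.bidMicros);
        out.putLong(p.askMicros);
        out.putLong(p.timestampNanos);
    }

    static Price decode(ByteBuffer in) {
        byte[] pair = new byte[in.get()];
        in.get(pair);
        // Java evaluates arguments left to right, so the longs are read in layout order.
        return new Price(new String(pair, StandardCharsets.US_ASCII),
                in.getLong(), in.getLong(), in.getLong());
    }
}
```

With this layout a price for "EUR/USD" occupies 32 bytes (1 + 7 + 24), and the same ByteBuffer can be cleared and reused for every message.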
Common Performance Bottlenecks:
Logging
Price Engine logging levels were insane
● INFO level logging was performed over 1,500 times
● Most INFO level logging was redundant
Significant performance bottlenecks
● Disk writes, thread contention, object creation (GC)
● Logs could grow to GB size in minutes
Removed all but necessary logging
● Logs will need further work short term...
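One common way to keep diagnostic logging without paying for it on the hot path is to log at a lower level and defer building the message. The sketch below uses java.util.logging purely for illustration; the slides do not say which logging framework the Price Engine uses, and the logger name "priceengine" and the class PriceLogging are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class PriceLogging {
    private static final Logger LOG = Logger.getLogger("priceengine"); // hypothetical name

    /** Counts how many log messages were actually built (for demonstration only). */
    static int messagesBuilt = 0;

    private PriceLogging() {}

    /** Switches per-price diagnostics on or off at runtime. */
    static void setVerbose(boolean verbose) {
        LOG.setLevel(verbose ? Level.FINE : Level.INFO);
    }

    static void onPrice(String pair, double bid, double ask) {
        // The Supplier overload only runs the lambda when FINE is enabled,
        // so no string concatenation or garbage occurs in normal operation.
        LOG.fine(() -> {
            messagesBuilt++;
            return "price " + pair + " bid=" + bid + " ask=" + ask;
        });
    }
}
```

With the level at INFO, onPrice does no formatting work at all; after setVerbose(true) each call builds one message.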
Code Review and Optimization
All PE code was reviewed for efficiency
● Re-work (tactical) but not re-write (strategic)
Improvements
● Timer scheduling replaced with a more efficient
approach
● Replaced synchronization locks with CAS operations where
possible to reduce contention
● Replaced inefficient cache access
● Numerous code tweaks
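As an illustration of swapping a lock for a CAS operation, the sketch below tracks a running maximum with AtomicLong.compareAndSet instead of a synchronized block. The class MaxLatencyTracker is a hypothetical example, not code from the Price Engine.

```java
import java.util.concurrent.atomic.AtomicLong;

final class MaxLatencyTracker {
    private final AtomicLong maxNanos = new AtomicLong(Long.MIN_VALUE);

    /** Records a sample; safe to call from many threads without locking. */
    void record(long nanos) {
        long current;
        do {
            current = maxNanos.get();
            if (nanos <= current) {
                return; // nothing to update, no write and no contention
            }
            // Retry only if another thread changed the value in between.
        } while (!maxNanos.compareAndSet(current, nanos));
    }

    long max() {
        return maxNanos.get();
    }
}
```

Threads never block here: a failed compareAndSet simply re-reads the value and tries again, so no thread is parked waiting for a monitor.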
PoC Architectural Principles
Only distribute components when absolutely
necessary
● Challenge the myth that distributed components improve
throughput and latency
Parallelism (threads) may dramatically slow a
system down
● Contrary to old conventional wisdom
● Mechanical Sympathy has challenged this assumption
● Data contention and context switching often lead to data
duplication and GC
● A lot can be done in a single thread
Reduced Parallelism
Significant contention was eliminated in the Broadcast module
Excessive use of “in memory” producer/consumer queues
● Price objects put on queues for margining, forward calculation, and plugin delivery
● Multiple worker consumer threads pull from those queues and process prices
Queues written using synchronization primitives
● Very inefficient
● Contention between producers and consumers (put and take operations)
● Large number of worker threads led to context switching
Queues were replaced with a highly efficient lock-free buffer
● Uses CAS operations instead of synchronization to dramatically reduce contention
● Only one consumer thread to reduce context switching
We attempted to eliminate buffers and queues altogether
● Make processing synchronous (and therefore remove contention)
● Turned out to be higher latency than using the lock-free buffers
– Likely because business logic is not optimised
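The slides do not show the lock-free buffer itself. As a rough sketch of the idea, the class below is a bounded ring buffer for many producers and a single consumer, using per-slot sequence numbers in the style of Dmitry Vyukov's bounded queue. The name MpscRingBuffer is hypothetical, and this is an assumption about the general technique, not the implementation used in the Broadcast module.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicLongArray;

final class MpscRingBuffer<T> {
    private final Object[] slots;
    private final AtomicLongArray sequences; // per-slot ticket: who may use the slot next
    private final int mask;
    private final AtomicLong tail = new AtomicLong(); // next position producers claim
    private long head;                                // touched by the consumer thread only

    MpscRingBuffer(int capacity) {
        if (capacity < 2 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two >= 2");
        }
        slots = new Object[capacity];
        sequences = new AtomicLongArray(capacity);
        for (int i = 0; i < capacity; i++) {
            sequences.set(i, i);
        }
        mask = capacity - 1;
    }

    /** Called by any producer thread. Returns false if the buffer is full. */
    boolean offer(T value) {
        while (true) {
            long pos = tail.get();
            int idx = (int) (pos & mask);
            long diff = sequences.get(idx) - pos;
            if (diff == 0) {
                // Slot is free for this position: claim it with a single CAS.
                if (tail.compareAndSet(pos, pos + 1)) {
                    slots[idx] = value;
                    sequences.set(idx, pos + 1); // volatile write publishes the value
                    return true;
                }
            } else if (diff < 0) {
                return false; // consumer has not freed this slot yet: full
            }
            // Otherwise another producer won the race; reload tail and retry.
        }
    }

    /** Called by the single consumer thread. Returns null if the buffer is empty. */
    @SuppressWarnings("unchecked")
    T poll() {
        int idx = (int) (head & mask);
        if (sequences.get(idx) != head + 1) {
            return null; // nothing published at the head position yet
        }
        T value = (T) slots[idx];
        slots[idx] = null;
        sequences.set(idx, head + mask + 1); // hand the slot back to producers
        head++;
        return value;
    }
}
```

Producers contend only on one compareAndSet of tail, and the single consumer needs no CAS at all, which matches the "only one consumer thread" design described above.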
Reduced Distribution
Everyone thought the bottleneck was Broadcast
● It turned out that bottlenecks existed in Broadcast, but there were
other equally significant sources of latency...
...Validator and TW
● One Validator process and one TW process per currency pair --> 20
currency pairs = 40 processes!
● Context switching, JMS latency, serialization overhead
Combined Validator and TW into a single process
● Halved the number of processes
● Removed one serialization cycle
● Greatly simplified system management

BAXTER phase 1b
