This document covers best practices for large-scale web development in Java. It presents a typical web architecture with a load balancer and backend services, then discusses handling slow, failed, or overloaded backend requests with techniques such as timeouts, circuit breakers, and parallel requests. It also covers performance optimization through caching with Memcached, monitoring with JMX, and logging for troubleshooting, and shows how Java concurrency utilities like ExecutorService, Future, and CountDownLatch implement parallel and asynchronous operations.
3. Typical Web Architecture
Diagram: a Load Balancer distributes traffic across several Application Instances, each of which calls remote Backends A–E. Backends may be slow, fast, highly available or not.
4. Facing the network’s reality
• Some requests will be slow
Overloaded server or proxy, network congestion, ...
• Some requests won’t answer
Server application bugs, GC pauses, rejected connections, ...
• Some requests will just fail
Server failure, network failure, OS and JVM pressure
5. Handling the network’s reality
• Timeout must be set and handled for every remote request
If the API doesn't offer one, ExecutorService and Future can help (see the sketch after this list)
• Use Circuit Breaker pattern
Avoid requesting an already overloaded service
• Setting a deadline for your answer may be helpful
Whatever happens, the answer is sent as-is within N seconds.
If mandatory goals aren't achieved, return an error.
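A minimal sketch of that Future-based timeout, assuming a blocking client call; the fetchFromBackend method, pool size, and fallback below are made up for illustration:

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeoutWrapper {
    private final ExecutorService pool = Executors.newFixedThreadPool(10);

    // Wrap a blocking call that offers no timeout parameter of its own.
    public String fetchWithTimeout(String key, long timeoutMs) {
        Future<String> future = pool.submit(() -> fetchFromBackend(key));
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker, don't wait any longer
            return null;         // deadline reached: answer with a fallback
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        }
    }

    // Hypothetical blocking backend call without a timeout option.
    private String fetchFromBackend(String key) {
        return "value for " + key;
    }
}
```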
6. Requests make the load
• Two requests instead of one doubles the load on the backend
Here, counting requests isn’t about optimization, it’s critical
• Cache must be sized with care
Cache misses increase the load on the backends
7. Make requests in parallel
• Parallel requests reduce overall duration
Mandatory when backends are slow
• Thread pools make this easy
ExecutorService and Future do the job (see the sketch after this list)
• Thread pools also act as a throttle to shield a backend
No more than N concurrent requests
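A minimal sketch of both points, with a made-up pool size and a fake callBackend standing in for the remote call: the pool runs the requests in parallel and, being bounded, never lets more than 4 concurrent calls reach the backend.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class ParallelFetch {
    // Bounded pool: the requests run in parallel, and the backend never
    // sees more than 4 concurrent calls from this instance.
    private static final ExecutorService POOL = Executors.newFixedThreadPool(4);

    public static void main(String[] args) throws Exception {
        List<Future<String>> futures = new ArrayList<>();
        for (String backend : Arrays.asList("A", "B", "C", "D", "E")) {
            futures.add(POOL.submit(() -> callBackend(backend)));
        }
        for (Future<String> f : futures) {
            System.out.println(f.get(2, TimeUnit.SECONDS)); // per-result timeout
        }
        POOL.shutdown();
    }

    // Stand-in for a real remote call.
    private static String callBackend(String name) throws InterruptedException {
        Thread.sleep(100);
        return "response from backend " + name;
    }
}
```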
8. Make requests in parallel
Sequential: D = sum of the request durations
Parallel: D = max of the request durations
E.g. three 100 ms calls take ~300 ms sequentially, but only ~100 ms in parallel.
9. From separate thread pools to semaphores
• When you have a thousand threads, merging thread pools can help
Share resources instead of duplicating them
• A semaphore can then do the throttling job
Limit the concurrent users of a resource
• A Semaphore's permit count can even be adjusted at runtime!
Allowing you to slow down a stream of requests to a dying backend (see the sketch below)
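A minimal sketch of semaphore throttling, with a made-up permit count and wait time; tryAcquire rejects callers quickly instead of piling them onto a saturated backend:

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class BackendThrottle {
    // Permits = max concurrent callers of the backend; the value is an assumption.
    private final Semaphore permits = new Semaphore(20);

    public String call(String request) throws InterruptedException {
        // Fail fast instead of queueing callers onto a saturated backend.
        if (!permits.tryAcquire(50, TimeUnit.MILLISECONDS)) {
            throw new IllegalStateException("backend overloaded, request rejected");
        }
        try {
            return doRemoteCall(request);
        } finally {
            permits.release();
        }
    }

    // Stand-in for the real remote call.
    private String doRemoteCall(String request) {
        return "response for " + request;
    }
}
```

For live tuning, release(n) adds permits at runtime; shrinking the limit means acquiring permits without releasing them, or subclassing Semaphore to expose its protected reducePermits method.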
10. Serialized caches in Java heap
• Garbage Collector tuning can be time-consuming
Especially when the production environment is hard to simulate
• Serializing data structures inside Java heap caches reduces pressure on the GC
GC cost partly depends on the number of live references
• Don't use Java's standard serialization; use Avro, Kryo, or Protocol Buffers
Very low CPU overhead and a far more compact format (see the sketch below)
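A minimal sketch of the idea, with a hypothetical Serializer interface standing in for Avro, Kryo, or Protocol Buffers; the GC then traces one byte[] per entry instead of a whole object graph:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical pluggable serializer; in practice backed by Avro, Kryo,
// or Protocol Buffers rather than Java standard serialization.
interface Serializer<V> {
    byte[] toBytes(V value);
    V fromBytes(byte[] bytes);
}

public class SerializedHeapCache<K, V> {
    // One byte[] per entry: far fewer references for the GC to trace
    // than a cached graph of live objects.
    private final ConcurrentMap<K, byte[]> store = new ConcurrentHashMap<>();
    private final Serializer<V> serializer;

    public SerializedHeapCache(Serializer<V> serializer) {
        this.serializer = serializer;
    }

    public void put(K key, V value) {
        store.put(key, serializer.toBytes(value));
    }

    public V get(K key) {
        byte[] bytes = store.get(key);
        return bytes == null ? null : serializer.fromBytes(bytes);
    }
}
```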
11. Memcached instead of Java heap cache
• Memcached is a simple and efficient Unix daemon
Only two parameters to set: memory size and listening port
• Several Java clients available
All based on NIO!
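A minimal sketch using spymemcached, one of those NIO-based Java clients; the host, port, key, and expiry below are placeholders:

```java
import java.net.InetSocketAddress;
import java.util.concurrent.TimeUnit;
import net.spy.memcached.MemcachedClient;

public class MemcachedExample {
    public static void main(String[] args) throws Exception {
        // Single memcached instance on its default port; adjust to your deployment.
        MemcachedClient client =
                new MemcachedClient(new InetSocketAddress("localhost", 11211));

        client.set("user:42", 3600, "serialized-profile"); // expires in 1 hour

        // asyncGet returns a Future, so the read can be bounded
        // like any other remote request.
        Object value = client.asyncGet("user:42").get(100, TimeUnit.MILLISECONDS);
        System.out.println(value);

        client.shutdown();
    }
}
```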
12. Partitioned Memcached
Diagram: each Application instance's Memcached Client sends R/W requests to one of several memcached instances, chosen by hashing the key.
13. Monitor everything
• A JMX attribute only costs an AtomicInteger and is priceless
AtomicInteger is lock-free, so it adds no synchronization cost
• Spring JMX offers efficient annotations
@ManagedResource, @ManagedAttribute
• Hyperic can do the aggregating job
But it is painful to configure and use; SpringSource promises to make it better!
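A minimal sketch combining the two points, assuming Spring's MBeanExporter is configured to export annotated beans; the object name and counter are made up:

```java
import java.util.concurrent.atomic.AtomicInteger;
import org.springframework.jmx.export.annotation.ManagedAttribute;
import org.springframework.jmx.export.annotation.ManagedResource;
import org.springframework.stereotype.Component;

// Exported over JMX by Spring's MBeanExporter; the object name is made up.
@Component
@ManagedResource(objectName = "app:name=BackendStats")
public class BackendStats {
    // Lock-free counter: negligible cost on the request path.
    private final AtomicInteger requestCount = new AtomicInteger();

    public void recordRequest() {
        requestCount.incrementAndGet();
    }

    @ManagedAttribute(description = "Total backend requests since startup")
    public int getRequestCount() {
        return requestCount.get();
    }
}
```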
14. Logging with care
• With high traffic, strange things happen
Synchronization issues, connection losses, weird requests, ...
• These strange things may be hard to reproduce in a development environment
Production environment’s behavior can’t be fully simulated
• Logs are the only way to track them
You’ll have a lot of logs to store, but it’s ok
16. What can be done with java.util.concurrent?
• Parallel invocations, with or without dependencies between requests
ExecutorService with Future will do the job
• Making synchronous and asynchronous code work together
CountDownLatch, custom Future implementations, ... (see the sketch after this list)
• Blocking IO code in pooled threads mixed with NIO code
Wrapping Future, CountDownLatch, NIO callbacks, ...
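A minimal sketch of the CountDownLatch approach, with a hypothetical callback interface standing in for a real NIO client:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

public class AsyncBridge {
    // Hypothetical callback interface of some asynchronous NIO client.
    interface Callback {
        void onResult(String result);
    }

    // Block the calling thread until the async callback fires, or time out.
    public static String awaitResult(Consumer<Callback> asyncCall, long timeoutMs)
            throws InterruptedException {
        CountDownLatch done = new CountDownLatch(1);
        AtomicReference<String> result = new AtomicReference<>();

        asyncCall.accept(r -> {    // runs later, on the client's NIO thread
            result.set(r);
            done.countDown();      // wakes up the waiting thread
        });

        if (!done.await(timeoutMs, TimeUnit.MILLISECONDS)) {
            return null; // treat as a timed-out request
        }
        return result.get();
    }
}
```

A caller wraps the async operation, e.g. awaitResult(cb -> client.get(key, cb), 200) for a hypothetical client.get(key, callback) API.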
17–19. Basic Parallel Requests
Sequence diagram, built up across three slides: the servlet thread calls executorService.submit() for each request; pooled Threads A and B each run Callable.call() concurrently; the servlet thread then blocks on future.get() to collect each result.
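This diagram matches the sketch after slide 7. An alternative shape of the same pattern, shown here with made-up durations and pool size, is ExecutorService.invokeAll, which applies one overall deadline to the whole fan-out:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class InvokeAllExample {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);

        List<Callable<String>> calls = Arrays.asList(
                () -> slowCall("backend A", 100),
                () -> slowCall("backend B", 150));

        // One overall deadline for the whole fan-out: tasks still running
        // when it expires are cancelled.
        List<Future<String>> futures = pool.invokeAll(calls, 500, TimeUnit.MILLISECONDS);
        for (Future<String> f : futures) {
            System.out.println(f.isCancelled() ? "timed out" : safeGet(f));
        }
        pool.shutdown();
    }

    private static String slowCall(String name, long ms) throws InterruptedException {
        Thread.sleep(ms);
        return "response from " + name;
    }

    private static String safeGet(Future<String> f) {
        try {
            return f.get();
        } catch (Exception e) {
            return "failed: " + e;
        }
    }
}
```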
20–23. Thread Pooled Requests + Memcached NIO Client
Sequence diagram, built up across four slides: the servlet thread calls invoke() on a custom ExecutorService; the Memcached get() runs on the client's NIO thread; inside the read callback, the NIO thread submit()s the follow-up work; pooled Threads A and B run call(); the servlet thread finally blocks on future.get() for each result.
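A minimal sketch of that flow; the AsyncCache interface is hypothetical, and java.util.concurrent's CompletableFuture stands in here for the custom Future implementation the slides mention:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

public class CacheThenCompute {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    // Hypothetical async cache client whose read callback fires on its NIO thread.
    interface AsyncCache {
        void get(String key, Consumer<String> onRead);
    }

    // From the NIO read callback, hand the heavy work to pooled threads,
    // and give the servlet thread a Future it can block on.
    public Future<String> fetchAndProcess(AsyncCache cache, String key) {
        CompletableFuture<String> result = new CompletableFuture<>();
        cache.get(key, cached ->        // runs on the NIO thread
                pool.submit(() ->       // heavy work moves to the pool
                        result.complete(process(cached)))); // wakes future.get()
        return result;
    }

    private String process(String cached) {
        return cached == null ? "MISS" : cached.toUpperCase();
    }
}
```

The servlet thread then calls future.get(timeout, unit) on the returned Future, exactly as in the diagram.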