Elastic Load Balancing automatically distributes incoming application traffic across multiple Amazon EC2 instances for fault tolerance and load distribution. In this session, we go into detail about Elastic Load Balancing's configuration and day-to-day management, as well as its use in conjunction with Auto Scaling. We explain how to make decisions about the service's many customization choices. We also share best practices and useful tips for success.
5. Load Balancer used to route incoming requests to multiple EC2 instances.
[Diagram: ELB in front of three EC2 instances]
6. EC2-Classic
Load balance over classic EC2 instances.
Support for public IP addresses only.
No control over the load balancer security group.
EC2-VPC
Load balance over EC2 instances within a VPC.
Support for both public and private IP addresses.
Full control over the load balancer security group.
Tightly integrated into the associated VPC and subnets.
8. TCP/SSL
Incoming client connection bound to server connection.
No header modification.
Proxy Protocol prepends source and destination IP and ports to request.
Round robin algorithm used for request routing.
HTTP/HTTPS
Connection terminated at the load balancer and pooled to the server.
Headers may be modified.
X-Forwarded-For header contains client IP address.
Least outstanding requests algorithm used for request routing.
Sticky session support available.
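As a rough illustration of how a back-end instance might recover the client address in each mode, the sketch below parses a Proxy Protocol version 1 header line (TCP/SSL listeners) and an X-Forwarded-For header value (HTTP/HTTPS listeners). The function and type names are hypothetical, and the parser only handles the six-field TCP4/TCP6 form of the header, not the `PROXY UNKNOWN` form.

```python
from typing import NamedTuple


class ProxyInfo(NamedTuple):
    protocol: str
    source_ip: str
    dest_ip: str
    source_port: int
    dest_port: int


def parse_proxy_v1(line: bytes) -> ProxyInfo:
    """Parse a Proxy Protocol v1 header such as
    b"PROXY TCP4 198.51.100.22 203.0.113.7 35646 80\\r\\n"."""
    text = line.decode("ascii").rstrip("\r\n")
    parts = text.split(" ")
    if len(parts) != 6 or parts[0] != "PROXY":
        raise ValueError(f"not a six-field Proxy Protocol v1 header: {text!r}")
    _, proto, src_ip, dst_ip, src_port, dst_port = parts
    return ProxyInfo(proto, src_ip, dst_ip, int(src_port), int(dst_port))


def client_ip_from_xff(header_value: str) -> str:
    """Return the left-most address of an X-Forwarded-For value.

    Assumption: the left-most entry is the original client. A client can
    send its own X-Forwarded-For header, so treat this value as untrusted.
    """
    return header_value.split(",")[0].strip()
```

For example, `parse_proxy_v1(b"PROXY TCP4 198.51.100.22 203.0.113.7 35646 80\r\n").source_ip` yields the client address that the load balancer saw.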
10. [Diagram: ELB in front of three EC2 instances]
Health checks ensure that request traffic is shifted away from a failed instance.
Health Checks
11. Support for TCP and HTTP health checks.
Customize the frequency and failure thresholds.
Must return a 2xx response.
Consider the depth and accuracy of your health checks.
Health Checks
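The interplay between check results and failure thresholds can be sketched as a small state machine. This is a simplified model, not the service's implementation: the class name, the default threshold values, and the convention of passing `0` for a timed-out check are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class HealthTracker:
    healthy_threshold: int = 3     # consecutive successes needed to become healthy
    unhealthy_threshold: int = 2   # consecutive failures needed to become unhealthy
    healthy: bool = True
    _streak: int = 0               # consecutive results contradicting the current state

    def record(self, status_code: int) -> bool:
        """Record one health check result and return the resulting state.

        A 2xx status counts as a success; anything else (including 0,
        used here to represent a timeout) counts as a failure.
        """
        ok = 200 <= status_code < 300
        if ok == self.healthy:
            # Result agrees with the current state, so any streak is broken.
            self._streak = 0
            return self.healthy
        self._streak += 1
        limit = self.healthy_threshold if ok else self.unhealthy_threshold
        if self._streak >= limit:
            self.healthy = ok
            self._streak = 0
        return self.healthy
```

With these defaults, two consecutive failures take an instance out of service, and three consecutive 2xx responses bring it back.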
12. Idle timeouts allow for connections to be closed by the load balancer when no longer in use.
13. Length of time that an idle connection should be kept open.
For both client and back-end connections.
Defaults to 60 seconds but can be set between 1 and 3,600 seconds.
Timeouts should decrease as you go up the stack.
Idle Timeouts
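A minimal sketch of these rules as checks, assuming that "up the stack" means moving from the back-end instance toward the client, so the back end holds idle connections longest and the client shortest. That reading, and the function names, are assumptions rather than something the slides spell out.

```python
DEFAULT_IDLE_TIMEOUT = 60                      # seconds, per the slide
MIN_IDLE_TIMEOUT, MAX_IDLE_TIMEOUT = 1, 3600   # allowed range, per the slide


def validate_idle_timeout(seconds: int = DEFAULT_IDLE_TIMEOUT) -> int:
    """Return the timeout unchanged if it is within the allowed range."""
    if not MIN_IDLE_TIMEOUT <= seconds <= MAX_IDLE_TIMEOUT:
        raise ValueError(
            f"idle timeout must be between {MIN_IDLE_TIMEOUT} and "
            f"{MAX_IDLE_TIMEOUT} seconds, got {seconds}"
        )
    return seconds


def timeouts_decrease_up_stack(backend: int, load_balancer: int, client: int) -> bool:
    """Check the assumed ordering: back end > load balancer > client."""
    return backend > load_balancer > client
```

For instance, `timeouts_decrease_up_stack(backend=75, load_balancer=60, client=30)` returns `True` under this interpretation.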
24. Load balancer absorbs impact of DNS caching.
Eliminates imbalances in back-end instance utilization.
Requests distributed evenly across multiple Availability Zones.
Check connection limits before enabling.
No additional bandwidth charge for cross-zone traffic.
Cross-Zone Load Balancing
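The imbalance that cross-zone load balancing removes can be shown with simple arithmetic. The sketch below assumes DNS round robin splits traffic evenly between zones and that each zone (or, with cross-zone enabled, the whole fleet) spreads its share evenly over its instances. The function name and zone names are illustrative.

```python
from typing import Dict


def per_instance_share(instances_per_zone: Dict[str, int], cross_zone: bool) -> Dict[str, float]:
    """Return the fraction of total traffic that each instance in a zone receives."""
    zones = [zone for zone, count in instances_per_zone.items() if count > 0]
    if not zones:
        return {}
    total_instances = sum(instances_per_zone[zone] for zone in zones)
    if cross_zone:
        # Every instance is reachable from every zone, so all get an equal share.
        return {zone: 1.0 / total_instances for zone in zones}
    # Each zone gets an equal slice of traffic, divided among its own instances.
    zone_share = 1.0 / len(zones)
    return {zone: zone_share / instances_per_zone[zone] for zone in zones}
```

With 2 instances in one zone and 8 in another, each instance in the small zone handles 25% of traffic without cross-zone balancing versus 6.25% for each instance in the large zone; with cross-zone enabled, every instance handles 10%.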
25. Each load balancer domain may contain multiple records.
Round robin used to balance traffic between Availability Zones.
DNS records will change over time; never target IP addresses directly.
After being removed from DNS, IP addresses are drained and quarantined for up to 7 days.
Understanding DNS
26. DNS caching by clients and ISPs can often cause clients to target a specific IP address or stop resolving at all.
Register a wildcard CNAME or ALIAS within Amazon Route 53.
// Create a wildcard CNAME or ALIAS in Route 53.
*.example.com ALIAS … elb-12345.us-east-1.elb.amazonaws.com
*.example.com CNAME elb-12345.us-east-1.elb.amazonaws.com
// Prepend random content for each lookup made by the application.
PROMPT> dig +short 25a8ade5-6557-4a54-a60e-8f51f3b195d1.example.com
192.0.2.1
192.0.2.2
DNS Optimization
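On the application side, prepending random content to each lookup could look like the sketch below. It assumes a wildcard record for `example.com` already exists; the function names are hypothetical, and `resolve` needs network access to return anything useful.

```python
import socket
import uuid
from typing import List


def unique_lookup_name(wildcard_domain: str = "example.com") -> str:
    """Build a hostname with a random UUID label so cached answers are not reused."""
    return f"{uuid.uuid4()}.{wildcard_domain}"


def resolve(hostname: str, port: int = 80) -> List[str]:
    """Resolve a hostname and return the distinct IP addresses, sorted."""
    results = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({sockaddr[0] for *_, sockaddr in results})
```

Calling `resolve(unique_lookup_name())` before each new connection forces a fresh lookup, because the random label has never been cached by an intermediate resolver.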
27. SSL Offloading
Support for both SSL and HTTPS is provided.
Support for latest ciphers and protocols, including elliptic curve ciphers and Perfect Forward Secrecy.
Ability to fully customize ciphers and protocols to be used by each load balancer.
SSL Negotiation Suites provided to remove complexity of selecting ciphers and protocols.
28. SSL Negotiation Policies
Provide selection of ciphers and protocols that adhere to the latest industry best practices.
Balance security best practices against clients' ability to negotiate a connection; generated using traffic to Amazon.com.
Released on a regular cadence or when new vulnerabilities are published.
Default for all new load balancers.
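To make "a selection of ciphers and protocols" concrete, here is a local analogue using Python's standard `ssl` module. It is not how a load balancer policy is configured; it only shows the same two knobs (minimum protocol version and cipher list) on a server-side TLS context. The function name and the default cipher string are assumptions chosen for illustration.

```python
import ssl


def build_server_context(ciphers: str = "ECDHE+AESGCM") -> ssl.SSLContext:
    """Create a server-side TLS context with old protocols and weak ciphers excluded."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Refuse SSLv3, TLS 1.0, and TLS 1.1 handshakes.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Restrict TLS 1.2 suites to ECDHE key exchange (forward secrecy) with AES-GCM.
    # TLS 1.3 suites are managed separately by OpenSSL and are not affected here.
    ctx.set_ciphers(ciphers)
    return ctx
```

A predefined negotiation policy plays the same role as this function: one named bundle of protocol and cipher choices, so that operators do not have to assemble the list themselves.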
29. POODLE Mitigation
Within 24 hours, 62% of load balancers migrated to the latest SSL Negotiation Policy, disabling SSLv3.
30. “@awscloud Thank-you #AWS for making it so easy to prevent #sslv3 #poodleattack Only took about 3 clicks of my mouse.” @granticini
31. 13 CloudWatch metrics provided for each load balancer.
Provide detailed insight into the health of the load balancer and application stack.
CloudWatch alarms can be configured to notify or take action should any metric go outside of the acceptable range.
All metrics provided at 1-minute granularity.
Amazon CloudWatch Metrics
32. HealthyHostCount
The count of the number of healthy instances in each Availability Zone.
The most common cause of unhealthy hosts is health checks exceeding the allocated timeout.
Test by making repeated requests to the back-end instance from another EC2 instance.
View at the zonal dimension.
33. Latency
Measures the time elapsed in seconds after the request leaves the load balancer until the response is received.
Test by sending requests to the back-end instance from another instance.
The min, average, and max CloudWatch statistics provide upper and lower bounds for latency.
Debug individual requests using Access Logs.
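When testing a back-end instance directly from another instance, the same three statistics can be collected by hand. The helper below is a hypothetical sketch: `send_request` is whatever callable issues one request to the back end, and the helper only times it.

```python
import time
from statistics import mean
from typing import Callable, Dict


def latency_bounds(send_request: Callable[[], object], attempts: int = 10) -> Dict[str, float]:
    """Time repeated calls and return min, average, and max latency in seconds."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    samples = []
    for _ in range(attempts):
        start = time.perf_counter()
        send_request()
        samples.append(time.perf_counter() - start)
    return {"min": min(samples), "average": mean(samples), "max": max(samples)}
```

Comparing these numbers with the load balancer's Latency metric helps separate slowness in the application from slowness elsewhere in the path.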
34. SurgeQueue and Spillovers
Count of the number of requests that could not be sent to back-end instances.
Queue up to 1024 requests per load balancer node, after which 503 errors will be returned.
Often caused by not being able to open connections to the back-end instance.
Normally a sign of an under-scaled application.
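The relationship between the surge queue and spillovers can be modeled as a bounded queue with a rejection counter. This is a toy model of one load balancer node under the 1024-request limit stated above; the class and method names are made up for illustration.

```python
from collections import deque

SURGE_QUEUE_LIMIT = 1024  # per load balancer node, per the slide


class LoadBalancerNode:
    def __init__(self, limit: int = SURGE_QUEUE_LIMIT) -> None:
        self.limit = limit
        self.queue = deque()
        self.spillover_count = 0

    @property
    def surge_queue_length(self) -> int:
        return len(self.queue)

    def hold(self, request: object) -> bool:
        """Hold a request that cannot yet be sent to a back-end instance.

        Returns True if it was queued, False if the queue was full and the
        request spilled over (the client would receive a 503).
        """
        if len(self.queue) >= self.limit:
            self.spillover_count += 1
            return False
        self.queue.append(request)
        return True

    def dispatch(self) -> object:
        """Send the oldest queued request to a back end once one is available."""
        return self.queue.popleft()
```

Holding 1030 requests without any dispatch leaves 1024 in the queue and records 6 spillovers, which is the pattern an under-scaled back end produces.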
35. CloudWatch and Auto Scaling
All load balancer metrics can be used for Auto Scaling.
Allow you to scale dynamically based on the load balancer's view of the application.
It is important to consider all metrics when using Auto Scaling; a policy based on one metric may not be aware of resource contention on another.
You may be at peak multiple times a day.
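One way to avoid scaling on a single metric is to evaluate every metric against its own limit and act if any of them is breached. The sketch below assumes metric values have already been fetched; the threshold values are invented for the example, while `Latency`, `SurgeQueueLength`, and `RequestCount` are load balancer metric names.

```python
from typing import Dict, List


def breaching_metrics(current: Dict[str, float], limits: Dict[str, float]) -> List[str]:
    """Return the names of all metrics whose current value exceeds its limit."""
    return sorted(
        name for name, limit in limits.items()
        if current.get(name, 0.0) > limit
    )


def should_scale_out(current: Dict[str, float], limits: Dict[str, float]) -> bool:
    """Scale out if any single metric is outside its acceptable range."""
    return bool(breaching_metrics(current, limits))
```

Here a fleet with healthy latency but a growing surge queue would still trigger a scale-out, which a latency-only policy would miss.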
36. Provide detailed information on each request processed by the load balancer.
Includes request time, client IP address, latencies, request path, and server responses.
Delivered to an Amazon S3 bucket every 5 or 60 minutes.
Access Logs
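A log entry can be split into named fields with a few lines of code. The sketch below assumes the space-delimited layout with a quoted request line (timestamp, load balancer name, client, back end, three processing times, two status codes, byte counts, request); any trailing fields are ignored. The field names and the sample line are illustrative, and `shlex` may mis-handle request lines containing stray quotes or backslashes.

```python
import shlex
from typing import Dict

FIELDS = [
    "timestamp", "elb", "client", "backend",
    "request_processing_time", "backend_processing_time",
    "response_processing_time", "elb_status_code", "backend_status_code",
    "received_bytes", "sent_bytes", "request",
]


def parse_access_log_line(line: str) -> Dict[str, str]:
    """Split one access log line into a dict keyed by field name."""
    tokens = shlex.split(line)  # keeps the quoted request line as one token
    if len(tokens) < len(FIELDS):
        raise ValueError(f"expected at least {len(FIELDS)} fields, got {len(tokens)}")
    return dict(zip(FIELDS, tokens))


sample = (
    '2015-05-13T23:39:43.945958Z my-loadbalancer 192.168.131.39:2817 '
    '10.0.0.1:80 0.000073 0.001048 0.000057 200 200 0 29 '
    '"GET http://www.example.com:80/ HTTP/1.1"'
)
entry = parse_access_log_line(sample)
```

From `entry`, the back-end latency for this request is `float(entry["backend_processing_time"])`, which is useful when debugging an individual slow request.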
37. Access Logs
[Diagram: ELB nodes in a VPC and Amazon S3]
Logs indexed by date but include the IP address of the load balancer node itself.
42. Mitigation
All load balancers scaled to handle the loss of a single Availability Zone.
Amazon Route 53 health checks shift traffic away from the failed Availability Zone.
Completed within 150 seconds.
No other external or control plane dependencies.
43. Isolation
Other zones must remain unaffected.
Avoid dependencies between zones.
Be careful of work generated as a result of the event.
Operating at reduced capacity but stable.
44. Constant Work
Health checkers and edge locations perform the same volume of activity whether endpoints are healthy or unhealthy.
[Chart: system activity over time, with time to react marked]
Work on Failure
When nothing is failing, volume of API calls is zero. When failure occurs, volume of API calls spikes.
[Chart: system activity over time, with time to react marked]
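The difference between the two patterns can be seen by counting the work done per interval for the same sequence of health states. This is a toy model with hypothetical function names: "constant work" reports the state of every endpoint each interval, while "work on failure" makes one call per endpoint whose state changed.

```python
from typing import List


def constant_work_calls(history: List[List[bool]]) -> List[int]:
    """Every interval, report on every endpoint regardless of its state."""
    return [len(states) for states in history]


def work_on_failure_calls(history: List[List[bool]]) -> List[int]:
    """Every interval, make one call per endpoint whose state changed."""
    calls = []
    previous = history[0] if history else []
    for states in history:
        calls.append(sum(1 for before, now in zip(previous, states) if before != now))
        previous = states
    return calls


# Four endpoints; three of them fail in the third interval.
history = [
    [True, True, True, True],
    [True, True, True, True],
    [False, False, False, True],
    [False, False, False, True],
]
```

For this history the constant-work counts are `[4, 4, 4, 4]` and the work-on-failure counts are `[0, 0, 3, 0]`: the second pattern is idle until the moment of failure and then spikes, which is exactly when the system can least afford extra load.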
45. Restore Redundancy
Restoring the system back to full capacity.
Avoid putting additional load on the system by rushing this step.
Ensure that recovered resources are left in a consistent state.
Fully recovered when done.