AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

5,635 views

In this session, we provide a peek behind the scenes to learn about Amazon ElastiCache's design and architecture. See common design patterns with our Redis and Memcached offerings and how customers have used them for in-memory operations to reduce latency and improve application throughput. During this session, we review ElastiCache best practices, design patterns, and anti-patterns.

Published in: Technology

  1. 1. Amazon ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306) • Michael Labib, Specialist Solutions Architect, AWS • Brian Kaiser, CTO, Hudl • November 29, 2016 • © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  2. 2. What to Expect from the Session • Why we’re here – In-Memory Data Stores • Amazon ElastiCache Overview • Usage Patterns • Scale with Redis Cluster • Best Practices • Hudl Presentation
  3. 3. In-Memory Data Stores
  4. 4. Why we’re here: µs are the new ms (Amazon ElastiCache)
  5. 5. Amazon ElastiCache • In-memory key-value store • High-performance • Redis and Memcached • Fully managed; zero admin • Highly available and reliable • Hardened by Amazon
  6. 6. [Chart placing AWS data stores along four axes: request rate (high to low), latency (low to high), structure (low to high), data volume (low to high). Services shown: Amazon RDS, Amazon S3, Amazon Glacier, Amazon CloudSearch and Amazon Elasticsearch Service, Amazon DynamoDB, Amazon ElastiCache, HDFS]
  7. 7. Memcached – Fast Caching • In-memory key-value datastore • Slab allocator • Supports strings, objects • Multi-threaded • Insanely fast! • Very established • No persistence • Open source • Easy to scale
  8. 8. Redis – The In-Memory Leader • Powerful: ~200 commands + Lua scripting • In-memory data structure server • Utility data structures: strings, lists, hashes, sets, sorted sets, bitmaps, and HyperLogLogs • Simple • Atomic operations; supports transactions • Ridiculously fast: <1 ms latency for most commands • Highly available: replication • Persistence • Open source
  9. 9. Redis Data Types - String • Binary safe • A value can be at most 512 MB • Great for storing counters, HTML, images, JSON objects, etc. • Diagram: Key → value
  10. 10. Redis Data Types - Set • A collection of unique, unordered String values • Great for deduplicating and grouping related information • Diagram: a key holding the values 75, 1, 39, and 63; a second 63 is rejected as a duplicate
  11. 11. Redis Data Types - Sorted Set • A collection of unique String values ordered by score • Great for deduplicating, grouping, and sorting related information • Diagram: a key holding mike (score 50), dan (score 75), emma (score 79), lina (score 123), luke (score 350)
  12. 12. Redis Data Types - List • A collection of Strings stored in the order of their insertion • Push and pop from the head or tail of the list • Great for message queues and timelines • Diagram: HEAD, value 1, value 2, value 3, TAIL
  13. 13. Redis Data Types - Hashes • A collection of unordered fields and values • Great for representing objects • Ability to add, GET, and DEL individual fields by key • Diagram: a key holding Field 1: value 1, Field 2: value 2, Field 3: value 3, Field 4: value 4
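As a rough mental model of the four collection types on the slides above, the sketch below mimics their semantics with Python built-ins. It is an analogy only, not Redis client code: the variable names are made up, and the sample members and scores are the ones shown on the Set and Sorted Set slides.

```python
from collections import deque

# Set: unique, unordered members. Adding the duplicate 63 has no effect.
members = set()
for value in [75, 1, 39, 63, 63]:
    members.add(value)

# Sorted Set: unique members, each with a score; reads come back ordered by score.
scores = {"luke": 350, "mike": 50, "emma": 79, "dan": 75, "lina": 123}
ranking = sorted(scores, key=scores.get)

# List: insertion-ordered; push and pop at either end.
timeline = deque()
timeline.append("value 1")       # comparable to RPUSH (push at the tail)
timeline.append("value 2")
timeline.appendleft("value 0")   # comparable to LPUSH (push at the head)
oldest = timeline.popleft()      # comparable to LPOP

# Hash: one key holding named fields that can be set or removed individually.
user = {"Field 1": "value 1", "Field 2": "value 2"}
user["Field 3"] = "value 3"      # comparable to HSET
del user["Field 2"]              # comparable to HDEL
```

The difference that matters in practice is that Redis performs these operations atomically on the server, so many clients can share the same structure.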
  14. 14. Memcached vs. Redis: selection criteria (columns: Redis, Memcached) • Simple cache to offload database pressure and lower latency • Atomic counter support • Data sharding (supported in Redis 3.x) • Need support for advanced data types such as Lists, Sets, Hashes • Multi-threaded architecture (takes full advantage of all CPU cores) • Need ability to auto-sort data to support ranking or leaderboards • Need Pub/Sub capabilities • High availability and failover • Persistence • Data volume max size: Redis 3.5 TiB, Memcached 4.7 TiB+ • Max key/value size: Redis 512 MB / 512 MB, Memcached 256 bytes / 1 MB
  16. 16. Amazon ElastiCache
  17. 17. ElastiCache - Customer Value (Amazon ElastiCache Redis) • Multi-AZ with automatic failover • Open-source compatible • Fully managed • Enhanced Redis engine • Easy to deploy, use, and monitor • No cross-AZ data transfer costs • Extreme performance at cloud scale
  18. 18. Enhanced Redis Engine – Hardened by Amazon • Optimized swap memory: mitigates the risk of increased swap usage during syncs and snapshots • Dynamic write throttling: improved output buffer management when the node’s memory is close to being exhausted • Smoother failovers: clusters recover faster because replicas avoid flushing their data to do a full re-sync with the primary
  19. 19. Usage Patterns
  20. 20. Caching • Diagram components: Clients, Elastic Load Balancing, Amazon EC2, AWS Lambda, Amazon ElastiCache (cache reads/writes), Amazon DynamoDB and Amazon RDS (DB reads/writes) • Better performance: microsecond speed • Cost effective • Higher throughput: ~20M RPS
  21. 21. Caching (Amazon ElastiCache)

      # db and cache stand for the application's database and cache clients
      # (they are not defined on the slide).

      # Write Through
      def save_user(user_id, values):
          record = db.query("update users ... where id = ?", user_id, values)
          cache.set(user_id, record, 300)  # TTL in seconds
          return record

      # Lazy Load
      def get_user(user_id):
          record = cache.get(user_id)
          if record is None:
              record = db.query("select * from users where id = ?", user_id)
              cache.set(user_id, record, 300)  # TTL in seconds
          return record

      # App code
      save_user(17, {"name": "Big Mike"})
      user = get_user(17)

  22. 22. Caching • Write Through: 1. Update DB 2. SET in cache • Lazy Load: 1. GET from cache 2. If MISS, get from DB 3. Then SET in cache
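The slide code depends on `db` and `cache` objects that are not shown. The following is a minimal, self-contained sketch of the same two patterns using in-memory stand-ins, so the hit, miss, and TTL behavior can be observed without a cache cluster. `TTLCache`, `FakeClock`, and `UserStore` are hypothetical names invented for this illustration, not ElastiCache or client-library APIs.

```python
import time


class FakeClock:
    """Manually advanced clock so TTL expiry can be shown without sleeping."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


class TTLCache:
    """In-memory stand-in for a cache client with set(key, value, ttl) and get(key)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired entries count as a miss
            return None
        return value


class UserStore:
    def __init__(self, cache, ttl_seconds=300):
        self.cache = cache
        self.ttl = ttl_seconds
        self.db = {}        # stand-in for the users table
        self.db_reads = 0   # how often a read fell through to the "database"

    def save_user(self, user_id, values):
        """Write through: update the DB, then SET the fresh record in the cache."""
        record = {**self.db.get(user_id, {}), **values}
        self.db[user_id] = record
        self.cache.set(user_id, record, self.ttl)
        return record

    def get_user(self, user_id):
        """Lazy load: GET from the cache; on a miss read the DB, then SET."""
        record = self.cache.get(user_id)
        if record is None:
            self.db_reads += 1
            record = self.db.get(user_id)
            if record is not None:
                self.cache.set(user_id, record, self.ttl)
        return record


clock = FakeClock()
store = UserStore(TTLCache(clock))
store.save_user(17, {"name": "Big Mike"})
first = store.get_user(17)    # served from the cache, no DB read
clock.now += 301              # let the 300-second TTL lapse
second = store.get_user(17)   # miss, so the record is reloaded and re-cached
```

Write-through keeps hot records fresh at the cost of caching data that may never be read; lazy loading only caches what is requested but pays a miss on first access. The TTL bounds staleness in both cases.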
  23. 23. Session Caching: PHP Example • For situations where you need an external session store • Especially needed when using Auto Scaling groups (ASGs) • Cache is optimal for high-volume reads
      1) Install PHP, Apache, and the PHP memcache client, e.g. yum install php apache php-pecl-memcache
      2) Configure “php.ini”:
           session.save_handler = memcache
           session.save_path = "tcp://node1:11211, tcp://node2:11211"
      3) Configure “php.d/memcache.ini”:
           memcache.hash_strategy = consistent
           memcache.allow_failover = 1
           memcache.session_redundancy = 3*
      4) Restart httpd
      5) Begin using session data
      https://github.com/mikelabib/elasticache-memcached-php-demo
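The `memcache.hash_strategy = consistent` setting matters because, with consistent hashing, adding or removing a cache node only remaps the keys that lived on that node. The sketch below is a generic hash ring with virtual nodes that demonstrates this property. It is not the PHP extension's actual implementation, and `node3:11211` is a hypothetical third node added for the demonstration.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (point, node)
        for node in nodes:
            self.add(node)

    @staticmethod
    def _point(label):
        digest = hashlib.md5(label.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, node):
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._point(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def node_for(self, key):
        """Walk clockwise from the key's point to the next node point."""
        if not self._ring:
            raise ValueError("ring has no nodes")
        index = bisect.bisect_right(self._ring, (self._point(key), ""))
        if index == len(self._ring):
            index = 0  # wrap around the ring
        return self._ring[index][1]


ring = HashRing(["node1:11211", "node2:11211", "node3:11211"])
keys = [f"session:{n}" for n in range(1000)]
before = {key: ring.node_for(key) for key in keys}
ring.remove("node2:11211")  # simulate losing one cache node
after = {key: ring.node_for(key) for key in keys}
moved = [key for key in keys if before[key] != after[key]]
```

Every key in `moved` was previously on the removed node; sessions stored on the surviving nodes stay where they were, which is what keeps a node failure from invalidating the whole session store.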
  24. 24. IoT Device Data • Diagram components: AWS IoT Device, AWS IoT, Amazon EC2, AWS Lambda, Amazon ElastiCache (hot data), Amazon DynamoDB (longer retention), Amazon Kinesis Firehose, Amazon S3 (data lake), Amazon Glacier (cold data)
  25. 25. Lambda Trigger for IoT Rule (AWS IoT, AWS Lambda, Amazon ElastiCache)

      var redis = require("redis");

      exports.handler = function (event, context) {
        var client = redis.createClient("redis://your-redis-endpoint:6379");
        // Score for the sorted set. The slide uses `date` without defining it;
        // the current time in milliseconds is assumed here.
        var date = Date.now();

        var multi = client.multi();                      // transaction block start
        multi.zadd("SensorData", date, event.deviceId);  // sorted set
        multi.hmset(event.deviceId,                      // hash
          "temperature", event.temperature,
          "deviceIP", event.deviceIP,
          "humidity", event.humidity,
          "awsRequestId", context.awsRequestId);
        multi.exec(function (err, replies) {             // transaction block end
          client.quit(); // close the connection on both the error and success paths
          if (err) {
            console.log('error updating event: ' + err);
            context.fail('error updating event: ' + err);
          } else {
            console.log('updated event ' + replies);
            context.succeed(replies);
          }
        });
      };

  26. 26. Lambda Trigger for IoT Rule • Transaction block start • SET: Sorted Set, Hash • Transaction block end • https://github.com/mikelabib/IoT-Sensor-Data-and-Amazon-ElastiCache
  27. 27. Streaming Data • Diagram components: Data Sources, Amazon Kinesis Streams, Amazon EC2, AWS Lambda, Amazon ElastiCache (hot data), Amazon DynamoDB (longer retention)
  28. 28. Streaming Data Enrichment • Diagram components: Data Sources, Amazon Kinesis Streams, AWS Lambda with Amazon ElastiCache (de-duplicate, aggregate, sort, enrich, etc.), cleansed stream, Amazon Kinesis Streams, Amazon Kinesis Analytics
  29. 29. Streaming Data Analytics • Diagram components: Data Sources, Amazon Kinesis Streams, Amazon EMR (Spark Streaming) with the Spark Redis Connector, Amazon ElastiCache, Amazon EC2, Amazon Redshift, Amazon S3 (data lake)
  30. 30. ElastiCache Redis with Multi-AZ
  31. 31. ElastiCache for Redis Multi-AZ • Diagram: a primary and two replicas spread across Availability Zone A and Availability Zone B • Writes: use the primary endpoint • Reads: use the read replicas • Automatic failover to a read replica in case of primary node failure • Auto-failover chooses the replica with the lowest replication lag; the DNS endpoint stays the same • ElastiCache automates snapshots for persistence
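A small sketch of the two rules on this slide: writes go to the primary endpoint while reads are spread over the replicas, and failover promotes the replica with the lowest replication lag. The endpoint names, the `Replica` type, and both helpers are hypothetical and exist only for illustration. In ElastiCache the failover choice is made by the service, not by application code.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Replica:
    endpoint: str
    replication_lag_seconds: float


class ReplicationGroupRouter:
    """Sends writes to the primary endpoint and rotates reads across replicas."""

    def __init__(self, primary_endpoint, replica_endpoints):
        self.primary_endpoint = primary_endpoint
        # Fall back to the primary for reads if there are no replicas.
        self._readers = itertools.cycle(replica_endpoints or [primary_endpoint])

    def endpoint_for(self, is_write):
        return self.primary_endpoint if is_write else next(self._readers)


def pick_failover_target(replicas):
    """Mirror the slide's rule: promote the replica with the lowest replication lag."""
    return min(replicas, key=lambda replica: replica.replication_lag_seconds)


router = ReplicationGroupRouter(
    "primary.example.internal:6379",
    ["replica-a.example.internal:6379", "replica-b.example.internal:6379"],
)
write_target = router.endpoint_for(is_write=True)
reads = [router.endpoint_for(is_write=False) for _ in range(3)]
promoted = pick_failover_target([
    Replica("replica-a.example.internal:6379", 0.8),
    Replica("replica-b.example.internal:6379", 0.2),
])
```

Because the primary's DNS endpoint stays the same after failover, the write path in a client like this needs no reconfiguration; only the list of read endpoints changes.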
  32. 32. ElastiCache with Redis Multi-AZ • Diagram: a Region with Availability Zones A and B, an Auto Scaling group, and an ElastiCache cluster with a primary and a read replica
  35. 35. Get ReplicationGroup Replica endpoints (Amazon ElastiCache)

      public List<String> getReplicationGroupEndpoints(String replicationGroupId) {
        List<String> replicaEndpoints = new ArrayList<String>();
        if (replicationGroupId != null) {
          try {
            DescribeReplicationGroupsRequest request = new DescribeReplicationGroupsRequest();
            request.setReplicationGroupId(replicationGroupId);
            DescribeReplicationGroupsResult result = elastiCacheClient.describeReplicationGroups(request);
            Object[] nodeMembers;
            if (result != null) {
              for (ReplicationGroup replicationGroup : result.getReplicationGroups()) {
                for (NodeGroup node : replicationGroup.getNodeGroups()) {
                  nodeMembers = node.getNodeGroupMembers().toArray();
                  for (int i = 0; i < nodeMembers.length; i++) {
                    String nodeDescriptions = nodeMembers[i].toString();
                    if (nodeDescriptions.contains("replica")) {
                      …

  36. 36. Get ReplicationGroup Replica endpoints • API call: DescribeReplicationGroups • https://github.com/mikelabib/ElastiCacheRedisLoadBalancer
  37. 37. What’s New!
  38. 38. New in October 2016: Redis 3.2 Support (Amazon ElastiCache) • Horizontal scale of up to 3.5 TiB per cluster • Up to 20 million reads per second • Up to 4.5 million writes per second • Enhanced Redis engine within ElastiCache • Up to 4x faster failover than with Redis 2.8 • Cluster-level backup and restore • Fully supported by AWS CloudFormation • Available in all AWS Regions
  39. 39. Geospatial Commands

      GEOADD locations 87.6298 41.8781 chicago
      GEOADD locations 122.3321 47.6062 seattle
      ZRANGE locations 0 -1
      1) "chicago"
      2) "seattle"
      GEODIST locations chicago seattle mi
      "1733.4089"
      GEORADIUS locations 122.4194 37.7749 1000 mi WITHDIST
      1) 1) "seattle"
         2) "679.4848"
      GEOPOS locations chicago
      1) 1) "87.62979894876480103"
         2) "41.87809901914020116"
      GEORADIUSBYMEMBER locations chicago 2000 mi WITHDIST
      1) 1) "chicago"
         2) "0.0000"
      2) 1) "seattle"
         2) "1733.4089"
      GEOHASH locations chicago
      ZREM locations seattle
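To see where the GEODIST and GEORADIUS distances come from, the sketch below recomputes them with the haversine great-circle formula. The Earth radius and the meters-per-mile factor are the values Redis is understood to use internally; treat both constants as assumptions. Small differences from the slide's figures are expected because Redis stores positions as 52-bit geohashes.

```python
import math

EARTH_RADIUS_M = 6372797.560856  # assumed to match the radius in Redis's geo code
METERS_PER_MILE = 1609.34        # assumed to match Redis's "mi" unit conversion


def geodist_miles(lon1, lat1, lon2, lat2):
    """Great-circle distance in miles between two (longitude, latitude) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a)) / METERS_PER_MILE


# Coordinates exactly as entered on the slide (longitude first, then latitude).
chicago = (87.6298, 41.8781)
seattle = (122.3321, 47.6062)
search_center = (122.4194, 37.7749)

chicago_to_seattle = geodist_miles(*chicago, *seattle)
center_to_seattle = geodist_miles(*search_center, *seattle)
```

Both results land within a mile of the slide's "1733.4089" and "679.4848". Note that GEOADD takes longitude before latitude, which is the opposite of the order many mapping tools display.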
  40. 40. Scaling with Redis Cluster
  41. 41. Setting up Redis Cluster - Console Cluster Mode
  42. 42. Redis Cluster – Automatic Client-Side Sharding • 16384 hash slots per cluster • Slot for a key is CRC16(key) modulo 16384 • Slots are distributed across the cluster into shards • Developers must use a Redis Cluster client! • Clients are redirected to the correct shard • Smart clients store a map of slots to shards • Example with five shards and one client: S1 = slots 0–3276, S2 = slots 3277–6553, S3 = slots 6554–9829, S4 = slots 9830–13106, S5 = slots 13107–16383
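The slot calculation can be sketched in a few lines. Redis Cluster uses the XMODEM variant of CRC16 and, when a key contains a non-empty `{...}` hash tag, hashes only the tag so that related keys land in the same slot. The shard ranges below are copied from this slide; `shard_for` is a hypothetical helper that mimics the slot map a smart client keeps.

```python
def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16 (polynomial 0x1021, initial value 0), the variant Redis Cluster uses."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Hash slot for a key: CRC16 of the key (or of its hash tag) modulo 16384."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        if end > start + 1:  # only a non-empty tag replaces the full key
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


# Slot ranges from the five-shard example on the slide.
SHARD_RANGES = {
    "S1": (0, 3276),
    "S2": (3277, 6553),
    "S3": (6554, 9829),
    "S4": (9830, 13106),
    "S5": (13107, 16383),
}


def shard_for(key: str) -> str:
    slot = key_slot(key)
    for shard, (low, high) in SHARD_RANGES.items():
        if low <= slot <= high:
            return shard
    raise ValueError(f"slot {slot} is not covered by any shard")
```

For example, `key_slot("foo")` is 12182, which falls in S4's range, and `{user1000}.following` and `{user1000}.followers` share a slot because only `user1000` is hashed. That shared slot is what makes multi-key operations on related keys possible in cluster mode.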
  43. 43. Redis Cluster – Architecture (Multi-AZ) • Diagram: a Redis Cluster with three shards (slots 0–5454, 5455–10909, 10910–16383), each with a node in Availability Zones A, B, and C • A cluster consists of 1 to 15 shards
  44. 44. Redis Cluster – Architecture • Diagram: the same three-shard layout with the slots 0–5454 shard highlighted (one primary, two replicas) • Each shard has a primary node and up to 5 replica nodes
  45. 45. Redis Cluster – Architecture • Diagram: the same three-shard layout with the slots 5455–10909 shard highlighted (one primary, two replicas) • Each shard has a primary node and up to 5 replica nodes
  46. 46. Redis Cluster – Architecture • Diagram: the same three-shard layout with the slots 10910–16383 shard highlighted (one primary, two replicas) • Each shard has a primary node and up to 5 replica nodes
  47. 47. Setting up Redis Cluster - Console Cluster Name
  48. 48. Setting up Redis Cluster - Console Redis Version
  49. 49. Setting up Redis Cluster - Console Instance
  50. 50. Setting up Redis Cluster - Console # of Shards
  51. 51. Setting up Redis Cluster - Console # of Replicas
  52. 52. Setting up Redis Cluster - Console Slots Distribution
  53. 53. Setting up Redis Cluster - Console Select AZs
  54. 54. Redis Failure Scenarios
  55. 55. Scenario 1: Single Primary Shard Failure • Diagram: a three-shard Redis Cluster (slots 0–5454, 5455–10909, 10910–16383) across Availability Zones A, B, and C
  56. 56. Scenario 1: Single Primary Shard Failure • Mitigation: 1. Promote read replica node 2. Repair failed node
  57. 57. Scenario 2: Two Primary Shards Fail • Diagram: a three-shard Redis Cluster (slots 0–5454, 5455–10909, 10910–16383) across Availability Zones A, B, and C
  58. 58. Scenario 2: Two Primary Shards Fail • Mitigation (Redis enhancements on ElastiCache): promote read replica nodes; repair failed nodes
  59. 59. Migrating to a Cluster • 1. Create new cluster 2. Make snapshot of old cache cluster 3. Restore snapshot to new cluster 4. Update client 5. Terminate old cluster • Diagram: a client moving from the old (< 3.2) cluster to a new cluster with shards S1–S5
  60. 60. Enhanced CloudFormation • Support for clusters • DeletionPolicy: set to Snapshot to take one last backup before deleting • Replication group tagging • Replication group: add more replicas • User-defined resource identifiers: use the cluster name, replication group ID, and subnet group name to identify the appropriate resources by assigning a physical resource identifier
  61. 61. CloudFormation: Infrastructure as Code (AWS CloudFormation template for Amazon ElastiCache)

      {
        "AWSTemplateFormatVersion" : "2010-09-09",
        "Description" : "Test template for ReplicationGroup",
        "Resources" : {
          "BasicReplicationGroup" : {
            "Type" : "AWS::ElastiCache::ReplicationGroup",
            "Properties" : {
              "AutomaticFailoverEnabled" : true,
              "AutoMinorVersionUpgrade" : true,
              "CacheNodeType" : "cache.r3.large",
              "CacheSubnetGroupName" : { "Ref" : "CacheSubnetGroup" },
              "Engine" : "redis",
              "EngineVersion" : "3.2",
              "NumNodeGroups" : "2",
              "ReplicasPerNodeGroup" : "2",
              "Port" : 6379,
              "PreferredMaintenanceWindow" : "sun:05:00-sun:09:00",
              "ReplicationGroupDescription" : "CFN RG test",
              "SecurityGroupIds" : [ { "Ref" : "RGSG" } ],
              "SnapshotRetentionLimit" : 5,
              "SnapshotWindow" : "10:00-12:00",
  62. 62. Best Practices
  63. 63. Redis • Avoid very short key names: lengthening a name does add bytes, but predictable key names simplify app development • Create a logical schema such as [Object]:[value], using a colon rather than “.” or “-” as the separator • Hashes, Lists, and Sets are encoded to be much more efficient; use them! • Avoid small String values given the overhead of the data type; use Hashes instead • Avoid the KEYS command and other long-running commands • Max key size and max value size = 512 MB • Max elements in a List, Set, or Hash = 2^32 - 1
  64. 64. Architecting for Availability • Upgrade to the latest engine version (3.2.4) • Set reserved-memory to 30% of total available memory • Swap usage should be zero or very low; scale if not • Put read replicas in a different AZ from the primary • For important workloads, use 2 read replicas per primary • Write to the primary, read from the read replicas • Take snapshots from read replicas • For Redis Cluster, use an odd number of shards
  65. 65. Monitoring Your Cluster
  66. 66. Key ElastiCache CloudWatch Metrics • CPUUtilization: for Memcached, up to 90% is OK; for Redis, divide by the number of cores (e.g. 90% / 4 = 22.5%) • SwapUsage: low • CacheMisses / CacheHits ratio: low and stable • Evictions: near zero (exception: Russian doll caching) • CurrConnections: stable • Set up alarms with CloudWatch metrics • Whitepaper: http://bit.ly/elasticache-whitepaper
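The "divide by cores" rule exists because Redis executes commands on a single thread, so a host-level CPUUtilization figure understates how busy that one core is. A minimal sketch of the arithmetic follows. The function names are made up, and computing the miss rate as misses over total lookups is an assumed reading of the slide's "CacheMisses / CacheHits ratio".

```python
def redis_cpu_alarm_threshold(target_percent, core_count):
    """Host-level CPUUtilization at which a single Redis core reaches target_percent."""
    return target_percent / core_count


def miss_rate(cache_hits, cache_misses):
    """Fraction of lookups that missed the cache (0.0 when there is no traffic)."""
    total = cache_hits + cache_misses
    return cache_misses / total if total else 0.0


threshold = redis_cpu_alarm_threshold(90, 4)   # the slide's example: 90% / 4
rate = miss_rate(cache_hits=950, cache_misses=50)
```

With a four-core node, an alarm set at 22.5% host CPU corresponds to the Redis thread being about 90% busy, which matches the example on the slide.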
  67. 67. ElastiCache Modifiable Parameters • maxclients: 65000 (unchangeable); use connection pooling • timeout: closes a connection after it has been idle for a given interval • tcp-keepalive: detects dead peers at a given interval • databases: 16 (default); logical partitions • reserved-memory: 0 (default); recommended: 50% of maxmemory before 2.8.22, 30% after 2.8.22 (ElastiCache) • maxmemory-policy: the eviction policy for keys when maximum memory usage is reached; possible values: volatile-lru, allkeys-lru, volatile-random, allkeys-random, volatile-ttl, noeviction
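The reserved-memory recommendation is simple arithmetic, sketched here with integer math. The helper name is hypothetical, and treating version 2.8.22 itself as belonging to the 30% group is an assumption, since the slide only says "before" and "after".

```python
def recommended_reserved_memory(maxmemory_bytes, engine_version):
    """Bytes to reserve per the slide: 50% before Redis 2.8.22, 30% from 2.8.22 on (assumed)."""
    version = tuple(int(part) for part in engine_version.split("."))
    percent = 50 if version < (2, 8, 22) else 30
    return maxmemory_bytes * percent // 100


older = recommended_reserved_memory(10_000_000_000, "2.8.6")
newer = recommended_reserved_memory(10_000_000_000, "3.2.4")
```

The reserved portion is headroom for background work such as snapshots and replica syncs, which is why leaving it at the default of 0 risks swap usage under load.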
  68. 68. Session Recap • Amazon ElastiCache provides the performance needed for demanding real-time applications • With a few lines of code, you can power your applications with an In-Memory datastore • Redis Cluster allows you to scale to terabytes of data and support millions of IOPS
  69. 69. ElastiCache @ Hudl • Brian Kaiser, CTO • 11/29/2016
  70. 70. 130k teams
  71. 71. 4.5M active users
  72. 72. > 2B videos on S3
  73. 73. 35 hr/min of video
  74. 74. 15k API requests/sec
  75. 75. Diagram components: ELB, routing layer, Squad Cluster (Web - Auto Scaling group), MongoDB in AZ #1, AZ #2, and AZ #3, supporting services
  76. 76. Couchbase/Memcached
  77. 77. public async Task<TResult> Get<TResult>(string key) where TResult : class
      {
          // Feature flag: behave like a cache miss when Redis is disabled.
          if (!_redisEnabled.Value) { return default(TResult); }
          var value = await _connection.Database.StringGetAsync(key);
          // A missing key is reported as a miss (null).
          if (!value.HasValue || value.IsNull) { return default(TResult); }
          return _serializer.Deserialize<TResult>(value);
      }

  78. 78. public async Task Put(string key, object item, TimeSpan ttl)
      {
          // Skip the write when Redis is disabled or the key is blank.
          if (!_redisEnabled.Value || string.IsNullOrWhiteSpace(key)) { return; }
          var data = _serializer.Serialize(item);
          // Store the serialized item with an expiry so stale entries age out.
          await _connection.Database.StringSetAsync(key, data, ttl);
      }

  79. 79. public async Task<TResult> GetAndPut<TResult>(string key, TimeSpan ttl, Func<TResult> valueAccessor) where TResult : class
      {
          // With Redis disabled, always go to the source of truth.
          if (!_redisEnabled.Value) { return valueAccessor(); }
          var cachedValue = await Get<TResult>(key);
          if (cachedValue != null) { return cachedValue; }
          // Cache miss: load the value, store it, then return it.
          cachedValue = valueAccessor();
          await Put(key, cachedValue, ttl);
          return cachedValue;
      }
  80. 80. Basic Object Caching Examples • Auth Token • User information • Team Information
  81. 81. The Feed
  82. 82. http://amzn.to/2fGS9nx
  83. 83. Distributed Locking • Diagram components: Workers, ElastiCache, S3, MongoDB
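The slide does not show how Hudl implements locking, so the following is only a sketch of the common single-instance Redis pattern: acquire with `SET key token NX PX <ttl>`, and release with an atomic compare-and-delete so that a worker can only remove its own lock. `FakeRedis`, `acquire`, and `release` are hypothetical in-memory stand-ins, not a client library API.

```python
import time
import uuid


class FakeRedis:
    """Tiny in-memory stand-in for the two operations the lock needs."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at)

    def set_if_absent(self, key, value, ttl_seconds):
        """Models SET key value NX PX ttl: succeeds only if the key is absent or expired."""
        current = self._data.get(key)
        if current is not None and current[1] > self._clock():
            return False
        self._data[key] = (value, self._clock() + ttl_seconds)
        return True

    def delete_if_equal(self, key, expected):
        """Models an atomic compare-and-delete (typically a short Lua script in Redis)."""
        current = self._data.get(key)
        if current is not None and current[1] > self._clock() and current[0] == expected:
            del self._data[key]
            return True
        return False


def acquire(store, resource, ttl_seconds=30):
    """Return a unique token if the lock was taken, otherwise None."""
    token = uuid.uuid4().hex
    return token if store.set_if_absent(f"lock:{resource}", token, ttl_seconds) else None


def release(store, resource, token):
    """Release the lock only if this token still owns it."""
    return store.delete_if_equal(f"lock:{resource}", token)


store = FakeRedis()
worker_a = acquire(store, "video:42")                     # gets the lock
worker_b = acquire(store, "video:42")                     # None: already held
wrong_release = release(store, "video:42", "not-the-token")  # False: not the owner
released = release(store, "video:42", worker_a)           # True
worker_b_retry = acquire(store, "video:42")               # succeeds after the release
```

The TTL ensures a crashed worker cannot hold the lock forever, and the token check prevents a slow worker from deleting a lock that has since expired and been taken by someone else.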
  84. 84. ElastiCache
  85. 85. ElastiCache • Diagram components: routing layer, Squad Cluster with an Auto Scaling group and MongoDB in each of AZ #1, AZ #2, and AZ #3, ElastiCache primary and two replicas
  86. 86. ElastiCache – Redis Cluster
  88. 88. Some best practices • Always use Multi-AZ replicas • Set up predictive alerts • Understand eviction policies • Learn Redis data structures and their Big O complexity
  89. 89. Thank you!
  90. 90. Remember to complete your evaluations!
