How to build a social network on serverless

Many people are building different workloads using serverless technologies these days, but what would a non-trivial system such as a social network look like on serverless?

In this talk Yan will discuss his journey of migrating a social network startup to serverless, and how his team was able to improve performance, scalability and feature delivery using serverless technologies.

Yan will discuss how serverless technologies such as Lambda are used to implement each part of their system, including search, push notifications, timeline, user recommendations, and business intelligence. If you're wondering how serverless can be used to solve a wide variety of challenges in your business, this is the talk for you.

Published in: Technology

How to build a social network on serverless

  1. How to build a social network on #serverless. Yan Cui @theburningmonk
  2. Yan Cui http://theburningmonk.com @theburningmonk Principal Engineer @ Independent Consultant Instructor @ Instructor @ Advisor @
  3. “Netflix for sports”; offices in London, Leeds, Katowice and Amsterdam
  4. available in Austria, Switzerland, Germany, Japan, Italy, Spain, Canada and USA; available on 30+ platforms
  5. ~1,000,000 concurrent viewers
  6. “Netflix for sports”; offices in London, Leeds, Katowice and Amsterdam. We’re hiring! Visit engineering.dazn.com to learn more. Follow @dazneng for updates about the engineering team.
  7. apr, 2016
  8. nov, 2016
  9. WHY?
  10. hey guys, vote on this post and I’ll announce a winner at 10PM tonight
  11. 10PM traffic
  12. 10PM traffic: 70-100x
  13. low utilisation to leave room for spikes; EC2 scaling is slow, so scale earlier
  14. lots of $$$ for unused resources
  15. up to 30 mins for deployment; deployment required downtime
  16. features took months to develop
  17. “lead time to someone saying thank you is the only reputation metric that matters.” - Dan North
  18. WHY? to deliver better UX
  19. WHY? to deliver better UX; to deliver value faster
  20. WHY? to deliver better UX; to deliver value faster; to be more cost efficient
  21. WHY? to deliver better UX; to deliver value faster; to be more cost efficient. HOW?
  22. what would good look like for us?
  23. deployments should be… small, fast, zero downtime, no lock-step
  24. features should be… deployable independently, loosely-coupled
  25. we want to… minimise cost for unused resources
  26. we want to… minimise cost for unused resources; minimise ops effort
  27. we want to… minimise cost for unused resources; minimise ops effort; reduce tech mess
  28. we want to… minimise cost for unused resources; minimise ops effort; reduce tech mess; deliver visible improvements faster
  29. WHY? to deliver better UX; to deliver value faster; to be more cost efficient. HOW? microservices
  30. WHY? to deliver better UX; to deliver value faster; to be more cost efficient. HOW? microservices; event-driven
  31. WHY? to deliver better UX; to deliver value faster; to be more cost efficient. HOW? microservices; event-driven; serverless
  32. WHY? to deliver better UX; to deliver value faster; to be more cost efficient. HOW? microservices; event-driven; serverless. WHAT? this talk!
  33. WHY? to deliver better UX; to deliver value faster; to be more cost efficient. HOW? microservices; event-driven; serverless. WHAT? this talk!
  34. 170 Lambda functions in prod
  35. 95% cost saving vs. EC2
  36. 15x no. of prod releases per month
  37. 15x no. of prod releases per month (features were sometimes implemented on the same day)
  38. time is a good fit
  39. 1st function in prod! time is a good fit
  40. ? time is a good fit; 1st function in prod!
  41. CI/CD?
  42. CI/CD? testing?
  43. CI/CD? testing? logging, monitoring, alerting?
  44. time is a good fit; 1st function in prod!; CI/CD, testing, logging, monitoring, alerting
  45. 170 functions ? time is a good fit; 1st function in prod!; CI/CD, testing, logging, monitoring, alerting
  46. tracing?
  47. tracing? config management?
  48. tracing? config management? security?
  49. 170 functions; time is a good fit; 1st function in prod!; CI/CD, testing, logging, monitoring, alerting; tracing, config management, security
  50. API Gateway and Kinesis; Authentication & authorisation (IAM, Cognito); Testing; Running & Debugging functions locally; Log aggregation; Monitoring & Alerting; X-Ray; Correlation IDs; CI/CD; Performance and Cost optimisation; Error Handling; Configuration management; VPC; Security; Leading practices (API Gateway, Kinesis, Lambda); Canary deployments. http://bit.ly/production-ready-serverless get 40% off with: ytcui
  51. evolving the PLATFORM
  52. Legacy Monolith, Amazon Kinesis. Step 1. ALL state changes!
  53. events are an enabler for COMPOSABILITY
  54. AWS LAMBDA is the...
  55. Kinesis
  56. Kinesis, API Gateway, AWS Lambda, API Gateway, AWS Lambda, service-A, service-B
  57. Kinesis, API Gateway, AWS Lambda, API Gateway, AWS Lambda, service-A, service-B
  58. Kinesis, API Gateway, AWS Lambda, API Gateway, AWS Lambda, service-A, service-B, AWS Lambda, AWS Lambda, AWS Lambda
  59. Kinesis, API Gateway, AWS Lambda, API Gateway, AWS Lambda, service-A, service-B, AWS Lambda, AWS Lambda, AWS Lambda, DynamoDB, IoT
  60. Kinesis, API Gateway, AWS Lambda, API Gateway, AWS Lambda, service-A, service-B, AWS Lambda, AWS Lambda, AWS Lambda, DynamoDB, IoT
  61. Kinesis, API Gateway, AWS Lambda, API Gateway, AWS Lambda, service-A, service-B, AWS Lambda, AWS Lambda, AWS Lambda, DynamoDB, IoT, AWS Lambda, AWS Lambda
  62. build loosely-coupled system through events
  63. service A, service B, service C, service D; bounded context, bounded context
  64. service A, service B, service C, service D; bounded context, bounded context
  65. service A, service B, service C, service D
  66. there are no silver bullets
  67. service A, service B, service C, service D
  68. service A, service B, service C, service D
  69. service A, service B, service C, service D; update!
  70. service A, service B, service C, service D; update! backward-compatible?
  71. bounded context: DON’T use events to orchestrate workflows within the same bounded context
  72. bounded context: adds unnecessary complexity to logging, tracing, and end-to-end reporting
  73. bounded context: the workflow doesn’t exist as a standalone concept, but as the sum of a series of loosely connected parts
  74. Step Functions: use Step Functions instead
  75. Step Functions: don’t forget to emit events from the workflow
  76. Step Functions: so others can react to state changes that happened as part of the workflow
  77. “how do I organize my functions into code repositories?”
  78. monorepo?
  79. github repo
  80. monorepo !== monostack
  81. one repo per service?
  82. github repos: user-api, timeline-api, relationship-api, search-api
  83. CI/CD pipeline per service
  84. functions are deployed together, as a stack
  85. Strangler Pattern: incrementally migrate the legacy system by gradually replacing pieces of functionality with the new system
  86. rebuilt search
  87. Legacy Monolith, Amazon Kinesis, AWS Lambda, Amazon CloudSearch
  88. Legacy Monolith, Amazon Kinesis, AWS Lambda, Amazon CloudSearch, Amazon API Gateway, AWS Lambda
  89. proxy requests from monolith to new service
  90. new analytics pipeline
  91. expensive ($3000/month); don’t understand our domain; JS based query language
  92. Legacy Monolith, Amazon Kinesis, AWS Lambda, Google BigQuery
  93. Legacy Monolith, Amazon Kinesis, AWS Lambda, Google BigQuery; 1 developer, 2 days from design to production (his 1st serverless project)
  94. Legacy Monolith, Amazon Kinesis, AWS Lambda, Google BigQuery; “nothing ever got done this fast at Skype!” - Chris Twamley
  95. “lead time to someone saying thank you is the only reputation metric that matters.” - Dan North
  96. from $3000/month to $0.03/month
  97. Kinesis sink
  98. Kinesis, Kinesis Firehose: batch Kinesis events
  99. Kinesis, Kinesis Firehose, S3: data lake
  100. Kinesis, Kinesis Firehose, S3, Glue: analyze data schema, catalog data into tables
  101. Kinesis, Kinesis Firehose, S3, Glue, Athena: query engine
  102. Kinesis, Kinesis Firehose, S3, Glue, Athena, QuickSight: visualization, dashboards
  103. Kinesis, Kinesis Firehose, S3, Glue, Athena, QuickSight; no code is required!
  104. Kinesis, Kinesis Firehose, S3, Glue, Athena, QuickSight; no code is required! pay-per-use!
  105. user action, business intelligence
  106. user action, business intelligence
  107. Problem: didn’t work…
  108. Problem: didn’t work… over-engineered…
  109. try to figure out what’s going on here…
  110. Problem: didn’t work… over-engineered… didn’t scale…
  111. Rebuilt with Lambda
  112. built-in retry and DLQ
  113. built-in retry and DLQ; avoid repeating expensive work of fetching millions of relationships
  114. github repo timeline-api service: timeline-api provider: name: aws runtime: nodejs6.10 stage: dev region: us-east-1 functions: distribute-yubl: … undistribute-yubl: …
  115. Problem: didn’t work…
  116. “it returns the first 30 users in the database, by creation time…”
  117. Rebuilt with Lambda
  118. BigQuery
  119. BigQuery
  120. grapheneDB, BigQuery
  121. grapheneDB, BigQuery
  122. grapheneDB, BigQuery
  123. grapheneDB, BigQuery; mostly built in one sleepless night…
  124. Building a scalable notification system
  125. expensive ($3000/month); don’t understand our domain; JS based query language
  126. all the analytics data is already in BigQuery; powerful query engine
  127. all the analytics data is already in BigQuery; powerful query engine
  128. Design Goals: ad-hoc notifications
  129. Design Goals: ad-hoc notifications; scheduled notifications
  130. Design Goals: ad-hoc notifications; scheduled notifications; A/B testing
  131. Design Goals: ad-hoc notifications; scheduled notifications; A/B testing; scalable
  132. Design Goals: ad-hoc notifications; scheduled notifications; A/B testing; scalable; cost-effective
  133. scheduled notifications
  134. how to send notifications; what to send
  135. other processes can leverage this capability of sending notifications
  136. why not SNS?
  137. ad-hoc notifications
  138. Oversight vs. Frictionless
  139. Oversight vs. Frictionless: don’t make life difficult for the marketing team
  140. Oversight vs. Frictionless: don’t make life difficult for the marketing team; don’t let marketing team spam users
  141. Oversight vs. Frictionless: don’t make life difficult for the marketing team; don’t let marketing team spam users; driving usage/engagement; maintaining user experience
  142. Marketing: work with BI on query; request approval from CPO/CTO; approver checks impact and tests message format; send notifications
  143. @theburningmonk theburningmonk.com github.com/theburningmonk
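
The slides on Kinesis and Lambda describe services publishing state changes as events, with Lambda functions subscribing to the stream and reacting independently. A minimal sketch of such a subscriber follows. It is written in Python for brevity (the slides show a nodejs6.10 runtime), and the event type names, payload fields and handler functions are hypothetical; only the `Records[].kinesis.data` base64 envelope reflects how Lambda delivers Kinesis records.

```python
import base64
import json


def decode_records(event):
    """Decode the JSON payload of every Kinesis record in a Lambda event."""
    decoded = []
    for record in event.get("Records", []):
        # Lambda delivers Kinesis payloads base64-encoded under kinesis.data.
        raw = base64.b64decode(record["kinesis"]["data"])
        decoded.append(json.loads(raw))
    return decoded


# Hypothetical reactions to events; real ones would call a search index,
# a timeline store, and so on.
def index_user(payload):
    return ("search-index", payload["userId"])


def distribute_post(payload):
    return ("timeline", payload["postId"])


# Hypothetical event type names, mapped to the function that reacts to them.
HANDLERS = {
    "user_created": index_user,
    "post_created": distribute_post,
}


def handler(event, context=None):
    """Lambda entry point: dispatch each decoded event by its type."""
    results = []
    for message in decode_records(event):
        handle = HANDLERS.get(message.get("type"))
        if handle is None:
            # Skip unknown types so producers can add new events
            # without breaking this consumer.
            continue
        results.append(handle(message.get("payload", {})))
    return results
```

Skipping unknown event types is one way to keep a consumer tolerant of producer updates, which is the concern behind the "backward-compatible?" slide.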

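The Strangler Pattern slides describe the monolith proxying requests to a new service once a piece of functionality (search, in the talk) has been rebuilt. The routing decision can be sketched as below, under stated assumptions: the path prefixes and backend names are hypothetical, and a real proxy would forward the HTTP request rather than return a label.

```python
# Hypothetical list of path prefixes whose functionality has been migrated.
MIGRATED_PREFIXES = ("/search",)


def choose_backend(path, migrated=MIGRATED_PREFIXES):
    """Return which system should serve a request path."""
    # Ignore any query string when matching the route.
    route = path.split("?", 1)[0]
    for prefix in migrated:
        # Match the prefix itself or anything nested beneath it,
        # but not unrelated paths that merely share leading characters.
        if route == prefix or route.startswith(prefix + "/"):
            return "new-service"
    return "legacy-monolith"
```

Migrating another feature then amounts to adding its prefix to the list, for example `choose_backend("/timeline/42", migrated=("/search", "/timeline"))`, while everything else keeps going to the monolith.
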