Modern Fast Streaming Data

Video and slides synchronized, mp3 and slide download available at URL http://bit.ly/1j8YoCp.

Todd Montgomery challenges some of the common myths and misconceptions about high performance streaming data, and takes a look at what is really possible today. Filmed at qconnewyork.com.

Todd Montgomery is a networking hacker who has researched, designed, and built numerous protocols, messaging-oriented middleware systems, and real-time data systems, done research for NASA, contributed to the IETF and IEEE, and cofounded two startups. He currently works for Kaazing as Chief Architect. @toddlmontgomery

Published in: Technology

  1. 1. Modern Fast Streaming Data Todd L. Montgomery @toddlmontgomery
  2. 2. InfoQ.com: News & Community Site • 750,000 unique visitors/month • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • News 15-20 / week • Articles 3-4 / week • Presentations (videos) 12-15 / week • Interviews 2-3 / week • Books 1 / month • Watch the video with slide synchronization on InfoQ.com! http://www.infoq.com/presentations/myths-data-streaming
  3. 3. Presented at QCon New York www.qconnewyork.com Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide
  4. 4. Why Should We Care? • Myths & Misconceptions • You can’t escape the Math • Technologies & Techniques
  5. 5. Why Should We Care?
  6. 6. Human Knowledge is now doubling every year* (* by discipline, 12-18 months)
  7. 7. Fueled by Technology
  8. 8. Middle Ages: 1,500 yrs • Renaissance: 250 yrs • Industrial Revolution: 150 yrs • WWII: 25 yrs (Buckminster Fuller, Critical Path, 1981)
  9. 9. IoT
  10. 10. IoT Ubiquitous Computing
  11. 11. In the near future, Human Knowledge could double every 72 hours
  12. 12. What this could mean for our systems…
  13. 13. Updates/Sec = Devices * Frequency * Market Share. Applies to either ingest or streaming; 2x for Request/Response.
  14. 14. Updates/Sec = Devices * Frequency * Market Share. Devices: 9 Billion (Today) • 50 Billion by 2020 (Cisco) • 26 Billion by 2020 (Smartphone/Tablet - Gartner) • 75 Billion by 2020 (Morgan Stanley)
  15. 15. Updates/Sec = 50 Billion * 6/min * 1% = 50 Million/sec
  16. 16. Bandwidth = 50 Billion * 6/min * 1% * 200 bytes = 50 Million/sec * 200 bytes = 10 GB/s (80 Gb/s), i.e. about 9.3 GiB/s (74.5 Gibit/s) in binary units
  17. 17. And… Geographic Distribution [map: regional shares of 30%, 15%, 10%, 15%, 20%, 10%]
  18. 18. Social & Societal demands will require processing an intense stream of data in real-time
  19. 19. Myths & Misconceptions
  20. 20. Excuses, Excuses!
  21. 21. Myth: (CPUs, Storage, Networks) are not capable of processing in real-time* (* for some unknown, unquantified data volume)
  22. 22. [Chart: accumulated improvement over time for Network Bandwidth, Response Time, Storage Capacity, CPU Cores, Memory Capacity]
  23. 23. MIPS by year and processor (http://en.wikipedia.org/wiki/Instructions_per_second): 1974 Intel 8080: 0.29 • 1982 Intel 286: 1.28 • 1993 PowerPC 601: 157 • 2003 Pentium 4 Extreme: 9,726 • 2008 Intel Core i7 920 (Quad): 82,300 • 2011 Intel Core i7 2600K (4/8) Sandy Bridge: 128,300 • 2014 Intel Core i7 5960x (8/16) Haswell: 298,190
  24. 24. Raspberry Pi 2 (Quad) 1,186 MIPS! http://en.wikipedia.org/wiki/Instructions_per_second
  25. 25. DDR • SSD • PCIe 3 • 100 GbE • … • OmniPath
  26. 26. 1 thread of awesome > 128 cores of so-so http://www.frankmcsherry.org/graph/scalability/cost/2015/01/15/COST.html http://blog.acolyer.org/2015/06/05/scalability-but-at-what-cost/
  27. 27. Misconception: Data needs to come to rest to be processed
  28. 28. Data at rest is a liability
  29. 29. Data at rest is a liability: MDM • ILM • Warehouse • ETL
  30. 30. Why would you want transient data at rest?
  31. 31. But what about sort? …REALLY?!
  32. 32. You can’t escape the Math
  33. 33. … and you don’t want to The Math will guide you to a solution
  34. 34. "AmdahlsLaw" by Daniels220 at English Wikipedia - Own work based on: File:AmdahlsLaw.png. Licensed under CC BY-SA 3.0 via Wikimedia Commons
  35. 35. [Diagram: Setup & Scheduling • Work Unit • Work Unit • Work Unit • Work Unit • Post Processing]
  36. 36. [Diagram: Setup & Scheduling • Contention • Work Unit • Work Unit • Work Unit • Work Unit • Contention • Post Processing]
  37. 37. Contention isn’t the biggest enemy
  38. 38. Coherence is!
  39. 39. Universal Scalability Law [chart: Speedup (0 to 20) vs. Processors (1 to 1024), with Amdahl and USL curves]
  40. 40. [Diagram: Setup & Scheduling • Contention + Coherence • Work Unit • Work Unit • Work Unit • Work Unit • Contention + Coherence • Post Processing]
  41. 41. Up Front Partitioning [diagram: Work Unit • Work Unit • Work Unit • Work Unit]
  42. 42. … and more Queuing Theory ⭐ Complexity Theory CAP Theorem
  43. 43. Technologies & Techniques
  44. 44. The Essence of Architecture: Data Structures • Protocols of Interaction • Mechanical Sympathy
  45. 45. Understanding is essential
  46. 46. [Chart: accumulated improvement over time for Network Bandwidth, Response Time, Storage Capacity, CPU Cores, Memory Capacity]
  47. 47. [Chart: accumulated improvement over time for Network Bandwidth, Response Time, Storage Capacity, CPU Cores, Memory Capacity] Batching…
  48. 48. Technique Smart Batching (Natural Batching) http://mechanical-sympathy.blogspot.com/2011/10/smart-batching.html
  49. 49. Resource
  50. 50. Resource Ring Buffer
  51. 51. Batching Thread → Resource: pull off as much waiting data as possible
  52. 52. Single Writer Principle • Avoid Resource Contention • Batching only when needed • Rate Decoupling • Back Pressure
  53. 53. Techniques Freedom! Lock-Free, Wait-Free http://en.wikipedia.org/wiki/Non-blocking_algorithm
  54. 54. Words Matter
  55. 55. Obstruction-Freedom: partially completed operations are aborted and the changes they made are rolled back
  56. 56. Lock-Freedom: an individual thread may starve, but system-wide throughput is guaranteed. Lock-Free is Obstruction-Free.
  57. 57. Wait-Freedom: starvation-free, with guaranteed system-wide throughput. Wait-Free is Lock-Free.
  58. 58. These properties are awesome! Who wouldn’t want them?
  59. 59. System-wide properties start at the lowest level
  60. 60. Essence Just because we could take an action right now, doesn’t mean we should
  61. 61. Technology CRDTs http://en.wikipedia.org/wiki/Conflict-free_replicated_data_type
  62. 62. Node: 0, 1, 2, …, N • Value: 0, 0, 0, …, 0 • sum(0,N) = 0
  63. 63. Node: 0, 1, 2, …, N • Value: 0, 1, 0, …, 0 • sum(0,N) = 1
  64. 64. Node: 0, 1, 2, …, N • Value: 1, 1, 0, …, 0 • sum(0,N) = 2
  65. 65. Gossip for visibility
  66. 66. [0] = 4 • [1] = 2 • [2] = 0 • [N] = 0 → Shared View: 4, 2, 0, …, 0
  67. 67. Technology Append-only Data Structures https://github.com/real-logic/Aeron
  68. 68. Header Message Log
  69. 69. Header Message Header Message Header Message Log
  70. 70. Efficiently Replicating an Append-only Log
  71. 71. What If…? The Data Structure could be directly sent to the “network”?
  72. 72. Header Message
  73. 73. Header (Position in Log, Length) + Message
  74. 74. Header (Position in Log, Length, Version/Flags, Type, etc.) + Message
  75. 75. Header Message Fragment 0
  76. 76. Header Message Header Message Fragment 0
  77. 77. Header Message Header Message Header Message Header Message Fragment 0
  78. 78. Header Message Header Message Header Message Header Message Fragment 0 Fragment 1
  79. 79. Header Message Header Message Header Message Header Message Header Message Header Message Fragment 0 Fragment 1
  80. 80. Natural for broadcast replication
  81. 81. In Closing…
  82. 82. A flood of data is coming. Many say it is already here. How will you deal with it?
  83. 83. Ever seen a Pitbull drink from a Fire Hydrant?
  84. 84. It won’t give up… Be the Pitbull at the Fire Hydrant!
  85. 85. @toddlmontgomery Questions? • Aeron https://github.com/real-logic/Aeron • SlideShare http://www.slideshare.com/toddleemontgomery • Twitter @toddlmontgomery Thank You!
  86. 86. Watch the video with slide synchronization on InfoQ.com! http://www.infoq.com/presentations/myths-data-streaming
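
Slides 34 to 40 contrast Amdahl's Law with the Universal Scalability Law (USL) but give no formulas. The sketch below uses the standard textbook forms of both. The parameter values are assumptions chosen only for illustration, not taken from the talk: `SIGMA = 0.05` (5% contention, which caps Amdahl speedup at 20, matching the chart's axis) and `KAPPA = 0.0001` (a hypothetical coherence cost).

```python
from typing import List, Tuple


def amdahl_speedup(n: int, sigma: float) -> float:
    """Amdahl's Law: sigma is the serial (contended) fraction of the work."""
    return n / (1.0 + sigma * (n - 1))


def usl_speedup(n: int, sigma: float, kappa: float) -> float:
    """USL: adds a coherence penalty kappa that grows with n * (n - 1)."""
    return n / (1.0 + sigma * (n - 1) + kappa * n * (n - 1))


# Processor counts from the x-axis of slide 39.
PROCESSORS = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]
SIGMA = 0.05    # assumed contention fraction (illustrative)
KAPPA = 0.0001  # assumed coherence coefficient (illustrative)


def speedup_table(
    processors: List[int] = PROCESSORS,
    sigma: float = SIGMA,
    kappa: float = KAPPA,
) -> List[Tuple[int, float, float]]:
    """Return (processors, Amdahl speedup, USL speedup) rows."""
    return [
        (n, amdahl_speedup(n, sigma), usl_speedup(n, sigma, kappa))
        for n in processors
    ]


if __name__ == "__main__":
    print("procs  amdahl     usl")
    for n, amdahl, usl in speedup_table():
        print(f"{n:5d}  {amdahl:6.2f}  {usl:6.2f}")
```

Under these assumed values, the Amdahl curve flattens toward 1/sigma = 20, while the USL curve peaks near sqrt((1 - sigma) / kappa), roughly 97 processors, and then declines. That is one way to read the slide 38 claim that coherence, not contention, is the bigger enemy.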

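The smart batching idea on slides 48 to 52 (a single batching thread pulls off as much waiting data as possible, then touches the resource once) can be sketched as follows. This is a minimal illustration, not the implementation from the linked blog post: the standard-library `queue.Queue` stands in for the ring buffer, and `drain_batch`, `batching_writer`, `write_batch`, and the `STOP` sentinel are hypothetical names.

```python
import queue
import threading
from typing import Any, Callable, List

STOP = object()  # hypothetical sentinel that tells the writer to shut down


def drain_batch(q: "queue.Queue[Any]", max_batch: int) -> List[Any]:
    """Block for one item, then take whatever else is already waiting."""
    batch = [q.get()]  # wait only when there is nothing to do
    while len(batch) < max_batch:
        try:
            batch.append(q.get_nowait())  # never wait to fill a batch
        except queue.Empty:
            break
    return batch


def batching_writer(
    q: "queue.Queue[Any]",
    write_batch: Callable[[List[Any]], None],
    max_batch: int = 64,
) -> None:
    """Single writer: the only thread that ever touches the resource."""
    while True:
        items = []
        stop = False
        for item in drain_batch(q, max_batch):
            if item is STOP:
                stop = True
                break
            items.append(item)
        if items:
            write_batch(items)  # one resource operation per batch
        if stop:
            return


if __name__ == "__main__":
    written: List[List[int]] = []
    q: "queue.Queue[Any]" = queue.Queue()
    writer = threading.Thread(target=batching_writer, args=(q, written.append, 8))
    writer.start()
    for i in range(100):
        q.put(i)
    q.put(STOP)
    writer.join()
    print([len(batch) for batch in written])
```

Batch size adapts to load: when the queue is nearly empty each item is written alone with no added delay, and when producers run ahead the batches grow up to `max_batch`. This matches the "batching only when needed" and "rate decoupling" points on slide 52.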
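The counter on slides 62 to 66 looks like a grow-only counter (G-Counter) CRDT: each node increments only its own slot, the value is the sum of all slots, and gossip spreads the slots so that replicas converge on a shared view. The sketch below assumes that reading. The class and method names are hypothetical, and merging by element-wise maximum is the usual G-Counter rule rather than something stated on the slides.

```python
from typing import List


class GCounter:
    """Grow-only counter: one slot per node, value is the sum of all slots."""

    def __init__(self, node_id: int, num_nodes: int) -> None:
        self.node_id = node_id
        self.slots: List[int] = [0] * num_nodes

    def increment(self, amount: int = 1) -> None:
        """A node only ever writes its own slot (single writer per slot)."""
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        self.slots[self.node_id] += amount

    def value(self) -> int:
        """Equivalent of sum(0,N) on the slides."""
        return sum(self.slots)

    def merge(self, other: "GCounter") -> None:
        """Apply gossiped state: keep the larger count seen for each slot."""
        self.slots = [max(a, b) for a, b in zip(self.slots, other.slots)]


if __name__ == "__main__":
    node0 = GCounter(node_id=0, num_nodes=4)
    node1 = GCounter(node_id=1, num_nodes=4)
    node1.increment()   # slide 63: node 1 counts one event
    node0.increment()   # slide 64: node 0 counts one event
    node0.merge(node1)  # slide 65: gossip for visibility
    node1.merge(node0)
    print(node0.slots, node0.value())
```

Because the merge is commutative, associative, and idempotent, gossip messages can arrive late, duplicated, or out of order and the replicas still converge without coordination.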