
Kafka Summit London 2019 - The art of the event-streaming app


Have you ever imagined what it would be like to build a massively scalable streaming application on Kafka, the challenges, the patterns and the thought process involved? How much of the application can be reused? What patterns will you discover? How does it all fit together? Depending upon your use case and business, this can mean many things. Starting out with a data pipeline is one thing, but evolving into a company-wide real-time application that is business critical and entirely dependent upon a streaming platform is a giant leap. Large-scale streaming applications are also called event streaming applications. They are classically different from other data systems; event streaming applications are viewed as a series of interconnected streams that are topologically defined using stream processors; they hold state that models your use case as events. Almost like a deconstructed real-time database.
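The "deconstructed real-time database" idea above can be illustrated without Kafka at all. The following is a minimal sketch, assuming a hypothetical in-memory `EventLogSketch` class (none of these names come from the talk): events are appended as immutable facts, and any state is derived by replaying the log, so the same data can answer different questions at different points in time.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

/** Hypothetical sketch: an append-only event log whose state is derived by replay. */
final class EventLogSketch {

    /** An immutable fact: something that happened. Field names are illustrative. */
    static final class Event {
        final String type;      // e.g. "user-signed-up" or "item-sold"
        final String key;       // the entity the event is about
        final long amountCents; // payload; 0 when not applicable

        Event(String type, String key, long amountCents) {
            this.type = type;
            this.key = key;
            this.amountCents = amountCents;
        }
    }

    private final List<Event> log = new ArrayList<>();

    /** Events are only ever appended, never updated or deleted. */
    void append(String type, String key, long amountCents) {
        log.add(new Event(type, key, amountCents));
    }

    int size() {
        return log.size();
    }

    /** Folds the first upTo events into a state value; the log itself is left untouched. */
    <S> S replay(int upTo, S initial, BiFunction<S, Event, S> fold) {
        S state = initial;
        int end = Math.max(0, Math.min(upTo, log.size()));
        for (Event e : log.subList(0, end)) {
            state = fold.apply(state, e);
        }
        return state;
    }

    /** One question asked of the log: total sales per item as of a given offset. */
    static Map<String, Long> salesByItem(EventLogSketch events, int upTo) {
        return events.replay(upTo, new HashMap<String, Long>(), (totals, e) -> {
            if (e.type.equals("item-sold")) {
                totals.merge(e.key, e.amountCents, Long::sum);
            }
            return totals;
        });
    }

    /** A different question asked of the same log: how many sign-ups so far? */
    static int signUps(EventLogSketch events, int upTo) {
        return events.replay(upTo, 0,
                (count, e) -> e.type.equals("user-signed-up") ? count + 1 : count);
    }

    /** Small illustrative data set. */
    static EventLogSketch sample() {
        EventLogSketch events = new EventLogSketch();
        events.append("user-signed-up", "user-100", 0);
        events.append("item-sold", "item-389", 12_000);
        events.append("item-sold", "item-389", 3_000);
        events.append("item-sold", "item-7", 500);
        return events;
    }

    public static void main(String[] args) {
        EventLogSketch events = sample();
        System.out.println("sales now:      " + salesByItem(events, events.size()));
        System.out.println("sales at off 2: " + salesByItem(events, 2));
        System.out.println("sign-ups:       " + signUps(events, events.size()));
    }
}
```

Replaying up to an earlier offset is the "time travel" the slides refer to; in a real deployment the log would be a Kafka topic and the fold would live in a stream processor.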

In this talk, I step through the origins of event streaming systems and show how they develop from raw events into something that can be adopted at an organizational scale. I start with event-first thinking and Domain-Driven Design to build data models that work with the fundamentals of streams, Kafka Streams, KSQL and serverless (FaaS).

Building upon this, I explain how to build common business functionality by stepping through the patterns for scalable payment processing, "run it on rails" (instrumentation and monitoring), and control flow. Finally, all of these concepts are combined in a solution architecture that can be used at an enterprise scale. I will introduce enterprise patterns such as events-as-a-backbone, events as APIs, and methods for governance and self-service. You will leave the talk with an understanding of how to model events with event-first thinking, how to work towards reusable streaming patterns and, most importantly, how it all fits together at scale.
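To make the payment-processing pattern more concrete, here is a minimal sketch in plain Java. It only assumes what the slides show: a payment moves through debit, credit and complete states, and an account balance is reduced on debit and increased on credit. The class `PaymentFlowSketch`, its fields and the in-memory map are hypothetical stand-ins for the Kafka Streams topology and state store used in the actual KPay demo.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the debit -> credit -> complete payment flow. */
final class PaymentFlowSketch {

    enum State { DEBIT, CREDIT, COMPLETE }

    /** Immutable payment event; names are illustrative, not the demo's model. */
    static final class Payment {
        final String from;
        final String to;
        final BigDecimal amount;
        final State state;

        Payment(String from, String to, BigDecimal amount, State state) {
            this.from = from;
            this.to = to;
            this.amount = amount;
            this.state = state;
        }

        /** Account this event is keyed by: the payer on debit, the payee otherwise. */
        String accountId() {
            return state == State.DEBIT ? from : to;
        }

        /** Produces the follow-up event instead of mutating this one. */
        Payment next() {
            switch (state) {
                case DEBIT:  return new Payment(from, to, amount, State.CREDIT);
                case CREDIT: return new Payment(from, to, amount, State.COMPLETE);
                default:     throw new IllegalStateException("Payment already complete");
            }
        }
    }

    // Stand-in for a per-key state store (assumption: one balance per account id).
    private final Map<String, BigDecimal> balances = new HashMap<>();

    /** Applies one inflight event to the keyed balance and returns the follow-up event. */
    Payment process(Payment payment) {
        if (payment.state == State.COMPLETE) {
            throw new IllegalArgumentException("Invalid payment received: already complete");
        }
        BigDecimal delta = payment.state == State.DEBIT
                ? payment.amount.negate()
                : payment.amount;
        balances.merge(payment.accountId(), delta, BigDecimal::add);
        return payment.next();
    }

    BigDecimal balanceOf(String account) {
        return balances.getOrDefault(account, BigDecimal.ZERO);
    }

    public static void main(String[] args) {
        PaymentFlowSketch accounts = new PaymentFlowSketch();
        Payment p = new Payment("alice", "bob", new BigDecimal("25.00"), State.DEBIT);
        p = accounts.process(p); // debit applied to alice, follow-up is CREDIT
        p = accounts.process(p); // credit applied to bob, follow-up is COMPLETE
        System.out.println(p.state + " alice=" + accounts.balanceOf("alice")
                + " bob=" + accounts.balanceOf("bob"));
    }
}
```

In a streaming deployment each follow-up event would be written back to a topic keyed by the affected account, which is what lets the work be spread across partitions.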


  1. The art of the event streaming application: streams, stream processors and scale. Neil Avery, Office of the CTO, @avery_neil
  2. (no text on this slide)
  3. Out of the tar pit, 2006
  4. “We believe that the major contributor to this complexity in many systems is the handling of state and the burden that this adds when trying to analyse and reason about the system.” Out of the tar pit, 2006
  5. What about microservices?
  6. What are microservices? Microservices are a software development technique - a variant of the service-oriented architecture (SOA) architectural style that structures an application as a collection of loosely coupled services. https://en.wikipedia.org/wiki/Microservices
  7. “...structures an application as a collection of loosely coupled services.” This is new!
  8. So what went wrong?
  9. Making changes is risky
  10. Handling state is hard. Cache? Embedded? Route to right instance?
  11. What have we learned about microservices? ● Scaling is hard ● Handling state is hard ● Sharing, coordinating is hard ● Running a database in each microservice is hard
  12. We had it all wrong
  13. We actually had some of it right
  14. Immutability
  15. What’s the big idea?
  16. Event-driven architectures
  17. ...aren’t new... but...
  18. ...the world has changed
  19. New technology, requirements and expectations
  20. Events: FACT! SOMETHING HAPPENED!
  21. Examples... Ad placement, User signed up, Item was sold, Payment
  22. Events: why do you care? Loose coupling, autonomy, evolvability, scalability, resilience, traceability, replayability. ...more importantly... EVENT-FIRST CHANGES HOW YOU THINK ABOUT WHAT YOU ARE BUILDING
  23. Store events in... a stream
  24. Different types of event models ● Change Data Capture - CDC (database txn log) ● Time series (IoT, metrics) ● Microservices (domain events)
  25. Capture behavior
  26. Time travel: User experience? How many users affected? Has it happened before? Ask many questions of the same data, again and again. (Diagram axis: time)
  27. Old world: event-driven architectures. New world: event-streaming architectures.
  28. Stream processing: a Kafka Streams processor with input events and output events. ...temporal reasoning... Event-driven microservice.
  29. Partitions give you horizontal scale. Example event: { user: 100, type: bid, item: 389, cat: bikes/mtb, region: dc-east }. Topic /bikes/, keyed by item-id. (Diagram labels: key#, key space, topic, partition, partition assignment, consumer)
  30. Stream processors are uniquely convergent. Data + Processing (sorry, DBAs)
  31. All of your data is a stream of events
  32. Stop... where is my database? (you said scaling data was hard)
  33. Streams are your persistence model. They are also your local database.
  34. The atomic unit for tackling complexity: a stream processor with input events and output events (...or microservice or whatever...)
  35. It’s pretty powerful. (Diagram labels: Topic: click-stream; partitions; stream processors; interactive query; CDC events from KTable; CDC stream; CQRS; Elastic)
  36. Stream processor == single atomic unit. It does one thing. Like
  37. We think in terms of function: “Bounded Context” (dataflow - choreography)
  38. Let’s build something... A simple dataflow series of processors: “Payment processing”
  39. KPay looks like this: https://github.com/confluentinc/demo-scene/tree/master/scalable-payment-processing
  40. Bounded context “Payments”: 1. Payments inflight 2. Account processing [debit/credit] 3. Payments confirmed
  41. Payments bounded context: choreography
  42. Payments system: bounded context. [1] How much is being processed? Expressed as: count of payments inflight; total $ value processed. [2&3] Update the account balance. Expressed as: debit; credit. [4] Confirm successful payment. Expressed as: total volume today; total $ amount today.
  43. Payments system: AccountProcessor. KTable state (Kafka Streams)
      // Keep a running balance per account key in a KTable
      accountBalanceKTable = inflight.groupByKey()
          .aggregate(
              AccountBalance::new,
              (key, value, aggregate) -> aggregate.handle(key, value),
              accountStore);
      // Advance each payment to its next state and re-key it by the new id
      KStream<String, Payment>[] branch = inflight
          .map((KeyValueMapper<String, Payment, KeyValue<String, Payment>>) (key, value) -> {
              if (value.getState() == Payment.State.debit) {
                  value.setStateAndId(Payment.State.credit);
              } else if (value.getState() == Payment.State.credit) {
                  value.setStateAndId(Payment.State.complete);
              }
              return new KeyValue<>(value.getId(), value);
          })
          .branch(isCreditRecord, isCompleteRecord);
      // Credits go back onto the inflight topic; completed payments move on
      branch[0].to(paymentsInflightTopic);
      branch[1].to(paymentsCompleteTopic);
      https://github.com/confluentinc/demo-scene/blob/master/scalable-payment-processing/.../AccountProcessor.java
  44. Payments system: AccountBalance
      public AccountBalance handle(String key, Payment value) {
          this.name = value.getId();
          if (value.getState() == Payment.State.debit) {
              this.amount = this.amount.subtract(value.getAmount());
          } else if (value.getState() == Payment.State.credit) {
              this.amount = this.amount.add(value.getAmount());
          } else {
              // report to dead letter queue via exception handler
              throw new RuntimeException("Invalid payment received:" + value);
          }
          this.lastPayment = value;
          return this;
      }
      https://github.com/confluentinc/demo-scene/.../scalable-payment-processing/.../model/AccountBalance.java
  45. Payments system: event model https://github.com/confluentinc/demo-scene/.../scalable-payment-processing/.../io/confluent/kpay/payments
  46. Bounded context “Payments”. Is it enough? No.
  47. “It’s asynchronous, I don’t trust it” (some developer, 2018)
  48. We only have one part of the picture ○ What about failures? ○ Upgrades? ○ How fast is it going? ○ What is happening - is it working?
  49. Event-streaming provides ● Evolution ● Decoupling ● Bounded context modelling ● Composition (because of separation of concerns, SoC)
  50. Composition
  51. Event-streaming pillars: 1. Business function (payment) 2. Instrumentation plane (trust) 3. … 4. ...
  52. Instrumentation Plane (trust). Goal: prove the application is meeting business requirements. Metrics: Payments Inflight (count and dollar value); Payment Complete (count and dollar value).
  53. Instrumentation Plane
      KStream<String, Payment> complete = builder.stream(paymentsCompleteTopic);
      // Aggregate all completed payments under one key in one-minute windows
      statsKTable = complete
          .groupBy((key, value) -> "all-payments")
          .windowedBy(TimeWindows.of(ONE_MINUTE))
          .aggregate(
              ThroughputStats::new,
              (key, value, aggregate) -> aggregate.update(value),
              completeWindowStore);
  54. Instrumentation Plane
      public ThroughputStats update(Payment payment) {
          totalPayments++;
          totalDollarAmount = totalDollarAmount.add(payment.getAmount());
          maxLatency = Math.max(maxLatency, payment.getElapsedMillis());
          minLatency = Math.min(minLatency, payment.getElapsedMillis());
          // Track the largest payment seen in this window
          if (payment.getAmount().doubleValue() > largestPayment.getAmount().doubleValue()) {
              largestPayment = payment;
          }
          timestamp = System.currentTimeMillis();
          return this;
      }
  55. Instrumentation Plane: Using IQ (interactive queries) https://github.com/confluentinc/demo-scene/blob/master/scalable-payment-processing/../ThroughputStats.java
  56. Event-streaming pillars: 1. Business function (payment) 2. Instrumentation plane (trust) 3. Control plane (coordinate) 4. ...
  57. Control Plane. Goal: provide mechanisms to coordinate system behavior. Why? Recover from outage, DR, overload, etc. Applied: flow control, start, pause, bootstrap, scale, gate and rate limit. Model: Status [pause, resume]; Gate processor [Status]; etc.
  58. Event-streaming pillars: 1. Business function (payment) 2. Instrumentation plane (trust) 3. Control plane (coordinate) 4. Operational plane (run)
  59. Operational Plane: dependent on Control and Instrumentation planes. Dataflow patterns ● Application logs ● Error/Warning logs ● Audit logs ● Lineage ● Dead-letter-queues. (Diagram labels: stream processor; /dead-letter/bid/region/processor; /ops/logs/ca; /ops/metric)
  60. Architectural pillars. Core dataflow: /payments/incoming, PAY, /payments/confirmed. Control plane: /control/state (START, STOP), /control/status, stream.filter(). Instrumentation plane: /payments/confirmed, BIZ METRIC, IQ. Operational plane: /payments/dlq, ERROR, WARN, IQ.
  61. Payment system
  62. Composition Patterns
  63. Bounded context (dataflow). Choreography: capture business function as a bounded context; events as API. (Diagram labels: 1. Payment Inflight; 2. Accounts [from]; 3. Accounts [to]; 4. Payment Conf’d; topics payment.incoming, payment.inflight, payment.complete, payment.confirmed)
  64. Multiple bounded contexts. Choreography: chaining; layering. (Diagram labels: payment.incoming; 1. Payment; payment.complete; 2. Logistics)
  65. Multiple bounded contexts: Orchestration ○ Captures workflow ○ Controls bounded context interaction ○ Business Process Model and Notation 2.0 (BPMN) (Zeebe, Apache Airflow) Source: https://docs.zeebe.io/bpmn-workflows/README.html
  66. Composition patterns at scale. (Photo: Flickr, Dave DeGobbi)
  67. Pattern: Events as a backbone. (Diagram labels: events as a backbone; apps and {faas} functions; Payments; Department 2; Department 3; Department 4)
  68. Patterns: Topic naming. What is going on here? bikeshedding (uncountable): 1. Futile investment of time and energy in discussion of marginal technical issues. 2. Procrastination. https://en.wiktionary.org/wiki/bikeshedding Parkinson observed that a committee whose job is to approve plans for a nuclear power plant may spend the majority of its time on relatively unimportant but easy-to-grasp issues, such as what materials to use for the staff bikeshed, while neglecting the design of the power plant itself, which is far more important but also far more difficult to criticize constructively.
  69. Patterns: Topic conventions. Don’t: 1. Use fields that change 2. Use fields if data is available elsewhere 3. Tie topic names to consumers or producers. Do: <message type>.<dataset name>.<data name> or <app-context>.<message type>.<dataset name>.<data name>. Message types: ● Logging ● Queuing ● Tracking ● etl/db ● Streaming ● Push ● user. Source: Chris Riccomini https://riccomini.name/how-paint-bike-shed-kafka-topic-naming-conventions
  70. What about that software crisis that started in 1968? “We believe that the major contributor to this complexity in many systems is the handling of state and the burden that this adds when trying to analyse and reason about the system.” Out of the tar pit, 2006
  71. Our mental model: Abstraction as an Art. (Diagram labels: Event; Stream; Stream processor; Bounded context; Chained/Orchestrated bounded contexts. Pillars: Business function; Control plane; Instrumentation; Operations)
  72. Key takeaway (state). Event-streaming-driven microservices are the new atomic unit: 1. Provide simplicity (and time travel) 2. Handle state (via Kafka Streams) 3. Provide a new paradigm: convergent data and logic processing. (Diagram label: stream processor)
  73. Key takeaway (complexity) ● Event-streaming apps: model as bounded-context dataflows, handle state & scaling ● Patterns: build reusable dataflow patterns (instrumentation) ● Composition: bounded contexts chaining and layering ● Composition: choreography and orchestration
  74. This is just the beginning
  75. Questions? @avery_neil. “Journey to event driven” blog: 1. Event-first thinking 2. Programming models 3. Serverless 4. Pillars of event-streaming ms’s. https://bit.ly/2tFfU84 or @avery_neil twitter profile
  76. @avery_neil
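
The Instrumentation Plane slides aggregate completed payments into one-minute windows via `TimeWindows.of(ONE_MINUTE)`. As a rough illustration of what such a tumbling window does, the sketch below buckets events by window start time in plain Java. `WindowedThroughputSketch`, its `Stats` fields and the use of event timestamps in milliseconds are assumptions made for this example; it is not the Kafka Streams implementation.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of tumbling one-minute windows for throughput metrics. */
final class WindowedThroughputSketch {

    static final long WINDOW_MS = 60_000L;

    /** Per-window aggregate, loosely modelled on the ThroughputStats slide. */
    static final class Stats {
        long count;
        long totalCents;
        long maxLatencyMs;

        void update(long amountCents, long latencyMs) {
            count++;
            totalCents += amountCents;
            maxLatencyMs = Math.max(maxLatencyMs, latencyMs);
        }
    }

    // Window start timestamp -> aggregate; a stand-in for a windowed state store.
    private final Map<Long, Stats> windows = new TreeMap<>();

    /** Start of the tumbling window containing the timestamp (non-negative input assumed). */
    static long windowStart(long timestampMs) {
        return timestampMs - (timestampMs % WINDOW_MS);
    }

    /** Adds one completed payment to the window its timestamp falls into. */
    void record(long timestampMs, long amountCents, long latencyMs) {
        windows.computeIfAbsent(windowStart(timestampMs), start -> new Stats())
               .update(amountCents, latencyMs);
    }

    /** Returns the aggregate for the window containing the timestamp, or null if empty. */
    Stats statsFor(long timestampMs) {
        return windows.get(windowStart(timestampMs));
    }

    public static void main(String[] args) {
        WindowedThroughputSketch metrics = new WindowedThroughputSketch();
        metrics.record(1_000L, 500, 20);
        metrics.record(59_999L, 250, 35);
        metrics.record(60_000L, 100, 5); // falls into the next window
        Stats first = metrics.statsFor(0L);
        System.out.println("first window: count=" + first.count
                + " totalCents=" + first.totalCents
                + " maxLatencyMs=" + first.maxLatencyMs);
    }
}
```

Because windows never overlap, each event updates exactly one aggregate, which keeps the per-window counts and dollar totals easy to reason about.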

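The Control Plane slide models a `Status` of pause or resume and a gate processor driven by that status. One possible reading is sketched below in plain Java. `GateSketch` is a hypothetical name, and holding paused events in an in-memory list is an assumption made to keep the example self-contained; with Kafka the events would simply remain in the topic until processing resumes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch of a gate processor controlled by pause/resume events. */
final class GateSketch<T> {

    enum Status { PAUSE, RESUME }

    private Status status = Status.RESUME;
    private final List<T> held = new ArrayList<>();
    private final Consumer<T> downstream;

    GateSketch(Consumer<T> downstream) {
        this.downstream = downstream;
    }

    /** Handles a control-plane event; on resume, releases held events in arrival order. */
    void onControl(Status next) {
        status = next;
        if (next == Status.RESUME) {
            for (T event : held) {
                downstream.accept(event);
            }
            held.clear();
        }
    }

    /** Handles a business event: forwarded when open, held back when paused. */
    void onEvent(T event) {
        if (status == Status.PAUSE) {
            held.add(event);
        } else {
            downstream.accept(event);
        }
    }

    int heldCount() {
        return held.size();
    }

    public static void main(String[] args) {
        List<String> processed = new ArrayList<>();
        GateSketch<String> gate = new GateSketch<>(processed::add);
        gate.onEvent("payment-1");
        gate.onControl(Status.PAUSE);
        gate.onEvent("payment-2");
        System.out.println("while paused: " + processed + ", held=" + gate.heldCount());
        gate.onControl(Status.RESUME);
        System.out.println("after resume: " + processed);
    }
}
```

Keeping the control signal as its own event stream means the same gate can be reused for recovery, bootstrap or overload scenarios without touching the business logic.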