Real-time analytics as a service at King


This talk introduces RBea, our scalable real-time analytics platform at King built on top of Apache Flink. The design goal of RBea is to make stream analytics easily accessible to game teams across King. RBea is powered by Apache Flink and uses the framework's capabilities to its full potential in order to provide highly scalable, stateful, and windowed processing logic for analytics applications. RBea provides a high-level scripting DSL that is more approachable to developers without stream-processing experience and uses code generation to execute user scripts efficiently at scale.

In this talk, I will cover the technical details of the RBea architecture and will also look at what real-time analytics brings to the table from the business perspective. If time permits, I will also give an outlook on our future plans to generalise and further grow the platform.


Real-time analytics as a service at King

  1. Real-Time Analytics as a Service at King. Gyula Fóra, Data Warehouse Engineer, Apache Flink PMC
  2. About King: we make awesome mobile games; 463 million monthly active users; 30 billion events per day; and a lot of data…
  3. From a streaming perspective (diagram): 30 billion events / day flow into analytics/processing applications that hold terabytes of state, alongside several databases.
  4. This is awesome, but… end users are often not Java/Scala developers; writing streaming applications is pretty hard; large state and windowing don't help either; "seems to work in my IDE, what next?" We need a "turnkey" solution.
  5. The RBea platform: powered by Apache Flink; scripting on the live streams; window aggregates; stateful computations; scalable + fault tolerant.
  6. RBea architecture (diagram): events in and output out; a REST API, the RBea web frontend, and libraries used by data scientists.
  7. RBea backend implementation: one stateful Flink job per game streams both events and scripts; events are partitioned by user id, scripts are broadcast; a CoFlatMap adds/removes scripts and, for every event, loops over the deployed scripts and emits output based on API calls; output/aggregation happens downstream (a sketch of this wiring follows the slide list).
  8. Dissecting the DSL:

         @ProcessEvent(semanticClass=SCPurchase.class)
         def process(SCPurchase purchase, Output out, Aggregators agg) {
             long amount = purchase.getAmount()
             String curr = purchase.getCurrency()
             out.writeToKafka("purchases", curr + "\t" + amount)
             Counter numPurchases = agg.getCounter("PurchaseCount", MINUTES_10)
             numPurchases.increment()
         }
  9. Dissecting the DSL (same script as slide 8): processing methods are selected by annotation; the semantic class acts as an event filter condition; the argument list is flexible; scripts are code-generated into Java classes implementing void processEvent(Event e, Context ctx) (a sketch of the generated shape follows the slide list).
  10. Dissecting the DSL (same script): output calls create Output events, e.g. Output(KAFKA, "purchases", "…"); these events are filtered downstream and sent to a Kafka sink (a sketch follows the slide list).
  11. Dissecting the DSL (same script): aggregator calls create Aggregate events, e.g. Aggr(MYSQL, 60000, "PurchaseCount", 1); Flink window operators do the aggregation.
  12. Aggregators: event-time windows with one window size per aggregator, assigned dynamically (a window-assigner sketch follows the slide list):

          long size = aggregate.getWindowSize();
          long start = timestamp - (timestamp % size);
          long end = start + size;
          TimeWindow tw = new TimeWindow(start, end);

      (Diagram: two scripts whose aggregators, e.g. NumGames, Revenue and MyAggregator, each produce one value per window W1/W2.)
  13. RBea physical plan (diagram).
  14. How do we run Flink: standalone => YARN; a few heavy streaming jobs => more and more of them; RocksDB state backend; custom deployment/monitoring tools (a configuration sketch follows the slide list).
  15. Monitoring our jobs.
  16. King Streaming SDK (sneak preview). Goal: bridge the gap between RBea and Flink; build data pipelines from RBea processors; strict event format, limited set of operations; easy "stream joins" and pattern matching; a thin wrapper around Flink.
  17. King Streaming SDK (sneak preview):

         Last<GameStart> lastGS = Last.semanticClass(GameStart.class);

         readFromKafka("event.myevents.log", "gyula")
             .keyByCoreUserID()
             .join(lastGS)
             .process((event, context) -> {
                 context.getJoined(lastGS).ifPresent(lastGameStart -> {
                     context.getAggregators()
                         .getCounter("Purchases", MINUTES_10)
                         .setDimensions(lastGameStart.getLevel())
                         .increment();
                 });
             });
  18. Closing: RBea makes streaming accessible to every data scientist at King; we leverage Flink's stateful and windowed processing capabilities; people love it because it's simple and powerful.
  19. Thank you!
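
The slides above reference a few code sketches; they follow here. First, a minimal sketch of the backend wiring on slide 7: events keyed by user id are connected with a broadcast stream of scripts, and a CoFlatMap keeps the deployed scripts and runs every event through them. Event, Script, OutputRecord, EventSource and ScriptSource are placeholder names for this illustration, not RBea's actual classes.

    // Sketch of the per-game RBea job described on slide 7 (Flink 1.x DataStream API).
    // Event, Script, OutputRecord, EventSource and ScriptSource are placeholders.
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
    import org.apache.flink.util.Collector;

    import java.util.HashMap;
    import java.util.Map;

    public class RbeaJobSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            DataStream<Event> events = env.addSource(new EventSource());    // game events
            DataStream<Script> scripts = env.addSource(new ScriptSource()); // deployed user scripts

            DataStream<OutputRecord> output = events
                    .keyBy(e -> e.getUserId())        // events are partitioned by user id
                    .connect(scripts.broadcast())     // scripts are broadcast to every subtask
                    .flatMap(new CoFlatMapFunction<Event, Script, OutputRecord>() {

                        private final Map<String, Script> deployed = new HashMap<>();

                        @Override
                        public void flatMap1(Event event, Collector<OutputRecord> out) {
                            // Loop over the deployed scripts and process the event;
                            // scripts emit Output/Aggregate records through the collector.
                            for (Script script : deployed.values()) {
                                script.process(event, out);
                            }
                        }

                        @Override
                        public void flatMap2(Script script, Collector<OutputRecord> out) {
                            // Add/remove scripts; a production job would keep this in Flink state.
                            deployed.put(script.getId(), script);
                        }
                    });

            // Output and aggregation happen in downstream operators (slides 10-12).
            env.execute("RBea job sketch");
        }
    }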
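
Slide 9 notes that each script is code-generated into a Java class exposing void processEvent(Event e, Context ctx). The following is a hypothetical illustration of what such a generated class could look like for the purchase script; the interface names and Context accessors are assumptions made here, not RBea's actual generated code.

    // Hypothetical shape of the code-generation target from slide 9.
    // Event, Context, Output, Aggregators and SCPurchase mirror the DSL example,
    // but the exact interfaces are assumptions made for this illustration.
    interface GeneratedScript {
        void processEvent(Event e, Context ctx);
    }

    class GeneratedPurchaseScript implements GeneratedScript {
        @Override
        public void processEvent(Event e, Context ctx) {
            // The @ProcessEvent(semanticClass = SCPurchase.class) annotation becomes
            // a filter: only matching events reach the script body.
            if (!(e instanceof SCPurchase)) {
                return;
            }
            SCPurchase purchase = (SCPurchase) e;

            long amount = purchase.getAmount();
            String curr = purchase.getCurrency();

            // out.writeToKafka(...) in the DSL produces an Output event downstream.
            ctx.getOutput().writeToKafka("purchases", curr + "\t" + amount);

            // agg.getCounter(...).increment() produces an Aggregate event;
            // MINUTES_10 is the window-size constant used in the DSL example.
            ctx.getAggregators().getCounter("PurchaseCount", MINUTES_10).increment();
        }
    }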
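
Slide 10 says that Output events are filtered downstream and sent to a Kafka sink. Continuing from the output stream in the first sketch, and assuming the Kafka producer that shipped with the Flink 1.x connector (FlinkKafkaProducer09) plus placeholder OutputRecord accessors:

    // Sketch of the downstream output path from slide 10 (Flink 1.x Kafka connector).
    // OutputRecord, OutputTarget, getTarget() and getPayload() are placeholders.
    Properties props = new Properties();
    props.setProperty("bootstrap.servers", "kafka:9092"); // placeholder broker address

    output
        .filter(rec -> rec.getTarget() == OutputTarget.KAFKA) // keep only Kafka output events
        .map(rec -> rec.getPayload())                         // e.g. "EUR\t100"
        .addSink(new FlinkKafkaProducer09<>("purchases", new SimpleStringSchema(), props));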
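
Slide 12's per-aggregator window assignment can be expressed in Flink as a custom WindowAssigner. The sketch below reuses the assignment logic shown on the slide under that assumption; AggregateEvent is a placeholder type, and this is not RBea's actual implementation.

    // Sketch of a dynamic, per-aggregator event-time window assigner (slide 12, Flink 1.x API).
    // AggregateEvent is a placeholder carrying the aggregator's window size.
    import java.util.Collection;
    import java.util.Collections;

    import org.apache.flink.api.common.ExecutionConfig;
    import org.apache.flink.api.common.typeutils.TypeSerializer;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.windowing.assigners.WindowAssigner;
    import org.apache.flink.streaming.api.windowing.triggers.EventTimeTrigger;
    import org.apache.flink.streaming.api.windowing.triggers.Trigger;
    import org.apache.flink.streaming.api.windowing.windows.TimeWindow;

    public class PerAggregatorWindows extends WindowAssigner<Object, TimeWindow> {

        @Override
        public Collection<TimeWindow> assignWindows(
                Object element, long timestamp, WindowAssignerContext context) {
            // The window size comes from the aggregate event itself, so different
            // aggregators can live in differently sized event-time windows.
            AggregateEvent aggregate = (AggregateEvent) element;
            long size = aggregate.getWindowSize();
            long start = timestamp - (timestamp % size);
            long end = start + size;
            return Collections.singletonList(new TimeWindow(start, end));
        }

        @Override
        public Trigger<Object, TimeWindow> getDefaultTrigger(StreamExecutionEnvironment env) {
            return EventTimeTrigger.create(); // fire when the event-time watermark passes the window
        }

        @Override
        public TypeSerializer<TimeWindow> getWindowSerializer(ExecutionConfig executionConfig) {
            return new TimeWindow.Serializer();
        }

        @Override
        public boolean isEventTime() {
            return true;
        }
    }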
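
Slide 14 mentions moving from standalone clusters to YARN and adopting the RocksDB state backend. A minimal sketch of the corresponding job configuration in Flink 1.x; the checkpoint interval and HDFS path are placeholders, not King's actual settings.

    // Sketch: checkpointing with the RocksDB state backend (slide 14, Flink 1.x).
    // Requires the flink-statebackend-rocksdb dependency; runs inside a
    // main(String[] args) throws Exception method.
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    // Snapshot all operator/keyed state periodically for fault tolerance.
    env.enableCheckpointing(60_000); // every 60 seconds (placeholder interval)

    // Keep large keyed state in RocksDB on local disk and checkpoint it to HDFS,
    // so state can grow well beyond the JVM heap.
    env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints"));

On YARN, such a job would typically be submitted with the Flink CLI (e.g. flink run -m yarn-cluster ...), though per the slide King uses its own deployment and monitoring tooling around this.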
