SlideShare una empresa de Scribd logo
1 de 32
Descargar para leer sin conexión
Concord: Simple & Flexible
Stream Processing on Apache Mesos
Shinji Kim
Co-founder, Concord Systems
@concord
@databythebay #datagrid
Overview
•  What is Stream Processing?
•  Today’s Stream Processing
•  Introducing Concord
1. Concepts & API
2. Job Topology Management
3. Operations, Toolings, Performance
4. Message Delivery Guarantees
•  Future Development Plans
Page 2
What is stream processing?
Page 3
•  Processing Data in motion
•  Sits between message queues and databases
•  Used for faster:
–  Data enrichment
–  Aggregation
–  Filtering / deduplication
Today’s Stream Processing
•  Faster MapReduce jobs à ends up running core
business logic on top
–  Fradulent click detection
–  Real-time budget updates
–  Trigger-based trading
•  Your stream processing jobs are more like microservices
•  Need support for services / application management:
Cluster mgmt, Monitoring, Debuggability
Page 4
Introducing Concord
Concord is a distributed stream processing framework
built in C++ on top of Apache Mesos, designed for
high-performance, real-time applications that require
flexibility & control.
Page 5
Introducing Concord
Page 6
Data	
  Sources	
   Data	
  Sinks	
  
Pub / Sub Operator Model
•  Composable jobs by Metadata
A	
   B	
  
words	
  Metadata(
Name=‘A’,
istreams=[],
ostreams=[‘words’])
Metadata(
Name=‘B’,
istreams=[‘words’,
StreamGrouping.GROUP_BY],
ostreams=[])
Page 7
Pub / Sub Operator Model
•  Composable jobs by Metadata
A	
   B	
  
words	
  Metadata(
Name=‘A’,
istreams=[],
ostreams=[‘words’])
Metadata(
Name=‘B’,
istreams=[‘words’,
StreamGrouping.GROUP_BY],
ostreams=[])
Page 8
C	
   Metadata(
Name=‘C’,
istreams=[‘words’,
StreamGrouping.SHUFFLE],
ostreams=[])
Simple API in Multiple Languages
•  ProcessRecord, ProduceRecord, ProcessTimer
•  GetState, SetState backed by Rocksdb
•  API available in Python, Ruby, Go, Java/Scala, C++
B	
  Metadata(
Name=‘C’,
istreams=[‘words’,
StreamGrouping.GROUP_BY],
ostreams=[‘wordcount’])
Page 9
words	
   wordcount	
  
Key	
   Value	
  
Corgi	
   2	
  
Chiwawa	
   4	
  
Dashhound	
   5	
  
Useful for multiple teams to consume the same
streaming data in real-time
Page 10
Native Integration with Apache Mesos
Page 11
•  Dynamic resource
scheduling
•  Task Isolation
•  Task supervision
•  High Availability
Containerized Execution Environment
•  Horizontal scaling
•  Multi-tenancy
•  Hot code deployment &
dynamic topology
Page 12
Mesos	
  Agent	
  
RocksDB	
  
Concord is Flexible: Run-time deployment
Page 13
Concord is Flexible: Run-time deployment
Page 14
Concord is Flexible: Run-time deployment
Page 15
Concord is Flexible: Run-time deployment
Page 16
Concord supports Distributed Tracing
Page 17
Monitor all operator instances at glance
Page 18
Concord supports Transparent Debugging
[2015-11-02 15:36:44.770] [dispatcher_latencies] [info] 127.0.0.1:31000:
traceId: -8816532120874703981,
parentId: 0, id: -6816766813334129096,
p50: 388179us, p95: 519668us, p99: 524812us, p999: 526425us
[2015-11-02 15:37:13.929] [principal_latencies] [info] 127.0.0.1:31001:
traceId: -4811311467074699790,
parentId: -7681059555040553620,
id: -1899872683843643522,
p50: 73355us, p95: 145626us, p99: 210345us, p999: 272018us
[2015-11-02 15:36:43.323] [incoming_throughput] [info] 12288 req in 1045515us. total: 367616 req
[2015-11-02 15:36:30.240] [outgoing_throughput] [info] 100000 req in 4804526us. total: 600000 req
Page 19
Concord performs well at scale
•  Word count benchmark (1.13B msgs)
–  Concord: 500K QPS/node at 10ms/event
–  Storm: 16K QPS/node at 100ms/event
–  Spark Streaming: 100K QPS/node at 1s batch window
•  Server log processing (29G server log, ~260M msgs)
–  4 nodes, 8 vCPU, 32GB RAM each
–  Concord: 1M – 1.8M QPS
–  Spark Streaming: 72K – 2M QPS
•  Consistent performance
Page 20
Concord is designed for Predictability
•  As you scale, JVM reconfiguration and GC pauses are
inevitable (Framework GC vs. Application GC)
•  Cluster abstracted as CPU, Memory, Disk numbers à
cluster optimization & overall runtime
•  Fast Compile à Test à Deploy cycle without downtime
Page 21
Message Delivery Guarantees
Today: Fast > Complete or Perfect
•  Best-effort / at-most-once processing
–  When operator or node crashes, the local cache goes away
–  Automatically retries the failed operator (number of retries is
configurable)
–  Recommends implementing check mechanisms in operators
(e.g., Concord Kafka consumer)
Page 22
Message Delivery Guarantees
Soon: Fast + Complete > Perfect
•  In development for at-least-once with Kafka
–  Kafka acts as a message bus between operators
–  Kafka replays data from checked offset (data duplication)
Eventually: Fast + Complete + Perfect
•  Transactional datastore in design phase
Page 23
Future plans
•  “At least once” guarantee support with Kafka
•  DC/OS integration
•  More data source / data sink connector support
•  Higher level DSL
Page 24
Concord: Simple & Flexible streaming application
framework on Apache Mesos
Page 25
•  Operator model that you can use multiple languages
Concord: Simple & Flexible streaming application
framework on Apache Mesos
Page 26
•  Operator model that you can use multiple languages
à Fast development and iteration time for multiple
teams using the same data
Concord: Simple & Flexible streaming application
framework on Apache Mesos
Page 27
•  Operator model that you can use multiple languages
à Fast development and iteration time for multiple
teams using the same data
•  Dynamic topology, run-time deployment and scaling
Concord: Simple & Flexible streaming application
framework on Apache Mesos
Page 28
•  Operator model that you can use multiple languages
à Fast development and iteration time for multiple
teams using the same data
•  Dynamic topology, run-time deployment and scaling
à Decoupled development & dev ops work
Concord: Simple & Flexible streaming application
framework on Apache Mesos
Page 29
•  Operator model that you can use multiple languages
à Fast development and iteration time for multiple
teams using the same data
•  Dynamic topology, run-time deployment and scaling
à Decoupled development & dev ops work
•  High performance at scale
Concord: Simple & Flexible streaming application
framework on Apache Mesos
Page 30
•  Operator model that you can use multiple languages
à Fast development and iteration time for multiple
teams using the same data
•  Dynamic topology, run-time deployment and scaling
à Decoupled development & dev ops work
•  High performance at scale
à Predictable system for real-time applications
Concord: Simple & Flexible streaming application
framework on Apache Mesos
Page 31
•  Low-latency / Real-time applications:
–  Real-time fraud detection
–  Financial market data processing for real-time risks and triggers
–  Real-time campaign management for real-time bidding (RTB)
Thank You!
Get Started: http://concord.io
shinji@concord.io / @shinjikim
@concord
@databythebay #datagrid

Más contenido relacionado

La actualidad más candente

SOLID & Design Patterns
SOLID & Design PatternsSOLID & Design Patterns
SOLID & Design PatternsGrokking VN
 
Optimizing Performance in Rust for Low-Latency Database Drivers
Optimizing Performance in Rust for Low-Latency Database DriversOptimizing Performance in Rust for Low-Latency Database Drivers
Optimizing Performance in Rust for Low-Latency Database DriversScyllaDB
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureDatabricks
 
VMware Monitoring-Discover And Monitor Your Virtual Environment
VMware Monitoring-Discover And Monitor Your Virtual EnvironmentVMware Monitoring-Discover And Monitor Your Virtual Environment
VMware Monitoring-Discover And Monitor Your Virtual EnvironmentSite24x7
 
Building an Observability platform with ClickHouse
Building an Observability platform with ClickHouseBuilding an Observability platform with ClickHouse
Building an Observability platform with ClickHouseAltinity Ltd
 
Thomas Lamirault_Mohamed Amine Abdessemed -A brief history of time with Apac...
Thomas Lamirault_Mohamed Amine Abdessemed  -A brief history of time with Apac...Thomas Lamirault_Mohamed Amine Abdessemed  -A brief history of time with Apac...
Thomas Lamirault_Mohamed Amine Abdessemed -A brief history of time with Apac...Flink Forward
 
Apache Superset at Airbnb
Apache Superset at AirbnbApache Superset at Airbnb
Apache Superset at AirbnbBill Liu
 
Centralized Logging System Using ELK Stack
Centralized Logging System Using ELK StackCentralized Logging System Using ELK Stack
Centralized Logging System Using ELK StackRohit Sharma
 
LMAX Disruptor as real-life example
LMAX Disruptor as real-life exampleLMAX Disruptor as real-life example
LMAX Disruptor as real-life exampleGuy Nir
 
Continuous Application Security at Scale with IAST and RASP -- Transforming D...
Continuous Application Security at Scale with IAST and RASP -- Transforming D...Continuous Application Security at Scale with IAST and RASP -- Transforming D...
Continuous Application Security at Scale with IAST and RASP -- Transforming D...Jeff Williams
 
Security in CI/CD Pipelines: Tips for DevOps Engineers
Security in CI/CD Pipelines: Tips for DevOps EngineersSecurity in CI/CD Pipelines: Tips for DevOps Engineers
Security in CI/CD Pipelines: Tips for DevOps EngineersDevOps.com
 
Dynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsFlink Forward
 
Intro to Telegraf
Intro to TelegrafIntro to Telegraf
Intro to TelegrafInfluxData
 
C* Summit 2013: Eventual Consistency != Hopeful Consistency by Christos Kalan...
C* Summit 2013: Eventual Consistency != Hopeful Consistency by Christos Kalan...C* Summit 2013: Eventual Consistency != Hopeful Consistency by Christos Kalan...
C* Summit 2013: Eventual Consistency != Hopeful Consistency by Christos Kalan...DataStax Academy
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsAlluxio, Inc.
 
Modeling Data and Queries for Wide Column NoSQL
Modeling Data and Queries for Wide Column NoSQLModeling Data and Queries for Wide Column NoSQL
Modeling Data and Queries for Wide Column NoSQLScyllaDB
 
Apache NiFi SDLC Improvements
Apache NiFi SDLC ImprovementsApache NiFi SDLC Improvements
Apache NiFi SDLC ImprovementsBryan Bende
 
DevSecOps and the CI/CD Pipeline
 DevSecOps and the CI/CD Pipeline DevSecOps and the CI/CD Pipeline
DevSecOps and the CI/CD PipelineJames Wickett
 
Practical Operation Automation with StackStorm
Practical Operation Automation with StackStormPractical Operation Automation with StackStorm
Practical Operation Automation with StackStormShu Sugimoto
 
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion RecordsScylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion RecordsScyllaDB
 

La actualidad más candente (20)

SOLID & Design Patterns
SOLID & Design PatternsSOLID & Design Patterns
SOLID & Design Patterns
 
Optimizing Performance in Rust for Low-Latency Database Drivers
Optimizing Performance in Rust for Low-Latency Database DriversOptimizing Performance in Rust for Low-Latency Database Drivers
Optimizing Performance in Rust for Low-Latency Database Drivers
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse Architecture
 
VMware Monitoring-Discover And Monitor Your Virtual Environment
VMware Monitoring-Discover And Monitor Your Virtual EnvironmentVMware Monitoring-Discover And Monitor Your Virtual Environment
VMware Monitoring-Discover And Monitor Your Virtual Environment
 
Building an Observability platform with ClickHouse
Building an Observability platform with ClickHouseBuilding an Observability platform with ClickHouse
Building an Observability platform with ClickHouse
 
Thomas Lamirault_Mohamed Amine Abdessemed -A brief history of time with Apac...
Thomas Lamirault_Mohamed Amine Abdessemed  -A brief history of time with Apac...Thomas Lamirault_Mohamed Amine Abdessemed  -A brief history of time with Apac...
Thomas Lamirault_Mohamed Amine Abdessemed -A brief history of time with Apac...
 
Apache Superset at Airbnb
Apache Superset at AirbnbApache Superset at Airbnb
Apache Superset at Airbnb
 
Centralized Logging System Using ELK Stack
Centralized Logging System Using ELK StackCentralized Logging System Using ELK Stack
Centralized Logging System Using ELK Stack
 
LMAX Disruptor as real-life example
LMAX Disruptor as real-life exampleLMAX Disruptor as real-life example
LMAX Disruptor as real-life example
 
Continuous Application Security at Scale with IAST and RASP -- Transforming D...
Continuous Application Security at Scale with IAST and RASP -- Transforming D...Continuous Application Security at Scale with IAST and RASP -- Transforming D...
Continuous Application Security at Scale with IAST and RASP -- Transforming D...
 
Security in CI/CD Pipelines: Tips for DevOps Engineers
Security in CI/CD Pipelines: Tips for DevOps EngineersSecurity in CI/CD Pipelines: Tips for DevOps Engineers
Security in CI/CD Pipelines: Tips for DevOps Engineers
 
Dynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data Alerts
 
Intro to Telegraf
Intro to TelegrafIntro to Telegraf
Intro to Telegraf
 
C* Summit 2013: Eventual Consistency != Hopeful Consistency by Christos Kalan...
C* Summit 2013: Eventual Consistency != Hopeful Consistency by Christos Kalan...C* Summit 2013: Eventual Consistency != Hopeful Consistency by Christos Kalan...
C* Summit 2013: Eventual Consistency != Hopeful Consistency by Christos Kalan...
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic Datasets
 
Modeling Data and Queries for Wide Column NoSQL
Modeling Data and Queries for Wide Column NoSQLModeling Data and Queries for Wide Column NoSQL
Modeling Data and Queries for Wide Column NoSQL
 
Apache NiFi SDLC Improvements
Apache NiFi SDLC ImprovementsApache NiFi SDLC Improvements
Apache NiFi SDLC Improvements
 
DevSecOps and the CI/CD Pipeline
 DevSecOps and the CI/CD Pipeline DevSecOps and the CI/CD Pipeline
DevSecOps and the CI/CD Pipeline
 
Practical Operation Automation with StackStorm
Practical Operation Automation with StackStormPractical Operation Automation with StackStorm
Practical Operation Automation with StackStorm
 
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion RecordsScylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
 

Similar a Concord: Simple & Flexible Stream Processing on Apache Mesos: Data By The Bay May 2016

Migrate to platform of your choice
Migrate to platform of your choiceMigrate to platform of your choice
Migrate to platform of your choiceAshnikbiz
 
Shaping the Future of Travel with MongoDB
Shaping the Future of Travel with MongoDBShaping the Future of Travel with MongoDB
Shaping the Future of Travel with MongoDBMongoDB
 
Move fast and make things with microservices
Move fast and make things with microservicesMove fast and make things with microservices
Move fast and make things with microservicesMithun Arunan
 
Faster, Simpler, Better - MongoDB to the rescue
Faster, Simpler, Better - MongoDB to the rescue Faster, Simpler, Better - MongoDB to the rescue
Faster, Simpler, Better - MongoDB to the rescue MongoDB
 
Webinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-ServiceWebinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-ServiceMongoDB
 
Unconference Round Table Notes
Unconference Round Table NotesUnconference Round Table Notes
Unconference Round Table NotesTimothy Spann
 
Docker:- Application Delivery Platform Towards Edge Computing
Docker:- Application Delivery Platform Towards Edge ComputingDocker:- Application Delivery Platform Towards Edge Computing
Docker:- Application Delivery Platform Towards Edge ComputingBukhary Ikhwan Ismail
 
Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022StreamNative
 
Webinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-ServiceWebinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-ServiceMongoDB
 
Dubbo and Weidian's practice on micro-service architecture
Dubbo and Weidian's practice on micro-service architectureDubbo and Weidian's practice on micro-service architecture
Dubbo and Weidian's practice on micro-service architectureHuxing Zhang
 
Accelerating a Path to Digital With a Cloud Data Strategy
Accelerating a Path to Digital With a Cloud Data StrategyAccelerating a Path to Digital With a Cloud Data Strategy
Accelerating a Path to Digital With a Cloud Data StrategyMongoDB
 
DevOps LA Meetup Intro to Habitat
DevOps LA Meetup Intro to HabitatDevOps LA Meetup Intro to Habitat
DevOps LA Meetup Intro to HabitatJessica DeVita
 
130815 - Content Delviery Networks for the IEEE Singapore Broadcast group
130815 - Content Delviery Networks for the IEEE Singapore Broadcast group130815 - Content Delviery Networks for the IEEE Singapore Broadcast group
130815 - Content Delviery Networks for the IEEE Singapore Broadcast groupPasocoPteLtd
 
Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...
Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...
Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...Continuent
 
Network Automation Journey, A systems engineer NetOps perspective
Network Automation Journey, A systems engineer NetOps perspectiveNetwork Automation Journey, A systems engineer NetOps perspective
Network Automation Journey, A systems engineer NetOps perspectiveWalid Shaari
 
.NET Cloud-Native Bootcamp- Los Angeles
.NET Cloud-Native Bootcamp- Los Angeles.NET Cloud-Native Bootcamp- Los Angeles
.NET Cloud-Native Bootcamp- Los AngelesVMware Tanzu
 
AdminCamp 2018 - ApplicationInsights für Administratoren
AdminCamp 2018 - ApplicationInsights für AdministratorenAdminCamp 2018 - ApplicationInsights für Administratoren
AdminCamp 2018 - ApplicationInsights für AdministratorenChristoph Adler
 
MuleSoft Manchester Meetup #4 slides 11th February 2021
MuleSoft Manchester Meetup #4 slides 11th February 2021MuleSoft Manchester Meetup #4 slides 11th February 2021
MuleSoft Manchester Meetup #4 slides 11th February 2021Ieva Navickaite
 

Similar a Concord: Simple & Flexible Stream Processing on Apache Mesos: Data By The Bay May 2016 (20)

Migrate to platform of your choice
Migrate to platform of your choiceMigrate to platform of your choice
Migrate to platform of your choice
 
Shaping the Future of Travel with MongoDB
Shaping the Future of Travel with MongoDBShaping the Future of Travel with MongoDB
Shaping the Future of Travel with MongoDB
 
Move fast and make things with microservices
Move fast and make things with microservicesMove fast and make things with microservices
Move fast and make things with microservices
 
Faster, Simpler, Better - MongoDB to the rescue
Faster, Simpler, Better - MongoDB to the rescue Faster, Simpler, Better - MongoDB to the rescue
Faster, Simpler, Better - MongoDB to the rescue
 
Webinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-ServiceWebinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-Service
 
Unconference Round Table Notes
Unconference Round Table NotesUnconference Round Table Notes
Unconference Round Table Notes
 
Docker:- Application Delivery Platform Towards Edge Computing
Docker:- Application Delivery Platform Towards Edge ComputingDocker:- Application Delivery Platform Towards Edge Computing
Docker:- Application Delivery Platform Towards Edge Computing
 
Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022
 
Webinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-ServiceWebinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-Service
 
Dubbo and Weidian's practice on micro-service architecture
Dubbo and Weidian's practice on micro-service architectureDubbo and Weidian's practice on micro-service architecture
Dubbo and Weidian's practice on micro-service architecture
 
Accelerating a Path to Digital With a Cloud Data Strategy
Accelerating a Path to Digital With a Cloud Data StrategyAccelerating a Path to Digital With a Cloud Data Strategy
Accelerating a Path to Digital With a Cloud Data Strategy
 
DevOps LA Meetup Intro to Habitat
DevOps LA Meetup Intro to HabitatDevOps LA Meetup Intro to Habitat
DevOps LA Meetup Intro to Habitat
 
130815 - Content Delviery Networks for the IEEE Singapore Broadcast group
130815 - Content Delviery Networks for the IEEE Singapore Broadcast group130815 - Content Delviery Networks for the IEEE Singapore Broadcast group
130815 - Content Delviery Networks for the IEEE Singapore Broadcast group
 
Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...
Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...
Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...
 
Network Automation Journey, A systems engineer NetOps perspective
Network Automation Journey, A systems engineer NetOps perspectiveNetwork Automation Journey, A systems engineer NetOps perspective
Network Automation Journey, A systems engineer NetOps perspective
 
Streaming in the Wild with Apache Flink
Streaming in the Wild with Apache FlinkStreaming in the Wild with Apache Flink
Streaming in the Wild with Apache Flink
 
.NET Cloud-Native Bootcamp- Los Angeles
.NET Cloud-Native Bootcamp- Los Angeles.NET Cloud-Native Bootcamp- Los Angeles
.NET Cloud-Native Bootcamp- Los Angeles
 
AdminCamp 2018 - ApplicationInsights für Administratoren
AdminCamp 2018 - ApplicationInsights für AdministratorenAdminCamp 2018 - ApplicationInsights für Administratoren
AdminCamp 2018 - ApplicationInsights für Administratoren
 
Robotics technical Presentation
Robotics technical PresentationRobotics technical Presentation
Robotics technical Presentation
 
MuleSoft Manchester Meetup #4 slides 11th February 2021
MuleSoft Manchester Meetup #4 slides 11th February 2021MuleSoft Manchester Meetup #4 slides 11th February 2021
MuleSoft Manchester Meetup #4 slides 11th February 2021
 

Último

Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 

Último (20)

Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

Concord: Simple & Flexible Stream Processing on Apache Mesos: Data By The Bay May 2016

  • 1. Concord: Simple & Flexible Stream Processing on Apache Mesos Shinji Kim Co-founder, Concord Systems @concord @databythebay #datagrid
  • 2. Overview •  What is Stream Processing? •  Today’s Stream Processing •  Introducing Concord 1. Concepts & API 2. Job Topology Management 3. Operations, Toolings, Performance 4. Message Delivery Guarantees •  Future Development Plans Page 2
  • 3. What is stream processing? Page 3 •  Processing Data in motion •  Sits between message queues and databases •  Used for faster: –  Data enrichment –  Aggregation –  Filtering / deduplication
  • 4. Today’s Stream Processing •  Faster MapReduce jobs à ends up running core business logic on top –  Fradulent click detection –  Real-time budget updates –  Trigger-based trading •  Your stream processing jobs are more like microservices •  Need support for services / application management: Cluster mgmt, Monitoring, Debuggability Page 4
  • 5. Introducing Concord Concord is a distributed stream processing framework built in C++ on top of Apache Mesos, designed for high-performance, real-time applications that require flexibility & control. Page 5
  • 6. Introducing Concord Page 6 Data  Sources   Data  Sinks  
  • 7. Pub / Sub Operator Model •  Composable jobs by Metadata A   B   words  Metadata( Name=‘A’, istreams=[], ostreams=[‘words’]) Metadata( Name=‘B’, istreams=[‘words’, StreamGrouping.GROUP_BY], ostreams=[]) Page 7
  • 8. Pub / Sub Operator Model •  Composable jobs by Metadata A   B   words  Metadata( Name=‘A’, istreams=[], ostreams=[‘words’]) Metadata( Name=‘B’, istreams=[‘words’, StreamGrouping.GROUP_BY], ostreams=[]) Page 8 C   Metadata( Name=‘C’, istreams=[‘words’, StreamGrouping.SHUFFLE], ostreams=[])
  • 9. Simple API in Multiple Languages •  ProcessRecord, ProduceRecord, ProcessTimer •  GetState, SetState backed by Rocksdb •  API available in Python, Ruby, Go, Java/Scala, C++ B  Metadata( Name=‘C’, istreams=[‘words’, StreamGrouping.GROUP_BY], ostreams=[‘wordcount’]) Page 9 words   wordcount   Key   Value   Corgi   2   Chiwawa   4   Dashhound   5  
  • 10. Useful for multiple teams to consume the same streaming data in real-time Page 10
  • 11. Native Integration with Apache Mesos Page 11 •  Dynamic resource scheduling •  Task Isolation •  Task supervision •  High Availability
  • 12. Containerized Execution Environment •  Horizontal scaling •  Multi-tenancy •  Hot code deployment & dynamic topology Page 12 Mesos  Agent   RocksDB  
  • 13. Concord is Flexible: Run-time deployment Page 13
  • 14. Concord is Flexible: Run-time deployment Page 14
  • 15. Concord is Flexible: Run-time deployment Page 15
  • 16. Concord is Flexible: Run-time deployment Page 16
  • 17. Concord supports Distributed Tracing Page 17
  • 18. Monitor all operator instances at glance Page 18
  • 19. Concord supports Transparent Debugging [2015-11-02 15:36:44.770] [dispatcher_latencies] [info] 127.0.0.1:31000: traceId: -8816532120874703981, parentId: 0, id: -6816766813334129096, p50: 388179us, p95: 519668us, p99: 524812us, p999: 526425us [2015-11-02 15:37:13.929] [principal_latencies] [info] 127.0.0.1:31001: traceId: -4811311467074699790, parentId: -7681059555040553620, id: -1899872683843643522, p50: 73355us, p95: 145626us, p99: 210345us, p999: 272018us [2015-11-02 15:36:43.323] [incoming_throughput] [info] 12288 req in 1045515us. total: 367616 req [2015-11-02 15:36:30.240] [outgoing_throughput] [info] 100000 req in 4804526us. total: 600000 req Page 19
  • 20. Concord performs well at scale •  Word count benchmark (1.13B msgs) –  Concord: 500K QPS/node at 10ms/event –  Storm: 16K QPS/node at 100ms/event –  Spark Streaming: 100K QPS/node at 1s batch window •  Server log processing (29G server log, ~260M msgs) –  4 nodes, 8 vCPU, 32GB RAM each –  Concord: 1M – 1.8M QPS –  Spark Streaming: 72K – 2M QPS •  Consistent performance Page 20
  • 21. Concord is designed for Predictability •  As you scale, JVM reconfiguration and GC pauses are inevitable (Framework GC vs. Application GC) •  Cluster abstracted as CPU, Memory, Disk numbers à cluster optimization & overall runtime •  Fast Compile à Test à Deploy cycle without downtime Page 21
  • 22. Message Delivery Guarantees Today: Fast > Complete or Perfect •  Best-effort / at-most-once processing –  When operator or node crashes, the local cache goes away –  Automatically retries the failed operator (number of retries is configurable) –  Recommends implementing check mechanisms in operators (e.g., Concord Kafka consumer) Page 22
  • 23. Message Delivery Guarantees Soon: Fast + Complete > Perfect •  In development for at-least-once with Kafka –  Kafka acts as a message bus between operators –  Kafka replays data from checked offset (data duplication) Eventually: Fast + Complete + Perfect •  Transactional datastore in design phase Page 23
  • 24. Future plans •  “At least once” guarantee support with Kafka •  DC/OS integration •  More data source / data sink connector support •  Higher level DSL Page 24
  • 25. Concord: Simple & Flexible streaming application framework on Apache Mesos Page 25 •  Operator model that you can use multiple languages
  • 26. Concord: Simple & Flexible streaming application framework on Apache Mesos Page 26 •  Operator model that you can use multiple languages à Fast development and iteration time for multiple teams using the same data
  • 27. Concord: Simple & Flexible streaming application framework on Apache Mesos Page 27 •  Operator model that you can use multiple languages à Fast development and iteration time for multiple teams using the same data •  Dynamic topology, run-time deployment and scaling
  • 28. Concord: Simple & Flexible streaming application framework on Apache Mesos Page 28 •  Operator model that you can use multiple languages à Fast development and iteration time for multiple teams using the same data •  Dynamic topology, run-time deployment and scaling à Decoupled development & dev ops work
  • 29. Concord: Simple & Flexible streaming application framework on Apache Mesos Page 29 •  Operator model that you can use multiple languages à Fast development and iteration time for multiple teams using the same data •  Dynamic topology, run-time deployment and scaling à Decoupled development & dev ops work •  High performance at scale
  • 30. Concord: Simple & Flexible streaming application framework on Apache Mesos Page 30 •  Operator model that you can use multiple languages à Fast development and iteration time for multiple teams using the same data •  Dynamic topology, run-time deployment and scaling à Decoupled development & dev ops work •  High performance at scale à Predictable system for real-time applications
  • 31. Concord: Simple & Flexible streaming application framework on Apache Mesos Page 31 •  Low-latency / Real-time applications: –  Real-time fraud detection –  Financial market data processing for real-time risks and triggers –  Real-time campaign management for real-time bidding (RTB)
  • 32. Thank You! Get Started: http://concord.io shinji@concord.io / @shinjikim @concord @databythebay #datagrid