building a system for machine and
event-oriented data
e. sammer | @esammer | may 28, 2015
velocity, santa clara, 2015
© 2015 Rocana, Inc. All Rights Reserved.
context
me
• i work here: rocana – cto and cofounder
• i used to work here: cloudera (’10 – ’14), magnetic, experian, …
• i do this: systems / distributed systems (storage, query, messaging, ...)
• i wrote this:
if marie (my editor) is in the room,
yes, i’m hard at work on the second
edition. honest. hi marie.
what we do
• we build a system for the operation of modern data centers
• triage and diagnostics, exploration, trends, advanced analytics of complex
systems
• our data: logs, metrics, human activity, anything that occurs in the data center
• “enterprise software” (i.e. we build for others.)
• today: how we built what we built
[figure: helpful venn diagram of today’s events; labels: coffee!, selling, speaking]
our typical customer use cases
• ~100K events / sec (8.6B events / day), sub-second end to end latency, full
fidelity retention, critical use cases
• quality of service - “are credit card transactions happening fast enough?”
• fraud detection - “detect, investigate, prosecute, and learn from fraud.”
• forensic diagnostics - “what really caused the outage last friday?”
• security - “who’s doing what, where, when, why, and how, and is that ok?”
• user behavior - “capture and correlate user behavior with system performance,
then feed it to downstream systems in realtime.”
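the daily figure above follows directly from the per-second rate. a quick sanity check in python (the variable names are ours, not from the talk):

```python
# Sanity check of the throughput figures quoted on the slide.
EVENTS_PER_SEC = 100_000          # "~100K events / sec" from the slide
SECONDS_PER_DAY = 24 * 60 * 60    # 86,400 seconds

events_per_day = EVENTS_PER_SEC * SECONDS_PER_DAY
print(f"{events_per_day / 1e9:.2f}B events / day")  # prints "8.64B events / day"
```

the slide rounds this to 8.6B.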
depth: 3 meters
high level architecture
guarantees
• no single point of failure exists
• all components scale horizontally[1]
• data retention and latency are a function of cost, not tech[1]
• every event is delivered provided no more than N - 1 failures occur (where N is
the kafka replication factor)
• all operations, including upgrade, are online[2]
• every event is (or appears to be) delivered exactly once[3]
[1] we’re positive there’s a limit, but thus far it has been cost.
[2] from the user’s perspective, at a system level.
[3] when queried via our UI. lots of details here.
events
modeling our world
• everything is an event
• each event contains a timestamp, type, location, host, service, body, and type-
specific attributes (k/v pairs)
• build specialized aggregates as necessary - just optimized views of the data
event schema
{
ts: long,
event_type_id: int,
location: string,
host: string,
service: string,
body: [ null, string ],
attributes: map<string>
}
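one way to picture the schema above in code is a small immutable record. this is a sketch, not rocana's implementation: the class name `Event`, the millisecond unit for `ts` (inferred from the `ts / 60000` minute bins later in the deck), and the sample values are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical in-memory mirror of the event schema on the slide."""
    ts: int                      # observation time; epoch milliseconds is an assumption
    event_type_id: int           # tells downstream systems how to read body/attributes
    location: str
    host: str
    service: str
    body: Optional[str] = None   # the schema allows null or string
    attributes: Dict[str, str] = field(default_factory=dict)  # type-specific k/v pairs


# Illustrative values only; the timestamp is made up.
example = Event(
    ts=1432800000000,
    event_type_id=100,
    location="us-west-2a",
    host="w2a-demo-1",
    service="dhclient",
    attributes={"syslog_severity": "6", "syslog_facility": "3"},
)
print(example.event_type_id, example.attributes["syslog_facility"])
```

keeping `attributes` as string-to-string matches the k/v pairs on the slide; anything typed is left to the consumer that knows the event type.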
event types
• some event types are standard
– syslog, http, log4j, generic text record, …
• users define custom event types
• producers populate event type
• transformations can turn one event type into another
• event type metadata tells downstream systems how to interpret body and
attributes
ex: generic syslog event
event_type_id: 100, // rfc3164, rfc5424 (syslog)
body: … // raw syslog message bytes
attributes: { // extracted fields from body
  syslog_message: "DHCPACK from 10.10.0.1 (xid=0x45b63bdc)",
  syslog_severity: "6", // info severity
  syslog_facility: "3", // daemon facility
  syslog_process: "dhclient",
  syslog_pid: "668",
  …
}
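the facility and severity attributes above come packed into a single PRI number at the front of a syslog message (PRI = facility * 8 + severity in both rfc3164 and rfc5424). a minimal sketch of unpacking it; the function name `decode_pri` is ours, and whether rocana's parser does it this way is not stated in the talk.

```python
from typing import Tuple


def decode_pri(pri: int) -> Tuple[int, int]:
    """Split a syslog PRI value into (facility, severity)."""
    if not 0 <= pri <= 191:  # 24 facilities * 8 severities
        raise ValueError(f"PRI out of range: {pri}")
    return divmod(pri, 8)


# PRI 30 corresponds to the slide's values: daemon facility, info severity.
facility, severity = decode_pri(30)
attributes = {
    "syslog_facility": str(facility),
    "syslog_severity": str(severity),
}
print(attributes)  # {'syslog_facility': '3', 'syslog_severity': '6'}
```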
ex: generic http event
event_type_id: 102, // generic http event
body: … // raw http log message bytes
attributes: {
  http_req_method: "GET",
  http_req_vhost: "w2a-demo-02",
  http_req_path: "/api/v1/search?q=service%3Asshd&p=1&s=200",
  http_req_query: "q=service%3Asshd&p=1&s=200",
  http_resp_code: "200",
  …
}
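the `http_req_query` attribute above is derivable from `http_req_path`. a small sketch using python's standard `urllib.parse`; this only illustrates the relationship between the two fields and is not the parser the talk describes.

```python
from urllib.parse import parse_qs, urlsplit

# Request path taken verbatim from the slide.
path = "/api/v1/search?q=service%3Asshd&p=1&s=200"

parts = urlsplit(path)
attributes = {
    "http_req_path": path,
    "http_req_query": parts.query,  # everything after the "?"
}
params = parse_qs(parts.query)      # percent-decoding happens here

print(attributes["http_req_query"])  # q=service%3Asshd&p=1&s=200
print(params)                        # {'q': ['service:sshd'], 'p': ['1'], 's': ['200']}
```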
consumers
• …do most of the work
• parallelism
• kafka offset management
• message de-duplication
• transformation (embedded library)
• dead letter queue support
• downstream system knowledge
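to make a few of these responsibilities concrete, here is a toy, in-memory sketch of de-duplication, transformation, and a dead letter queue. everything in it is hypothetical: the message id field, the `consume` function, and the unbounded `seen_ids` set are simplifications, and no kafka client is involved.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

Payload = Dict[str, str]
Message = Tuple[str, Payload]  # (message id, payload); the id is an assumption


def consume(messages: Iterable[Message],
            transform: Callable[[Payload], dict],
            seen_ids: Set[str],
            dead_letters: List[Tuple[Message, str]]) -> List[dict]:
    """Skip already-seen ids, transform the rest, park failures in dead_letters."""
    delivered = []
    for msg_id, payload in messages:
        if msg_id in seen_ids:
            continue  # duplicate delivery; drop it
        seen_ids.add(msg_id)
        try:
            delivered.append(transform(payload))
        except Exception as exc:
            # Keep the original message and the error instead of blocking the stream.
            dead_letters.append(((msg_id, payload), repr(exc)))
    return delivered


def parse_status(payload: Payload) -> dict:
    """Example transformation: turn the response code into an int."""
    return {"http_resp_code": int(payload["http_resp_code"])}


batch = [
    ("a", {"http_resp_code": "200"}),
    ("a", {"http_resp_code": "200"}),   # redelivered duplicate
    ("b", {"http_resp_code": "oops"}),  # transformation fails
    ("c", {"http_resp_code": "404"}),
]
seen: Set[str] = set()
dlq: List[Tuple[Message, str]] = []
delivered = consume(batch, parse_status, seen, dlq)
print(delivered)  # [{'http_resp_code': 200}, {'http_resp_code': 404}]
print(len(dlq))   # 1
```

a real consumer would bound the set of seen ids and commit offsets only after the downstream write succeeds; both are left out here.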
inside a consumer
metrics and time series
aggregation
• mostly for time series metrics
• two halves: on write and on query
• data model: (dimensions) => (aggregates)
• on write
– reduce(a: A, b: A): B over window
– store “base” aggregates, all associative and commutative
• on query
– perform same aggregate or build non-associative/commutative aggregates
– group by the same dimensions
– we use SQL (Impala)
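a sketch of the on-write / on-query split, assuming four base aggregates (count, sum, min, max). the names `Agg`, `from_value`, and `merge` are ours. "last" is left out because it needs a timestamp carried alongside to stay order-independent.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Agg:
    """Base aggregates that are safe to merge in any order."""
    count: int
    total: float
    minimum: float
    maximum: float


def from_value(v: float) -> Agg:
    """Lift a single observation into an aggregate."""
    return Agg(1, v, v, v)


def merge(a: Agg, b: Agg) -> Agg:
    """Associative and commutative combine of two aggregates."""
    return Agg(a.count + b.count,
               a.total + b.total,
               min(a.minimum, b.minimum),
               max(a.maximum, b.maximum))


values = [1, 10, 4, 8]  # illustrative observations in one window

# On write: reduce the window down to base aggregates.
window = reduce(merge, map(from_value, values))
# Merge order does not matter, which is what makes partial results combinable.
assert window == reduce(merge, map(from_value, reversed(values)))

# On query: build a non-associative aggregate (the mean) from the base ones.
mean = window.total / window.count
print(window, mean)
```

storing the mean directly would not work: two means cannot be merged without their counts, which is why only the base aggregates are written.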
aside: late arriving data (it’s a thing)
• never trust a (wall) clock
• producer determines observation time, rest of the system uses this always
• data that shows up late is always processed according to observation time
• aggregation consequences
– the same time window can appear multiple times
– solution: aggregate every N seconds, potentially generating multiple aggregates for
the same time bin
• this is real and you must deal with it
– do what we did or
– build a system that mutates/replaces aggregates already output (eww) or
– delay aggregate output for some slop time; drop late data that shows up after that
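a toy illustration of the first option: aggregate on a fixed flush interval, bin by observation time, and accept that a late event produces a second partial aggregate for a bin already output. the `flush` helper, the (count, sum) pair, and the sample events are assumptions for the sketch.

```python
from collections import defaultdict

WINDOW_MS = 60_000  # one-minute bins


def flush(events):
    """Aggregate one flush interval's (ts_ms, value) events into {bin_start: (count, sum)}."""
    bins = defaultdict(lambda: [0, 0])
    for ts, value in events:
        start = ts - ts % WINDOW_MS  # bin by observation time, not arrival time
        bins[start][0] += 1
        bins[start][1] += value
    return {start: tuple(cs) for start, cs in bins.items()}


first_flush = flush([(5_000, 2), (20_000, 3), (65_000, 1)])
second_flush = flush([(70_000, 4), (30_000, 5)])  # (30_000, 5) arrived late

# Append-only output: bin 0 now appears twice.
output = list(first_flush.items()) + list(second_flush.items())

# Rolling up at query time merges the partial aggregates per bin.
rolled = defaultdict(lambda: (0, 0))
for start, (count, total) in output:
    prev_count, prev_total = rolled[start]
    rolled[start] = (prev_count + count, prev_total + total)

print(dict(rolled))  # {0: (3, 10), 60000: (2, 5)}
```

nothing already written is mutated; the cost is that readers must always group by the bin.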
ex: service event volume by host and minute
• dimensions: ts, window, location, host, service, metric
• on write, aggregates: count, sum, min, max, last
• epoch, 60000, us-west-2a, w2a-demo-1, sshd, event_volume =>
17, 42, 1, 10, 8
• on query:
– SELECT floor(ts / 60000) as bin, host, service, metric, sum(value_sum) FROM
events WHERE ts BETWEEN x AND y AND metric = 'event_volume' GROUP BY
bin, host, service, metric
• if late arriving data existed in events, the same dimensions would repeat with
another set of aggregates and would be rolled up as a result of the group by
• tl;dr: normal window aggregation operations
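the query on this slide can be tried end to end with python's built-in sqlite3 as a stand-in for impala. this is only a sketch: the table layout beyond `value_sum`, the timestamps, and the late row's values are made up, and sqlite's integer division replaces `floor(...)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    "ts INTEGER, host TEXT, service TEXT, metric TEXT, value_sum INTEGER)"
)
rows = [
    (0, "w2a-demo-1", "sshd", "event_volume", 42),      # sum from the slide's example
    (0, "w2a-demo-1", "sshd", "event_volume", 5),       # late partial, same minute
    (60000, "w2a-demo-1", "sshd", "event_volume", 20),  # next minute
]
conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?, ?)", rows)

# ts is an INTEGER column, so ts / 60000 is integer division in SQLite.
query = """
SELECT ts / 60000 AS bin, host, service, metric, sum(value_sum)
FROM events
WHERE ts BETWEEN ? AND ? AND metric = 'event_volume'
GROUP BY bin, host, service, metric
ORDER BY bin
"""
result = conn.execute(query, (0, 119999)).fetchall()
print(result)
# [(0, 'w2a-demo-1', 'sshd', 'event_volume', 47), (1, 'w2a-demo-1', 'sshd', 'event_volume', 20)]
```

the two rows for minute 0 collapse into one, which is the roll-up of late arriving data described above.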
extension, pain, and advice
extending the system
• custom producers
• custom consumers
• event types
• parser / transformation plugins
• custom metric definition and aggregate functions
• custom processing jobs on landed data
pain (aka: the struggle is real)
• lots of tradeoffs when picking a stream processing solution
– samza: right features, but low level programming model, not supported by vendors.
missing security features.
– storm: too rigid, too slow. not supported by all Hadoop vendors.
– spark streaming: tons of issues initially, but lots of community energy. improving.
– @digitallogic: “my heart says samza, but my head says spark streaming.”
– our (current) needs are meager; do work inside consumers.
• stack complexity, (relative im)maturity
• scaling solr cloud to billions of events per day
if you’re going to try this…
• read all the literature on stream processing[1]
• treat it like the distributed systems problem it is
• understand, make, and make good on guarantees
• find the right abstractions
• never trust the hand waving or “hello worlds”
• fully evaluate the projects/products in this space
• understand it’s not just about search
[1] wait, like all of it? yea, like all of it.
things I didn’t talk about
• reprocessing data when bad code / transformations are detected
• dealing with data quality issues (“the struggle is real” part 2)
• the user interface and all the fancy analytics
– data visualization and exploration
– event search
– anomalous trend and event detection
– metric, source, and event correlation
– motif finding
– noise reduction and dithering
• event delivery semantics (e.g. at least once, exactly once, etc.)
• alerting
questions?
thank you.
@esammer | esammer@rocana.com

FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 

Building a system for machine and event-oriented data - Velocity, Santa Clara 2015

  • 1. building a system for machine and event-oriented data e. sammer | @esammer | may 28, 2015 velocity, santa clara, 2015
  • 2. © 2015 Rocana, Inc. All Rights Reserved. context
  • 3. me
      • i work here: rocana – cto and cofounder
      • i used to work here: cloudera (’10 – ’14), magnetic, experian, …
      • i do this: systems / distributed systems (storage, query, messaging, ...)
      • i wrote this:
      if marie (my editor) is in the room, yes, i’m hard at work on the second edition. honest. hi marie.
  • 4. what we do
      • we build a system for the operation of modern data centers
      • triage and diagnostics, exploration, trends, advanced analytics of complex systems
      • our data: logs, metrics, human activity, anything that occurs in the data center
      • “enterprise software” (i.e. we build for others.)
      • today: how we built what we built
      [figure: helpful venn diagram of today’s events (coffee!, selling, speaking)]
  • 5. our typical customer use cases
      • ~100K events / sec (8.6B events / day), sub-second end to end latency, full fidelity retention, critical use cases
      • quality of service - “are credit card transactions happening fast enough?”
      • fraud detection - “detect, investigate, prosecute, and learn from fraud.”
      • forensic diagnostics - “what really caused the outage last friday?”
      • security - “who’s doing what, where, when, why, and how, and is that ok?”
      • user behavior - “capture and correlate user behavior with system performance, then feed it to downstream systems in realtime.”
  • 6. depth: 3 meters
  • 7. high level architecture
  • 8. guarantees
      • no single point of failure exists
      • all components scale horizontally[1]
      • data retention and latency are a function of cost, not tech[1]
      • every event is delivered provided no more than N - 1 failures occur (where N is the kafka replication level)
      • all operations, including upgrade, are online[2]
      • every event is (or appears to be) delivered exactly once[3]
      [1] we’re positive there’s a limit, but thus far it has been cost.
      [2] from the user’s perspective, at a system level.
      [3] when queried via our UI. lots of details here.
  • 9. events
  • 10. modeling our world
      • everything is an event
      • each event contains a timestamp, type, location, host, service, body, and type-specific attributes (k/v pairs)
      • build specialized aggregates as necessary - just optimized views of the data
  • 11. event schema
      {
        ts: long,
        event_type_id: int,
        location: string,
        host: string,
        service: string,
        body: [ null, string ],
        attributes: map<string>
      }
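The schema on this slide could be mirrored in application code roughly as follows. This is a minimal sketch, not the author's implementation: the class name `Event`, the choice of epoch milliseconds for `ts`, and every value in the sample record are assumptions made for illustration; only the field names and types come from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Event:
    """Hypothetical in-memory mirror of the slide's event schema."""
    ts: int                      # observation time; epoch milliseconds is an assumption
    event_type_id: int           # tells consumers how to read body and attributes
    location: str
    host: str
    service: str
    body: Optional[str] = None   # the slide declares body as [null, string]
    attributes: Dict[str, str] = field(default_factory=dict)  # k/v pairs, string values


# Illustrative record only; the values are made up for the example.
sample = Event(
    ts=1432800000000,
    event_type_id=100,
    location="us-west-2a",
    host="w2a-demo-1",
    service="dhclient",
    attributes={"syslog_severity": "6", "syslog_pid": "668"},
)
```

Keeping `attributes` as a flat string-to-string map matches the `map<string>` declaration, which leaves interpretation of each value to whatever the event type metadata says.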
  • 12. event types
      • some event types are standard – syslog, http, log4j, generic text record, …
      • users define custom event types
      • producers populate event type
      • transformations can turn one event type into another
      • event type metadata tells downstream systems how to interpret body and attributes
  • 13. ex: generic syslog event
      event_type_id: 100,   // rfc3164, rfc5424 (syslog)
      body: …               // raw syslog message bytes
      attributes: {         // extracted fields from body
        syslog_message: "DHCPACK from 10.10.0.1 (xid=0x45b63bdc)",
        syslog_severity: "6",   // info severity
        syslog_facility: "3",   // daemon facility
        syslog_process: "dhclient",
        syslog_pid: "668",
        …
      }
  • 14. ex: generic http event
      event_type_id: 102,   // generic http event
      body: …               // raw http log message bytes
      attributes: {
        http_req_method: "GET",
        http_req_vhost: "w2a-demo-02",
        http_req_path: "/api/v1/search?q=service%3Asshd&p=1&s=200",
        http_req_query: "q=service%3Asshd&p=1&s=200",
        http_resp_code: "200",
        …
      }
  • 15. consumers
  • 16. consumers
      • …do most of the work
      • parallelism
      • kafka offset management
      • message de-duplication
      • transformation (embedded library)
      • dead letter queue support
      • downstream system knowledge
  • 17. consumers
      • …do most of the work
      • parallelism
      • kafka offset management
      • message de-duplication
      • transformation (embedded library)
      • dead letter queue support
      • downstream system knowledge
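Two of the consumer responsibilities listed above, transformation and dead letter queue support, can be illustrated with a small loop. This is only a sketch under stated assumptions: `consume`, `normalize_pid`, and the list-based "queue" are hypothetical names invented here, events are plain dicts, and nothing about Kafka offsets or de-duplication is modeled.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

EventDict = Dict[str, Any]


def consume(
    events: Iterable[EventDict],
    transform: Callable[[EventDict], EventDict],
) -> Tuple[List[EventDict], List[Tuple[EventDict, str]]]:
    """Apply transform to each event; failures go to a dead letter list."""
    delivered: List[EventDict] = []
    dead_letters: List[Tuple[EventDict, str]] = []
    for event in events:
        try:
            delivered.append(transform(event))
        except Exception as exc:
            # Keep the original event and the reason, so it can be inspected later.
            dead_letters.append((event, repr(exc)))
    return delivered, dead_letters


def normalize_pid(event: EventDict) -> EventDict:
    """Hypothetical transformation: require syslog_pid to be numeric."""
    attrs = dict(event["attributes"])
    attrs["syslog_pid"] = str(int(attrs["syslog_pid"]))  # raises ValueError if not numeric
    return {**event, "attributes": attrs}


incoming = [
    {"event_type_id": 100, "attributes": {"syslog_pid": "668"}},
    {"event_type_id": 100, "attributes": {"syslog_pid": "not-a-pid"}},
]
delivered, dead_letters = consume(incoming, normalize_pid)
```

The point of the dead letter path is that one malformed event does not stall or crash the consumer; the bad event is set aside with its error while the rest of the stream keeps flowing.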
  • 18. inside a consumer
  • 19. metrics and time series
  • 20. aggregation
      • mostly for time series metrics
      • two halves: on write and on query
      • data model: (dimensions) => (aggregates)
      • on write
          – reduce(a: A, b: A): B over window
          – store “base” aggregates, all associative and commutative
      • on query
          – perform same aggregate or build non-associative/commutative aggregates
          – group by the same dimensions
          – we use SQL (Impala)
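The split between "base" aggregates on write and derived aggregates on query can be sketched as below. This is an illustration, not the system's code: `BaseAgg`, `merge`, and `mean` are hypothetical names, and the mean stands in for any aggregate that is not itself associative and so has to be derived at query time from stored parts.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class BaseAgg:
    """Base aggregates that can be merged in any order or grouping."""
    count: int
    total: float
    minimum: float
    maximum: float


def from_value(v: float) -> BaseAgg:
    return BaseAgg(1, v, v, v)


def merge(a: BaseAgg, b: BaseAgg) -> BaseAgg:
    # Associative and commutative: partial results can be combined freely.
    return BaseAgg(
        a.count + b.count,
        a.total + b.total,
        min(a.minimum, b.minimum),
        max(a.maximum, b.maximum),
    )


def mean(a: BaseAgg) -> float:
    # Derived on query from the stored base aggregates.
    return a.total / a.count


# On write: reduce raw values in a window down to one stored row.
window = reduce(merge, map(from_value, [3.0, 1.0, 8.0]))
# On query: combine stored rows (possibly several per bin), then derive.
combined = merge(window, from_value(4.0))
average = mean(combined)
```

Averaging two stored means would give the wrong answer when the rows cover different numbers of events, which is why the count and sum are stored and the division happens last.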
  • 21. aside: late arriving data (it’s a thing)
      • never trust a (wall) clock
      • producer determines observation time, rest of the system uses this always
      • data that shows up late always processed according to observation time
      • aggregation consequences
          – the same time window can appear multiple times
          – solution: aggregate every N seconds, potentially generating multiple aggregates for the same time bin
      • this is real and you must deal with it
          – do what we did or
          – build a system that mutates/replaces aggregates already output (eww) or
          – delay aggregate output for some slop time; drop it if late data shows up
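The "aggregate every N seconds" approach might look something like the following sketch. The class `WindowAggregator`, the one-minute window, and the explicit `flush()` call (standing in for a timer that fires every N seconds) are all assumptions made for the example; the only idea taken from the slide is that events are binned by observation time and that a late event produces an additional row for a bin already emitted.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

WINDOW_MS = 60_000  # assumed window size: one minute


class WindowAggregator:
    """Bins values by observation time; each flush emits whatever accumulated."""

    def __init__(self) -> None:
        # bin start (ms) -> [count, sum]
        self._open: Dict[int, List[float]] = defaultdict(lambda: [0, 0.0])

    def observe(self, ts_ms: int, value: float) -> None:
        bin_start = ts_ms - ts_ms % WINDOW_MS  # producer's observation time decides the bin
        acc = self._open[bin_start]
        acc[0] += 1
        acc[1] += value

    def flush(self) -> List[Tuple[int, int, float]]:
        """Emit (bin_start, count, sum) rows and reset; never rewrites earlier output."""
        rows = [(b, int(c), s) for b, (c, s) in sorted(self._open.items())]
        self._open.clear()
        return rows


agg = WindowAggregator()
agg.observe(5_000, 2.0)
agg.observe(30_000, 3.0)
first = agg.flush()        # on-time data for bin 0
agg.observe(59_000, 4.0)   # arrives after the flush but was observed in bin 0
second = agg.flush()       # a second row for the same bin
```

Because earlier rows are never mutated, the store ends up with two rows for bin 0, and it is the query-time rollup that combines them.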
  • 22. ex: service event volume by host and minute
      • dimensions: ts, window, location, host, service, metric
      • on write, aggregates: count, sum, min, max, last
      • epoch, 60000, us-west-2a, w2a-demo-1, sshd, event_volume => 17, 42, 1, 10, 8
      • on query:
          – SELECT floor(ts / 60000) as bin, host, service, metric, sum(value_sum) FROM events WHERE ts BETWEEN x AND y AND metric = "event_volume" GROUP BY bin, host, service, metric
      • if late arriving data existed in events, the same dimensions would repeat with another set of aggregates and would be rolled up as a result of the group by
      • tl;dr: normal window aggregation operations
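The rollup behavior described on this slide can be tried out locally. The sketch below swaps in SQLite (via Python's standard `sqlite3` module) for Impala, so the SQL is adapted: integer division replaces `floor(...)`, and the table layout, column names other than `value_sum`, and all row values are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    "ts INTEGER, host TEXT, service TEXT, metric TEXT, "
    "value_count INTEGER, value_sum REAL)"
)
rows = [
    (0, "w2a-demo-1", "sshd", "event_volume", 17, 42.0),      # first aggregate for minute 0
    (0, "w2a-demo-1", "sshd", "event_volume", 3, 5.0),        # late data, same minute
    (60_000, "w2a-demo-1", "sshd", "event_volume", 4, 9.0),   # minute 1
]
conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?, ?, ?)", rows)

query = """
SELECT ts / 60000 AS bin, host, service, metric, SUM(value_sum) AS total
FROM events
WHERE ts BETWEEN ? AND ? AND metric = 'event_volume'
GROUP BY bin, host, service, metric
ORDER BY bin
"""
result = conn.execute(query, (0, 119_999)).fetchall()
for row in result:
    print(row)
```

The two stored rows for minute 0 collapse into a single result row, so readers of the query never see that the late data was written separately.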
  • 23. extension, pain, and advice
  • 24. extending the system
      • custom producers
      • custom consumers
      • event types
      • parser / transformation plugins
      • custom metric definition and aggregate functions
      • custom processing jobs on landed data
  • 25. pain (aka: the struggle is real)
      • lots of tradeoffs when picking a stream processing solution
          – samza: right features, but low level programming model, not supported by vendors. missing security features.
          – storm: too rigid, too slow. not supported by all Hadoop vendors.
          – spark streaming: tons of issues initially, but lots of community energy. improving.
          – @digitallogic: “my heart says samza, but my head says spark streaming.”
          – our (current) needs are meager; do work inside consumers.
      • stack complexity, (relative im)maturity
      • scaling solr cloud to billions of events per day
  • 26. if you’re going to try this…
      • read all the literature on stream processing[1]
      • treat it like the distributed systems problem it is
      • understand, make, and make good on guarantees
      • find the right abstractions
      • never trust the hand waving or “hello worlds”
      • fully evaluate the projects/products in this space
      • understand it’s not just about search
      [1] wait, like all of it? yea, like all of it.
  • 27. things I didn’t talk about
      • reprocessing data when bad code / transformations are detected
      • dealing with data quality issues (“the struggle is real” part 2)
      • the user interface and all the fancy analytics
          – data visualization and exploration
          – event search
          – anomalous trend and event detection
          – metric, source, and event correlation
          – motif finding
          – noise reduction and dithering
      • event delivery semantics (e.g. at least once, exactly once, etc.)
      • alerting
  • 28. questions? thank you. @esammer | esammer@rocana.com