Nov, 2015
Review of Mystery Machine
Ivan Glushkov
ivan.glushkov@gmail.com
@gliush
Why
❖ Need to debug and optimize applications
❖ Complex, heterogeneous systems
❖ Different parts written in different languages
❖ Different communication channels
❖ Different execution environments
❖ Even if individual components are optimized - the
whole system might not work optimally
What
❖ They develop performance analysis tools
❖ They apply them to their pipeline
❖ They measure end-to-end performance:
❖ from the point of initiating a page load
❖ to the point when the browser finishes rendering
Why not
❖ All current approaches assume you instrument your
code, specify relations, etc
❖ Usually you don’t have time or ability
❖ Large systems are developed by large teams
❖ Adding instrumentation retroactively is a Herculean task
Overview
❖ They generate a model via large-scale reasoning over logs
❖ They can confirm relationships
❖ They need only (requestId, hostId, hostTS, eventId) in each
log message
❖ UberTrace gathers all the logs in one place
❖ MysteryMachine constructs a causal model from those traces
❖ MysteryMachine performs analyses: identifying critical
paths, slack analysis, outlier detection
UberTrace: why
❖ No tools to analyze inter-process optimality
❖ They need to have a single end-to-end performance
tracing tool for all logs
UberTrace: requirements
❖ Each log message should contain
❖ Unique request id
❖ Computer id (server node / client laptop)
❖ Timestamp (local clock)
❖ Event name (e.g. “start DOM rendering”)
❖ Task name (<Event,Task> should be unique)
❖ Propagate the decision about whether to log a particular request
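A minimal sketch of what a log message with these fields might look like, and how messages could be gathered into one trace per request. The `LogRecord` class, its field names, and `group_by_request` are hypothetical names chosen for illustration; they are not taken from UberTrace.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    request_id: str  # unique request id
    host_id: str     # computer that emitted the message
    host_ts: float   # timestamp from that host's local clock (ms)
    event: str       # event name
    task: str        # task name


def group_by_request(records):
    """Collect the records of each request into one trace, ordered by timestamp."""
    traces = defaultdict(list)
    for rec in records:
        traces[rec.request_id].append(rec)
    for trace in traces.values():
        trace.sort(key=lambda r: r.host_ts)
    return dict(traces)


records = [
    LogRecord("r1", "server-1", 12.0, "send response", "server"),
    LogRecord("r2", "server-1", 3.0, "receive request", "server"),
    LogRecord("r1", "server-1", 5.0, "receive request", "server"),
]
traces = group_by_request(records)
```

Sorting by `host_ts` is only meaningful within one host here; ordering events across hosts needs the timestamps translated to a common clock first.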
UberTrace
❖ TS are from local clocks -> translated to global clock
❖ Execution time = Latest TS - Earliest TS
❖ RTT = Ec - Es (client-observed time minus server execution time)
❖ Clock skew = 1/2 RTT
❖ Multiple observations: choose the minimal one
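The clock translation can be sketched roughly as follows. This assumes an RPC-style exchange, that the RTT is the client-observed interval minus the server execution interval, and that network delay is symmetric, so the first server timestamp should land half an RTT after the client sent the request. `RpcObservation` and `estimate_offset` are hypothetical names for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RpcObservation:
    client_send: float   # client TS just before the RPC (client clock)
    client_recv: float   # client TS just after the RPC (client clock)
    server_first: float  # earliest server TS inside the RPC (server clock)
    server_last: float   # latest server TS inside the RPC (server clock)

    @property
    def rtt(self):
        client_time = self.client_recv - self.client_send
        server_time = self.server_last - self.server_first
        return client_time - server_time

    @property
    def server_offset(self):
        # Value to add to server timestamps so that the first server event
        # lands half an RTT after the client sent the request.
        return self.client_send + self.rtt / 2 - self.server_first


def estimate_offset(observations):
    """Use the observation with the smallest RTT (least queueing noise)."""
    best = min(observations, key=lambda o: o.rtt)
    return best.server_offset


observations = [
    RpcObservation(client_send=100, client_recv=160, server_first=1000, server_last=1040),
    RpcObservation(client_send=200, client_recv=290, server_first=1105, server_last=1145),
]
offset = estimate_offset(observations)  # applied as: global_ts = server_ts + offset
```

The first observation has an RTT of 20 and the second an RTT of 50, so the first one determines the offset.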
Mystery Machine: causal model
❖ Split all logs into segments (two consecutive events for the same task)
❖ Create a causal model
❖ They validated this model for the client-side JS library (42 and 84 segments -> 2583 and 10458 causal relationships)
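One way to picture the model building is sketched below: split each trace into segments, hypothesize every possible ordering between segments, and drop a hypothesis as soon as some trace contradicts it. This is a simplified illustration that covers only happens-before relationships; the function names and the `(task, event, ts)` input format are assumptions, and timestamps are assumed to be on a common clock already.

```python
from itertools import permutations


def to_segments(events):
    """events: list of (task, event_name, ts) for one request.

    Returns {segment_name: (start, end)}; a segment spans two consecutive
    events of the same task.
    """
    by_task = {}
    for task, name, ts in sorted(events, key=lambda e: e[2]):
        by_task.setdefault(task, []).append((name, ts))
    segments = {}
    for task, evs in by_task.items():
        for (a, start), (b, end) in zip(evs, evs[1:]):
            segments[f"{task}:{a}->{b}"] = (start, end)
    return segments


def infer_happens_before(traces):
    """traces: list of {segment: (start, end)}.

    Start from every ordered pair (a, b) meaning "a finishes before b starts"
    and discard each pair for which some trace is a counterexample.
    """
    names = set().union(*(t.keys() for t in traces))
    hypotheses = set(permutations(sorted(names), 2))
    for trace in traces:
        for a, b in list(hypotheses):
            if a in trace and b in trace and trace[a][1] > trace[b][0]:
                hypotheses.discard((a, b))
    return hypotheses


example_segments = to_segments([
    ("server", "start", 0), ("server", "end", 5),
    ("client", "start", 5), ("client", "end", 9),
])
surviving = infer_happens_before([
    {"A": (0, 5), "B": (5, 9), "C": (6, 8)},
    {"A": (0, 4), "B": (7, 9), "C": (4, 6)},
])
```

In the two example traces, `A` always finishes before `B` and `C` start, while `B` and `C` overlap in the first trace, so only `(A, B)` and `(A, C)` survive. A hypothesis that is never tested stays in the model, which is why a large number of traces matters.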
Mystery Machine: critical path
Critical path - the set of segments for which a differential increase in a segment's execution time
would result in the same differential increase in the end-to-end latency
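As a rough illustration, the critical path of a single request can be found as the longest path through the segment graph. The sketch below assumes the causal model is given as a map from each segment to the segments that must finish before it starts; the segment names and durations are made up.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def critical_path(durations, predecessors):
    """durations: {segment: duration}
    predecessors: {segment: set of segments that must finish first}
    Returns (end_to_end_latency, segments on the critical path in order).
    """
    graph = {s: predecessors.get(s, set()) for s in durations}
    order = list(TopologicalSorter(graph).static_order())
    finish, best_pred = {}, {}
    for seg in order:
        # The predecessor that finishes last determines when seg can start.
        prev = max(graph[seg], key=lambda p: finish[p], default=None)
        best_pred[seg] = prev
        finish[seg] = (finish[prev] if prev is not None else 0) + durations[seg]
    # Walk back from the segment that finishes last.
    seg = max(finish, key=finish.get)
    latency = finish[seg]
    path = []
    while seg is not None:
        path.append(seg)
        seg = best_pred[seg]
    return latency, path[::-1]


durations = {"server": 40, "network": 20, "js": 30, "css": 10, "render": 15}
predecessors = {
    "network": {"server"},
    "js": {"network"},
    "css": {"network"},
    "render": {"js", "css"},
}
latency, path = critical_path(durations, predecessors)
```

Here `css` is off the critical path because `render` also waits for the longer `js` segment.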
Mystery Machine: slack
Slack - the amount by which the duration of a segment may increase
without increasing the end-to-end latency of the request
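Slack can be sketched with a forward pass (earliest finish times) and a backward pass (latest finish times that keep the end-to-end latency unchanged). As with any example here, the graph format, segment names, and durations are assumptions made for illustration.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def slack(durations, predecessors):
    """Return {segment: slack} for one request.

    durations: {segment: duration}
    predecessors: {segment: set of segments that must finish first}
    """
    graph = {s: predecessors.get(s, set()) for s in durations}
    order = list(TopologicalSorter(graph).static_order())

    # Forward pass: earliest time each segment can finish.
    earliest_finish = {}
    for seg in order:
        start = max((earliest_finish[p] for p in graph[seg]), default=0)
        earliest_finish[seg] = start + durations[seg]
    latency = max(earliest_finish.values())

    # Backward pass: latest finish time that does not delay the request.
    latest_finish = {seg: latency for seg in durations}
    for seg in reversed(order):
        latest_start = latest_finish[seg] - durations[seg]
        for p in graph[seg]:
            latest_finish[p] = min(latest_finish[p], latest_start)

    return {seg: latest_finish[seg] - earliest_finish[seg] for seg in durations}


durations = {"server": 40, "network": 20, "js": 30, "css": 10, "render": 15}
predecessors = {
    "network": {"server"},
    "js": {"network"},
    "css": {"network"},
    "render": {"js", "css"},
}
slack_by_segment = slack(durations, predecessors)
```

In this example only `css` has slack (20): it could take up to 30 before `render` would have to wait for it. Segments with zero slack are exactly those on the critical path.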
Mystery Machine: slack validation
Mystery Machine: slack analysis usage
Links
❖ Video: https://www.usenix.org/node/186168
❖ Slides: https://www.usenix.org/sites/default/files/conference/protected-files/osdi14_slides_chow.pdf
❖ Paper: https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-chow.pdf
