Monitoring in a SOA World with Sensu
Outline
● Let’s visit the dark ages
● How Sensu Works
● Special (open source) Yelp + Sensu Sauce
● Mini-Demo
● How PaaSTA Uses Sensu
● Second Demo
The Dark Ages
● One Word: Nagios
● Monitoring for Services: “Also Nagios”
● Probably alerts go to OPS anyway
● Probably just making sure the LB is up
● Very little developer visibility
● Hard to articulate to Nagios what you want
An Aside: Map Versus Territory
● Territory: The actual things in production running right now
● Map: What your monitoring system *thinks* is running right now
Who/What keeps these in sync?????
How Sensu Works
[Diagram: a Sensu client on some host sends check results to RabbitMQ; the Sensu server asks the queue "Any events for me to handle?"]
● Clients execute checks
● Servers don’t know what checks exist beforehand, they just operate on events
How Sensu Works - In Words
● Clients can schedule and execute checks, but just put the results on the queue
● Servers handle results off the queue, route them to things like email, PagerDuty, JIRA, etc.
● Also API, CLI, check history, silencing, dashboard, etc.
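The client/queue/server split above can be sketched in a few lines of Python. This is only an illustration: the standard-library `queue.Queue` stands in for RabbitMQ, and `client_run_check`, `server_drain`, and the two example checks are hypothetical names, not Sensu APIs.

```python
import json
import queue

# Stand-in for the RabbitMQ transport (assumption: a real setup uses a broker).
results_queue = queue.Queue()

def client_run_check(name, check_fn):
    """Client side: run the check locally and publish only its result."""
    status, output = check_fn()
    results_queue.put(json.dumps({"name": name, "status": status, "output": output}))

def server_drain(handlers):
    """Server side: take results off the queue and pass non-OK ones to handlers."""
    handled = []
    while not results_queue.empty():
        event = json.loads(results_queue.get())
        if event["status"] != 0:
            for handler in handlers:
                handler(event)
            handled.append(event["name"])
    return handled

def check_disk():
    # Hypothetical check; status 2 follows the Nagios-style CRITICAL convention.
    return 2, "disk 95% full"

def check_ok():
    return 0, "all good"

sent = []
client_run_check("check_disk", check_disk)
client_run_check("check_ok", check_ok)
handled = server_drain([lambda e: sent.append("email: " + e["output"])])
print(handled, sent)
```

Note that the server never needed a list of checks up front; it only reacted to whatever arrived on the queue.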
Special (Open Source) Yelp-Sensu Sauce
● https://github.com/Yelp/sensu_handlers
● “Smart” handlers that respond to Sensu events based on the event data
● Team is the “primary key” when determining what to do
Declare Your Teams
sensu_handlers::teams:
  dev:
    pagerduty_api_key: 1234
    pages_irc_channel: 'dev1-pages'
    notifications_irc_channel: 'devs'
  ops:
    pagerduty_api_key: 78923
    pages_irc_channel: 'ops-pages'
    notifications_irc_channel: 'operations-notifications'
    notification_email: 'operations@localhost'
    project: OPS
  hardware:
    # Uses the ops Pagerduty service for page-worthy events,
    # but otherwise just jira tickets
    pagerduty_api_key: 78923
    project: METAL
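To make "team is the primary key" concrete, here is a rough Python sketch of how a handler might decide what to do with an event. The `TEAMS` dict mirrors the declaration above; `plan_actions` and its routing rules are assumptions made for illustration and are not the actual sensu_handlers logic.

```python
# Mirrors the team declaration above (values copied from the slide).
TEAMS = {
    "dev": {
        "pagerduty_api_key": 1234,
        "pages_irc_channel": "dev1-pages",
        "notifications_irc_channel": "devs",
    },
    "ops": {
        "pagerduty_api_key": 78923,
        "pages_irc_channel": "ops-pages",
        "notifications_irc_channel": "operations-notifications",
        "notification_email": "operations@localhost",
        "project": "OPS",
    },
    "hardware": {"pagerduty_api_key": 78923, "project": "METAL"},
}

def plan_actions(event, teams=TEAMS):
    """Hypothetical routing: look up the team, then pick notification targets."""
    team = teams[event["team"]]
    actions = []
    if event.get("page") and "pagerduty_api_key" in team:
        # Page-worthy events go to the team's PagerDuty service.
        actions.append(("pagerduty", team["pagerduty_api_key"]))
    elif "project" in team:
        # Everything else becomes a ticket, if the team has a project.
        actions.append(("jira", team["project"]))
    if "notification_email" in team:
        actions.append(("email", team["notification_email"]))
    return actions

print(plan_actions({"team": "hardware", "page": False}))
print(plan_actions({"team": "ops", "page": True}))
```

The check itself only has to name its team; everything about where alerts go lives in one place.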
Mini - Demo
What does it look like when you can dynamically define checks on Sensu clients in a team-centric way?
{
  "name": "test_alert_for_kwa",
  "team": "kwa",
  "irc_channels": [],
  "notification_email": "kwa@dev.yelpcorp.com",
  "ticket": false,
  "project": false,
  "page": false,
  "output": "Test output from send-test-sensu-alert",
  "status": 2,
  "command": "send-test-sensu-alert"
}
What just happened?
How PaaSTA Uses Sensu
● Take advantage of Sensu’s ability to receive arbitrary events
● We already know which team owns each service (started documenting that with the soa-configs)
● We already know where services are deployed and what latency zones they are in
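A minimal sketch of "arbitrary events", assuming a classic Sensu client whose local socket accepts JSON check results on 127.0.0.1:3030 (confirm the host and port for your own deployment). `build_event`, `send_event`, and the service name are hypothetical; the field names follow the demo event shown earlier.

```python
import json
import socket

def build_event(name, team, status, output, **extra):
    """Assemble a check result; extra keys (page, ticket, ...) ride along as event data."""
    event = {"name": name, "team": team, "status": status, "output": output}
    event.update(extra)
    return event

def send_event(event, host="127.0.0.1", port=3030):
    """Send the event as JSON to the local Sensu client socket (assumed address)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(json.dumps(event).encode("utf-8"))

# Hypothetical service and team names, for illustration only.
event = build_event(
    "replication_example_service", "dev", 2,
    "0 out of 3 instances in zone-a", page=False,
)
payload = json.dumps(event)
print(payload)
# send_event(event)  # uncomment on a host that runs a Sensu client
```

Because the event carries its own team, whatever generates it needs no handler configuration of its own.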
Sensu + PaaSTA Demo
What if your monitoring system knew all about your services and how they are supposed to be deployed?
What just happened?
● We “went behind PaaSTA’s back” to simulate a failure of an AZ
● We got a replication alert because one of the latency zones didn’t meet our expected replication count (0 out of 3)
● We decided to “remediate” it by expanding our latency zone to “region”
● PaaSTA “Made it so”, and our alert resolved and the status command reflected the fact that we are expecting 6 in that one region
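The replication alert from the demo can be approximated as a per-zone comparison of expected versus observed instance counts. The sketch below is not PaaSTA's implementation: `check_replication`, the zone names, and the 50% critical threshold are all assumptions.

```python
OK, WARNING, CRITICAL = 0, 1, 2

def check_replication(expected, observed, critical_fraction=0.5):
    """Compare instance counts per latency zone; returns {zone: (status, output)}."""
    results = {}
    for zone, want in expected.items():
        have = observed.get(zone, 0)
        if have >= want:
            status = OK
        elif have >= want * critical_fraction:
            status = WARNING
        else:
            status = CRITICAL
        results[zone] = (status, f"{have} out of {want}")
    return results

# One zone has lost all of its instances (hypothetical zone names).
before = check_replication({"zone-a": 3, "zone-b": 3}, {"zone-b": 3})
# After widening the latency zone to the whole region, one count covers everything.
after = check_replication({"region": 6}, {"region": 6})
print(before, after)
```

Each zone's result could then be sent as its own event, so the alert names the exact place where replication fell short.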
How Did Sensu “Know”?
Is this a Problem?
What should I do About it?
How Did Sensu “Know”?
● Sensu doesn’t “Know” anything except for the “Teams” metadata hash
● PaaSTA checks HAProxy in each latency zone because it can read the same SOA configs that SmartStack does!
● PaaSTA “Knows” which team owns each service because we told it in SOA configs!
● Sensu just processes the event like normal
Conclusion
● Use a monitoring system that can receive and process arbitrary events for easy integration (Sensu)
● Keep service metadata in an easy-to-access place for pieces to integrate easily (SOA configs)
● Monitor the exact thing you care about (replication in each latency zone)
Reading Comprehension Question:
(What was the purpose of this talk?)
A. To describe how cool Sensu is
B. To make viewers feel inadequate about their own Nagios installation
C. To tease viewers about Sensu glue that is not open source yet
D. To inspire viewers to build their own dynamic monitoring based on some of these ideas!
E. Other?
Tools Used:
● Sensu: https://sensuapp.org/
● Yelp’s Sensu Handlers: https://github.com/Yelp/sensu_handlers
● Mesos: http://mesos.apache.org/
● Marathon: https://mesosphere.github.io/marathon/
● Smartstack: http://nerds.airbnb.com/smartstack-service-discovery-cloud/
Questions?

Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 

How Yelp Uses Sensu to Monitor Services in a SOA World

  • 1. Monitoring in a SOA World with Sensu
  • 2. Outline ● Let’s visit the dark ages ● How Sensu Works ● Special (open source) Yelp + Sensu Sauce ● Mini-Demo ● How PaaSTA Uses Sensu ● Second Demo
  • 3. The Dark Ages ● One Word: Nagios ● Monitoring for Services: “Also Nagios” ● Probably alerts go to OPS anyway ● Probably just making sure the LB is up ● Very little developer visibility ● Hard to articulate to Nagios what you want
  • 4. An Aside: Map Versus Territory ● Territory: The actual things in production running right now ● Map: What your monitoring system *thinks* is running right now Who/What keeps these in sync?????
  • 5. How Sensu Works Client Server Check Results Any Events for me to handle? Some Host RabbitMQ Clients execute checks Servers don’t know what checks exist beforehand, they just operate on events
  • 6. How Sensu Works - In Words ● Clients can Schedule and Execute checks, but just put the results on the queue ● Servers handle results off the queue, route them to things like email, pagerduty, JIRA, etc. ● Also API, CLI, check history, silencing, dashboard, etc.
  • 7. Special (Open Source) Yelp-Sensu Sauce ● https://github.com/Yelp/sensu_handlers ● “Smart” handlers that respond to Sensu events based on the event data ● Team is the “primary key” when determining what to do
  • 8. Declare Your Teams
        sensu_handlers::teams:
          dev:
            pagerduty_api_key: 1234
            pages_irc_channel: 'dev1-pages'
            notifications_irc_channel: 'devs'
          ops:
            pagerduty_api_key: 78923
            pages_irc_channel: 'ops-pages'
            notifications_irc_channel: 'operations-notifications'
            notification_email: 'operations@localhost'
            project: OPS
          hardware:
            # Uses the ops Pagerduty service for page-worthy events,
            # but otherwise just jira tickets
            pagerduty_api_key: 78923
            project: METAL
  • 9. Mini - Demo What does it look like when you can dynamically define checks on Sensu clients in a team-centric way?
  • 10.
  • 11. What just happened?
        {
          "name": "test_alert_for_kwa",
          "team": "kwa",
          "irc_channels": [],
          "notification_email": "kwa@dev.yelpcorp.com",
          "ticket": false,
          "project": false,
          "page": false,
          "output": "Test output from send-test-sensu-alert",
          "status": 2,
          "command": "send-test-sensu-alert"
        }
  • 12. How PaaSTA Uses Sensu ● Take advantage of Sensu’s ability to receive arbitrary events ● We already know which team owns each service (started documenting that with the soa-configs) ● We already know where services are deployed and what latency zones they are in
  • 13. Sensu + PaaSTA Demo What if your monitoring system knew all about your services and how they are supposed to be deployed?
  • 14.
  • 15. What just happened? ● We “went behind PaaSTA’s back” to simulate a failure of an AZ ● We got a replication alert because one of the latency zones didn’t meet our expected replication count (0 out of 3) ● We decided to “remediate” it by expanding our latency zone to “region” ● PaaSTA “made it so”, and our alert resolved and the status command reflected the fact that we are expecting 6 in that one region
  • 16. How Did Sensu “Know”? Is this a Problem? What should I do About it?
  • 17. How Did Sensu “Know”? ● Sensu doesn’t “Know” anything except for the “Teams” metadata hash ● PaaSTA checks Haproxy in each latency zone because it can read the same SOA configs that SmartStack does! ● PaaSTA “Knows” which team owns each service because we told it in SOA configs! ● Sensu just processes the event like normal
  • 18. Conclusion ● Use a monitoring system that can receive and process arbitrary events for easy integration (Sensu) ● Keep service metadata in an easy-to-access place for pieces to integrate easily (SOA configs) ● Monitor the exact thing you care about (replication in each latency zone)
  • 19. Reading Comprehension Question: (What was the purpose of this talk?) A. To describe how cool Sensu is B. To make viewers feel inadequate about their own Nagios installation C. To tease viewers about Sensu glue that is not open source yet D. To inspire viewers to build their own dynamic monitoring based on some of these ideas! E. Other?
  • 20. Reading Comprehension Question: (What was the purpose of this talk?) A. To describe how cool Sensu is B. To make viewers feel inadequate about their own Nagios installation C. To tease viewers about Sensu glue that is not open source yet D. To inspire viewers to build their own dynamic monitoring based on some of these ideas! E. Other?
  • 21. Tools Used: ● Sensu: https://sensuapp.org/ ● Yelp’s Sensu Handlers: https://github.com/Yelp/sensu_handlers ● Mesos: http://mesos.apache.org/ ● Marathon: https://mesosphere.github.io/marathon/ ● Smartstack: http://nerds.airbnb.com/smartstack-service-discovery-cloud/
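
The idea running through slides 11 to 18 (check replication in each latency zone, then hand Sensu an arbitrary event tagged with the owning team) can be sketched in Python roughly as follows. This is a minimal illustration and not PaaSTA's actual code: the function names, the service name, and the zone names are hypothetical, and the per-zone instance counts are passed in directly rather than read from HAProxy. The event fields mirror the JSON on slide 11. The `send_event` helper assumes a Sensu client is listening on its local JSON socket at `localhost:3030`; adjust or drop it if your setup differs.

```python
import json
import socket

OK, CRITICAL = 0, 2  # Nagios-style status codes, which Sensu also uses


def build_replication_event(service, team, expected_per_zone, running_by_zone):
    """Return a Sensu-style check result for one service.

    running_by_zone maps a latency zone name to the number of healthy
    instances seen there (hypothetical input for this sketch).
    """
    # Collect only the zones that fall short of the expected count.
    short = {
        zone: count
        for zone, count in sorted(running_by_zone.items())
        if count < expected_per_zone
    }
    if short:
        details = ", ".join(
            f"{zone}: {count} out of {expected_per_zone}"
            for zone, count in short.items()
        )
        status, output = CRITICAL, f"Under-replicated in {details}"
    else:
        status = OK
        output = f"All zones have at least {expected_per_zone} instances"
    return {
        "name": f"replication_{service}",
        "team": team,  # the "primary key" the handlers route on
        "page": False,
        "ticket": False,
        "output": output,
        "status": status,
    }


def send_event(event, host="localhost", port=3030):
    """Write the event as JSON to a Sensu client socket (address assumed)."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(json.dumps(event).encode("utf-8"))


if __name__ == "__main__":
    # One zone has lost all of its instances, as in the second demo.
    event = build_replication_event(
        "example_service", "kwa", 3, {"zone-a": 3, "zone-b": 0}
    )
    print(json.dumps(event, indent=2))
```

The check itself owns all of the service knowledge. Sensu only needs the `team` field to decide where the alert goes, which matches the point on slide 17 that Sensu "just processes the event like normal".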