2. Who am I?
Jeremy Carroll
Sr. Systems Engineer @ .
@jeremy_carroll
3. Why Switch?
Separation of Concerns
● Purpose-built systems: use the best tool for each job
○ Example: graphing vs. alerting, Graphite vs.
PagerDuty. Why try to do it all in one tool?
● Sensu is a router / scheduler
○ Very customizable / extensible
● Nagios was built in an era where Cloud /
Config Management was not as prevalent
○ It shows in the design: Nagios tries to do too much.
○ Pull-based model for checks.
○ Dynamic discovery is cumbersome and slow.
○ 'Object tricks' / inheritance are better handled in config management.
5. Configuration Management
● CM systems push checks / subscriptions
○ All based on role / responsibilities
○ If you are managed as a MySQL instance, then
configure MySQL monitors
● CMDB as a source of truth
○ Infrastructure as code. All systems under CM
○ State enforcement so new systems are monitored
○ Check variables are replaced at runtime.
■ Example: contact groups, local disks, dynamic
discovery. If a FusionIO card is installed, then
monitor the FusionIO card.
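As a rough illustration of the role-driven approach above, the sketch below shows how a CM system might render a node's monitoring config from its roles. The role catalogue, the node name and address, the `render_node_config` helper, and the `check_fusionio` command are all hypothetical; real CM tools (Puppet, Chef, and so on) would do this with their own templates and resources.

```python
import json

# Hypothetical role -> checks catalogue held by the CM system (names are illustrative).
ROLE_CHECKS = {
    "mysql": {
        "mysql_alive": {"command": "check_mysql -H :::address:::", "interval": 60},
    },
    "fusionio": {
        # check_fusionio is an assumed plugin name, not a known tool.
        "fusionio_health": {"command": "check_fusionio", "interval": 300},
    },
}


def render_node_config(name, address, roles):
    """Build a Sensu-style client + checks document for one node from its roles."""
    checks = {}
    for role in roles:
        for check_name, check in ROLE_CHECKS.get(role, {}).items():
            # Mark checks as standalone so the client schedules them itself.
            checks[check_name] = dict(check, standalone=True)
    return {
        "client": {"name": name, "address": address, "subscriptions": sorted(roles)},
        "checks": checks,
    }


# A node managed as a MySQL instance with a FusionIO card gets both sets of checks.
config = render_node_config("db1", "10.1.7.20", ["mysql", "fusionio"])
print(json.dumps(config, indent=2, sort_keys=True))
```

Because the config is derived from roles, a newly provisioned node picks up its monitors on its first CM run without anyone editing a central monitoring file.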
7. Nagios to Sensu Model
● Commands are irrelevant
○ CMDB can change the scripts to fit environment
○ Work with client attributes in check definitions for
customizations
● Service Groups are Subscriptions
○ Checks are decentralized. No need for Host Group
● Contacts are meta-data to Handlers
○ This may change. Contacts are not first-class
citizens currently.
● Timeperiod is check metadata
○ Use subdue as a check attribute for begin / end times.
Days of week are not currently supported (TODO).
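The "Service Groups are Subscriptions" mapping can be modeled in a few lines. This is only a sketch of the matching idea, not Sensu's implementation (Sensu routes check requests over its message bus); the `checks_for_client` helper and the sample `db1` / `mysql_alive` data are assumptions made for illustration.

```python
def checks_for_client(client, checks):
    """Return the names of checks whose subscribers overlap the client's subscriptions."""
    subs = set(client.get("subscriptions", []))
    return sorted(
        name
        for name, check in checks.items()
        if subs & set(check.get("subscribers", []))
    )


clients = [
    {"name": "apps-web9", "subscriptions": ["elasticsearch"]},
    {"name": "db1", "subscriptions": ["mysql"]},
]
checks = {
    "elasticsearch_cluster": {"subscribers": ["elasticsearch"], "handlers": ["pagerduty"]},
    "mysql_alive": {"subscribers": ["mysql"], "handlers": ["pagerduty"]},
}

# No host group object is needed: the client's own subscriptions decide what runs.
for client in clients:
    print(client["name"], checks_for_client(client, checks))
```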
8. Example
Nagios

    # Host => Host Group
    define host {
        use                  generic-host
        host_name            elastic1-search4
        alias                elastic1-search4
        hostgroups           elasticsearch-cluster1
        address              10.1.x.x
    }

    # Host Group Object combining service + host
    define hostgroup {
        hostgroup_name       elasticsearch-cluster1
        alias                elasticsearch-cluster1
    }

    # Service => Hostgroup
    define service {
        use                  generic-service
        hostgroup_name       elasticsearch-cluster1
        servicegroups        elasticsearch
        contacts             myself
        contact_groups       esAlerts
        service_description  check elasticsearch
        check_command        check_http_json-string!9200!_cluster/health!status!green
    }

    define command {
        command_name         check_http_json-string
        command_line         /usr/local/nagios/libexec/check_http_json.rb -u 'http://$HOSTNAME$:$ARG1$/$ARG2$' -e '$ARG3$' -r '$ARG4$'
    }

Sensu

    {
      "client": {
        "name": "apps-web9",
        "address": "10.1.7.16",
        "subscriptions": [ "elasticsearch" ],
        "elasticsearch": {
          "url": "_cluster/health"
        }
      },
      "checks": {
        "elasticsearch_cluster": {
          "notification": "ElasticSearch cluster is unhealthy",
          "command": "/usr/local/nagios/libexec/check_http_json.rb -u http://:::address::::9200/:::elasticsearch.url::: -e 'status' -r 'green'",
          "subscribers": [ "elasticsearch" ],
          "handlers": [ "pagerduty" ],
          "occurrences": 3,
          "interval": 60
        }
      }
    }
9. Sensu is a Router
● Multiple handlers per check
○ Send check result to PagerDuty, metrics to Graphite
● Reduce metrics checks / status checks
○ Reusing Nagios commands
■ Exit Code = Status, Metrics = Graphite
■ Use mutators to parse Nagios Perf Data
● Add additional systems easily
○ Want to use CEPMon / Riemann? Add a handler
○ Each handler processes the event object as its own code dictates
● Trap / Pull / Push
○ Sensu Server can trigger checks via broadcast
○ Standalone clients can schedule / push to server
○ Local port 3030 can be used as an 'event' trap
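The port 3030 'event trap' can be exercised from any script that can open a socket. Below is a minimal sketch, assuming a Sensu client listening on 127.0.0.1:3030 and accepting a JSON result with `name`, `output`, and `status` fields; the `build_event` and `send_event` helpers and the `nightly_backup` check name are hypothetical.

```python
import json
import socket


def build_event(name, output, status, handlers=None):
    """Build the JSON payload for an externally generated check result."""
    event = {"name": name, "output": output, "status": status}
    if handlers:
        event["handlers"] = list(handlers)
    return json.dumps(event)


def send_event(payload, host="127.0.0.1", port=3030, timeout=2.0):
    """Write one JSON result to the local client socket over TCP."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload.encode("utf-8"))


if __name__ == "__main__":
    # Status follows the Nagios exit-code convention: 0 OK, 1 warning, 2 critical.
    payload = build_event(
        "nightly_backup", "backup finished with errors", 2, ["pagerduty"]
    )
    print(payload)
    # send_event(payload)  # uncomment on a host that runs a Sensu client
```

A cron job or application can report failures this way without being scheduled by Sensu at all; the result is then routed to handlers like any other check.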
10. Summary
● Convert existing Nagios checks to Sensu
○ Most of the work will be in your CM system
○ Scoping variables / adding checks to roles
● This is a very fast moving project
○ New features all the time
○ Fantastic packaging
○ Actually has Unit Tests
https://github.com/sensu/sensu