In this talk we’ll go over the new UI and API in InfluxDB 2.0 to create complex monitoring, alerting and notification rules. We’ll start with the easy on-ramp via the user interface and then dig into how the setup and management of monitoring and alerting can be driven through code and the API.
2. Agenda
• Vision
• Building blocks of Monitoring & Alerting
• Classifying your Alerts with Tags
• Leveraging Status and Notification Messages
• Engineering Deep Dive
3. Vision for Monitoring & Alerting in 2.0
• Easy to use interface
• A point-and-click user experience for all!
• Deliver value on top of InfluxDB 2 primitives
• Power users unite!
6. Terminology: Checks
Query
A Flux script that returns time series data
Check
Analyzes the results of a Query to determine the current Status against the check criteria.
Tags
Flexible user-defined Key/Value pairs put on Status
Status
The Level and Tags of a Check written to the Monitoring Bucket
7. Terminology: Checks
Monitoring Bucket
System bucket where a Check stores the current Status
There are two different Check Types
Threshold
Periodically check calculated values against thresholds to determine Status
Deadman
Periodically check if values are being reported to determine Status
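To make the two Check Types concrete, here is a minimal Python sketch of the logic each one describes. It is not the InfluxDB implementation or API: the function names, the `thresholds` dictionary, and the example numbers are all assumptions made for illustration, and the level names (`crit`, `warn`, `info`, `ok`) are used only as labels.

```python
from datetime import datetime, timedelta, timezone

# Illustrative level names, ordered from most to least severe.
LEVELS = ("crit", "warn", "info")


def threshold_status(value, thresholds):
    """Return the most severe level whose threshold the value meets, else "ok".

    `thresholds` is a hypothetical mapping of level name to minimum value,
    for example {"warn": 75.0, "crit": 90.0}.
    """
    for level in LEVELS:
        limit = thresholds.get(level)
        if limit is not None and value >= limit:
            return level
    return "ok"


def deadman_status(last_seen, now, max_silence, level="crit"):
    """Return `level` if no point arrived within `max_silence`, else "ok"."""
    return level if now - last_seen > max_silence else "ok"


now = datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc)

# Threshold: a CPU reading of 93.5% against warn/crit limits.
cpu_status = threshold_status(93.5, {"warn": 75.0, "crit": 90.0})

# Deadman: the last point arrived 5 minutes ago, but 90 seconds is the limit.
silent_status = deadman_status(now - timedelta(minutes=5), now, timedelta(seconds=90))
```

Both functions return only a level; in the terminology above, that level plus the Check's Tags is what would be written to the Monitoring Bucket as a Status.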
8. Terminology: Notification Endpoints
Configuration describing how to call a 3rd party service
Three Endpoints are supported in Cloud 2.0 today
Free Tier
Slack
Paid Tier
HTTP Endpoint
PagerDuty
9. Notification Rule
Notification Rule
Analyzes Monitoring system buckets
When rule conditions are met, sends a Notification Message to the Notification Endpoint and stores a receipt in the Monitoring Bucket
Records the Notification Endpoint name, Notification Message, Sent Status, and Tags used in the Check
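The flow a Notification Rule follows can be sketched in a few lines of Python. This is a conceptual model only, not InfluxDB code: the `Status` class, `apply_rule`, the `send` callable, and the endpoint name `slack-oncall` are hypothetical, and the receipt is simply a dictionary holding the four fields listed above.

```python
from dataclasses import dataclass, field


@dataclass
class Status:
    """A simplified Status record: a Check's level plus its tags."""
    check_name: str
    level: str
    tags: dict = field(default_factory=dict)


def apply_rule(statuses, rule_level, endpoint_name, send):
    """Send a message for every Status at `rule_level` and return receipts.

    `send` is a stand-in for the call to a 3rd party service; it takes the
    message text and returns True if delivery succeeded.
    """
    receipts = []
    for status in statuses:
        if status.level != rule_level:
            continue
        message = f"{status.check_name} is {status.level}"
        sent = send(message)
        # The receipt mirrors the fields listed on the slide.
        receipts.append({
            "endpoint": endpoint_name,
            "message": message,
            "sent": sent,
            "tags": dict(status.tags),
        })
    return receipts


outbox = []


def fake_slack(message):
    """Pretend endpoint that records the message instead of calling Slack."""
    outbox.append(message)
    return True


statuses = [
    Status("cpu", "crit", {"team": "platform"}),
    Status("mem", "ok", {"team": "platform"}),
]
receipts = apply_rule(statuses, "crit", "slack-oncall", fake_slack)
```

Only the `cpu` Status matches the rule, so one message is sent and one receipt is produced; the `mem` Status is skipped.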
11. Pulling it all together: A Simple Example
Monitor a system’s CPU
Walk Through: Threshold Check to Notification
• Notify on high CPU
Walk Through: Deadman Check to Notification
• Notify when the system stops reporting
14. Using Custom Tags to Classify Checks
• Separation of team concerns
• Designate responsibility for the monitored resources to a particular line-of-business, department, or scrum team
• Separation of location concerns
• Location contexts such as LA datacenter or Raleigh datacenter
• Separation of criticality
• Production vs. Staging vs. Development
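One way to picture how custom tags separate these concerns is as a routing table keyed on tag values. The Python sketch below is an assumption-laden illustration, not an InfluxDB feature: the tag keys (`env`, `team`, `dc`), the endpoint names, and the first-match ordering are all invented for the example.

```python
def matches(status_tags, selector):
    """True if every key/value pair in `selector` is present in `status_tags`."""
    return all(status_tags.get(key) == value for key, value in selector.items())


# Hypothetical routing table: the first matching selector wins,
# so the most specific entries come first.
ROUTES = [
    ({"env": "production", "team": "payments"}, "pagerduty-payments"),
    ({"env": "production"}, "slack-prod-alerts"),
    ({}, "slack-dev-noise"),  # empty selector matches everything
]


def route(status_tags):
    """Return the endpoint name for a Status based on its tags."""
    for selector, endpoint in ROUTES:
        if matches(status_tags, selector):
            return endpoint
    return None


payments_prod = route({"env": "production", "team": "payments", "dc": "LA"})
search_prod = route({"env": "production", "team": "search", "dc": "Raleigh"})
staging = route({"env": "staging", "team": "payments"})
```

Here the team tag decides who gets paged, the environment tag decides criticality, and the datacenter tag rides along for context without affecting the route.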
15. Leveraging Status and Notification Messages
Flux string interpolation is available within both Status and
Notification messages. Values you can use:
• Custom Tags applied to the Checks
• Values from the Query
• The _check_name
• The _level
• The _source_measurement
• The _type
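As a rough picture of what interpolation does to a message template, the Python sketch below mimics the `${r.name}` placeholder style with a regular expression. It is not the Flux implementation: `render`, the `row` dictionary, and the field name `usage_user` are assumptions for illustration, and unknown placeholders are simply left untouched.

```python
import re

# Matches placeholders of the form ${r.name} or ${ r.name }.
PLACEHOLDER = re.compile(r"\$\{\s*r\.(\w+)\s*\}")


def render(template, row):
    """Replace each ${r.key} in `template` with the matching value from `row`."""
    def substitute(match):
        key = match.group(1)
        # Leave the placeholder as written if the row has no such key.
        return str(row[key]) if key in row else match.group(0)

    return PLACEHOLDER.sub(substitute, template)


# A hypothetical row combining system columns, a custom tag, and a query value.
row = {
    "_check_name": "CPU check",
    "_level": "crit",
    "_source_measurement": "cpu",
    "_type": "threshold",
    "team": "platform",      # custom tag applied to the Check
    "usage_user": 93.5,      # value returned by the Query
}

message = render(
    "${r._check_name} is ${ r._level } for team ${r.team}: ${r.usage_user}%",
    row,
)
```

With the row above, `message` becomes `CPU check is crit for team platform: 93.5%`.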
Monitoring Checks call Notification Endpoints via Notification Rules. So, let’s get into each of these.
The paid cloud version has three supported endpoint types.
There is quite a bit of flexibility in those basic building blocks. In this intermediate section I want to walk you through how you can piece together these three components to give your teams much more power and control over how monitoring and alerting are used.
What we don’t want to do is force all our users to create a static one-to-one relationship between a check and the message that is ultimately sent to someone’s phone.