Monitoring even a modestly-sized systems infrastructure quickly becomes untenable without automated alerting. For many metrics it is nontrivial to define ahead of time what constitutes “normal” versus “abnormal” values. This is especially true for metrics whose baseline value fluctuates over time. To make this problem more tractable, Datadog provides outlier detection functionality to automatically identify any host (or group of hosts) that is behaving abnormally compared to its peers. These slides cover the algorithms we use for outlier detection, and show how easy they are to implement using Python. This presentation also covers the lessons we've learned from using outlier detection on our own systems, along with some real-life examples on how to avoid false positives and negatives. Learn more at www.datadoghq.com.