Mean Time to Resolution (MTTR) is a foundational KPI for most organizations. DevOps and SRE teams are under intense pressure to reduce MTTR when resolving incidents. Often, parts of the incident response process are manual: teams pull together alerts, runbooks, ad-hoc scripts, and people to form a response.
In this webinar, we will show you how to improve resolution time by configuring InfluxDB notification endpoints to PagerDuty and triggering auto-remediations with Rundeck. Using Rundeck’s automated runbooks, customers have experienced up to 50% reduction in incident response time, greatly improving team productivity and reducing unnecessary outage time.
Streamline Incident Response with InfluxDB, PagerDuty and Rundeck
1. How to Streamline Incident Response with
InfluxDB, PagerDuty and Rundeck
April 20th, 2021
2. Speaker: Craig Hobbs
Craig is a Solution Consultant, SyFy Super Fan,
and Do-gooder at Rundeck.
He has 10 years of experience in system
integrations, environment observability, and
application performance. At Rundeck, he helps
DevOps and IT teams leverage Runbook
Automation and Orchestration to solve complex
automation challenges. #BlackLivesMatter
#FightsForTheUser
Twitter: @chobbs
3. Agenda
1 MTTR Impact on DevOps
2 Shorten Resolution Time
3 Solution Overview
4 Demo
5 Q/A
4. 2021 Prediction for IT Automation:
“Organizations will lower operational costs by 30% by combining
hyper-automation technologies with redesigned operational
processes”
- Gartner’s IT Automation Predictions for 2021
5. MTTR (mean time to resolution)
“Average time it takes to fully resolve a
failure.”
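As a rough illustration of the definition above, MTTR can be computed as the mean of (resolved - opened) over a set of incidents. This is a minimal sketch; the incident timestamps and the function name `mean_time_to_resolution` are hypothetical and not from the webinar.

```python
from datetime import datetime, timedelta


def mean_time_to_resolution(incidents):
    """Return the average (resolved - opened) duration across incidents."""
    if not incidents:
        raise ValueError("need at least one resolved incident")
    total = sum((resolved - opened for opened, resolved in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incident log: (opened, resolved) timestamps.
incidents = [
    (datetime(2021, 4, 1, 9, 0), datetime(2021, 4, 1, 9, 30)),    # 30 minutes
    (datetime(2021, 4, 2, 14, 0), datetime(2021, 4, 2, 15, 30)),  # 90 minutes
    (datetime(2021, 4, 3, 22, 0), datetime(2021, 4, 3, 23, 0)),   # 60 minutes
]

mttr = mean_time_to_resolution(incidents)
print(mttr)  # 1:00:00
```

Teams differ on when the clock starts and stops (detection, acknowledgement, full fix), so the timestamps fed into a calculation like this should match whichever definition the team has agreed on.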
6. Impact of MTTR on DevOps
While MTTR is a critical metric for DevOps teams on its own, it also encourages
DevOps practices in a variety of ways:
● Lower impact of production incidents
● Save time and reduce escalation
● Monitor for problems actively
● Improve velocity, quality and performance
14. Use-Case - “Virtual DevOps”
Status:
1. Customer care teams are an
integral part of any organization.
2. DevOps teams are often inundated with
manual requests from the customer
service team for help with
resources to resolve trivial issues.
3. Customer care teams then diagnose
and resolve the issue.
[Diagram: Customer Care and DevOps]
16. Rundeck Template for InfluxDB
The Rundeck template does the following:
● Leverage the Rundeck API for
execution meta-data
● Facilitate secure, portable, and
source-controlled Rundeck job
states.
● Simplify sharing and using pre-built
InfluxDB solutions.
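The first bullet above could look roughly like the following: fetch one execution's metadata from the Rundeck API and turn it into an InfluxDB line-protocol point. This is a sketch under assumptions, not the template's actual implementation. The API version number, the measurement name `rundeck_execution`, and the sample execution record are all hypothetical, and the response field names should be checked against the Rundeck API documentation for your version.

```python
import urllib.request


def execution_request(base_url, execution_id, token, api_version=38):
    """Build (but do not send) a request for one execution's metadata.

    The api_version default is a placeholder; use the version your server supports.
    """
    url = f"{base_url.rstrip('/')}/api/{api_version}/execution/{execution_id}"
    return urllib.request.Request(
        url,
        headers={"X-Rundeck-Auth-Token": token, "Accept": "application/json"},
    )


def _escape_tag(value):
    # Line protocol requires escaping commas, equals signs, and spaces in tag values.
    return str(value).replace(",", r"\,").replace("=", r"\=").replace(" ", r"\ ")


def to_line_protocol(execution, measurement="rundeck_execution"):
    """Convert an execution record into one line-protocol point."""
    started_ms = execution["date-started"]["unixtime"]
    ended_ms = execution["date-ended"]["unixtime"]
    tags = {
        "project": execution["project"],
        "job": execution["job"]["name"],
        "status": execution["status"],
    }
    tag_str = ",".join(f"{k}={_escape_tag(v)}" for k, v in sorted(tags.items()))
    # Integer fields carry an "i" suffix; timestamps default to nanoseconds.
    return f"{measurement},{tag_str} duration_ms={ended_ms - started_ms}i {ended_ms * 1_000_000}"


# Hypothetical execution record, shaped like an assumed API response.
sample = {
    "id": 42,
    "status": "succeeded",
    "project": "ops",
    "job": {"name": "Restart Web"},
    "date-started": {"unixtime": 1618920000000},
    "date-ended": {"unixtime": 1618920042000},
}

req = execution_request("https://rundeck.example.com", 42, "hypothetical-token")
line = to_line_protocol(sample)
print(req.full_url)
print(line)
```

Keeping the request builder separate from the conversion step makes it possible to check the line-protocol output without a running Rundeck server.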
17. Working Together for Shorter Incidents
By combining real-time monitoring from
InfluxDB, faster response organization
from PagerDuty, and automated runbook
orchestration from Rundeck, DevOps
teams can shorten incident time and
reduce errors.
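To make the hand-off between the tools more concrete, the sketch below builds the kind of event a monitoring check might send to the PagerDuty Events API v2 when its status changes. It is illustrative only: InfluxDB's PagerDuty notification endpoint handles this step for you, and the level-to-severity mapping, the dedup key format, and the check and host names here are assumptions. Nothing is sent over the network.

```python
import json

PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"

# Assumed mapping from check levels to PagerDuty severities.
SEVERITY_BY_LEVEL = {"CRIT": "critical", "WARN": "warning", "INFO": "info"}


def build_pagerduty_event(routing_key, check_name, level, source, message):
    """Build an Events API v2 body: trigger on a bad level, resolve on OK."""
    action = "resolve" if level == "OK" else "trigger"
    event = {
        "routing_key": routing_key,
        "event_action": action,
        # A stable dedup key lets the later resolve event close the same incident.
        "dedup_key": f"{source}:{check_name}",
    }
    if action == "trigger":
        event["payload"] = {
            "summary": message,
            "source": source,
            "severity": SEVERITY_BY_LEVEL[level],
        }
    return event


# Hypothetical check results for a host named web-01.
trigger = build_pagerduty_event(
    "hypothetical-routing-key", "cpu_check", "CRIT", "web-01", "CPU usage above 90%"
)
resolve = build_pagerduty_event(
    "hypothetical-routing-key", "cpu_check", "OK", "web-01", "CPU usage back to normal"
)

print(json.dumps(trigger, indent=2))
print(json.dumps(resolve, indent=2))
```

From there, a PagerDuty incident can kick off a Rundeck job (for example through a webhook), which is where the automated runbook takes over.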
20. However you measure resolution time, the one
constant is the need to keep that number down.
21. Runbook Automation
● Enable anyone to have
self-service automation access
to operations tasks that were
only available to subject matter
experts.
● Make existing automation more
secure, auditable, and easier to
run.
23. Rundeck Enterprise
Capabilities
Distributed execution
Orchestration workflows
Error handling
Healthchecks
Webhooks
Scheduling
Guided tours
Secure key storage
Role-based access
SSO support
History and audit trail
Ticket integration
Clustering
HA and failover
Plugin repositories
Use your existing tools and scripts (any language or automation tools)
Infrastructure aware
Made for DevOps and Cloud Native ways of working
Security and compliance friendly
Infrastructure details and state
Collect and Process Output
Authentication and Roles
Tickets, Work Status, Approvals
Workflow and Scheduling
24. Case for Self-Service Automation
How can we reduce the burden on SRE teams
and empower customer care?
Make this information available at the click of a
button! This is where the power of self-service
automation comes into the picture.
● Self-service automation: Equip customer
care teams with the ability to resolve issues
quickly and allow the SRE team to maintain a
set of standards and practices for accessing
secure internal operations.
25. What Rundeck Enterprise Provides
26. We look forward to bringing together our
community of developers to learn, interact
and share tips and use cases.
10-11 May 2021
Hands-On Flux Training
18-19 May 2021
Virtual Experience
www.influxdays.com/emea-2021-virtual-experience/