Once upon a time…
No Good Deed…
adatole
@LeonAdato
The Four Questions (every monitoring engineer is asked)
Hello!
▧ Working in IT 30+ years
▧ 20+ years in monitoring
○ CASE $CompanySize
■ <=100
■ >100 && <1000
■ >1000 && <5000
■ >250,000
▧ Currently “Head Geek” at SolarWinds
○ Head Geek <> Developer
○ Head Geek != Marketing
○ Head Geek ≠ Sales
○ “Head Geek” LIKE “%Advocate%”
○ Head Geek == STORYTELLER
Leon Adato
Where To Find Me
Twitter: @LeonAdato
THWACK.com: AdatoLe
Web: www.AdatoSystems.com
Podcast: TechnicallyReligious.com
MONITORING Engineer??
The Four Questions of Alerting
Why didn’t I get an alert?
Why did I get this alert?
What will alert on my system?
What’s being monitored on my system?
The Jewish Roots of…
Questions
▧ “Du fregst a gutte kashe”
▧ Nobel Laureate in Physics Dr. Isidor Rabi
“My mother made me a scientist without ever intending to.
Every other mother in Brooklyn would ask her child:
“So? Did you learn anything today?”
But not my mother.
“Izzy,” she would say, “did you ask a good question today?”
That difference — asking good questions — made me
become a scientist.”
▧ THE Four Questions
מַה נִּשְׁתַּנָּה הַלַּיְלָה הַזֶּה מִכָּל הַלֵּילוֹת
Why is this night different from all other nights?
What’s the Teretz?*
▧ We need the same openness to questions
▧ Relish the experience of asking, of discovery
▧ We don’t work in tech because “I already know that”
▧ We work in tech because we love “I’ll find out”
*Teretz = answer

Question #1: Why did I get that alert?
“Your system is down.”
☺
CPU on the Windows device owned by Accounting named Mnth_Reporting (IP: 10.2.3.4, DNS: MonRep.MyCorp.Net) has been over 80% for more than 15 minutes. CPU at 2:16am EST is 96%.
Device details: http://blahblah.
Acknowledge this alert: http://ackme
This message brought to you by the alert: CPU_CRIT_PROD and the polling engine Poller7
What’s the Teretz?
▧ Name of the system
▧ Specific component or sub-element
▧ Current statistic or status
▧ Time the event occurred
▧ Time the alert was sent
▧ Custom fields like location, owner, etc.
▧ OS type and version
▧ IP address
▧ DNS name or Sysname
▧ The threshold
▧ The duration
▧ A link to the device or metric
▧ The name of the alert
▧ The polling engine
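The list above can be read as a template for the alert message itself. The following is a minimal sketch, assuming a hypothetical `AlertContext` record and `render_alert` helper (neither comes from any particular monitoring product); the sample values are the ones from the example alert on the earlier slide.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class AlertContext:
    """Fields an alert message should carry (all names are illustrative)."""
    system_name: str
    owner: str            # custom field
    os_type: str
    ip_address: str
    dns_name: str
    component: str        # specific component or sub-element
    current_value: float  # current statistic, in percent
    threshold: float      # in percent
    duration_minutes: int
    event_time: datetime  # when the event occurred
    sent_time: datetime   # when the alert was sent
    details_url: str
    ack_url: str
    alert_name: str
    polling_engine: str


def render_alert(ctx: AlertContext) -> str:
    """Build a message that answers "why did I get this alert?" up front."""
    return (
        f"{ctx.component} on the {ctx.os_type} device owned by {ctx.owner} "
        f"named {ctx.system_name} (IP: {ctx.ip_address}, DNS: {ctx.dns_name}) "
        f"has been over {ctx.threshold:.0f}% for more than "
        f"{ctx.duration_minutes} minutes. "
        f"{ctx.component} at {ctx.event_time:%H:%M} is {ctx.current_value:.0f}%.\n"
        f"Alert sent at {ctx.sent_time:%H:%M}.\n"
        f"Device details: {ctx.details_url}\n"
        f"Acknowledge this alert: {ctx.ack_url}\n"
        f"This message brought to you by the alert: {ctx.alert_name} "
        f"and the polling engine {ctx.polling_engine}"
    )


example = AlertContext(
    system_name="Mnth_Reporting", owner="Accounting", os_type="Windows",
    ip_address="10.2.3.4", dns_name="MonRep.MyCorp.Net", component="CPU",
    current_value=96.0, threshold=80.0, duration_minutes=15,
    event_time=datetime(2024, 1, 1, 2, 16),   # date is a placeholder
    sent_time=datetime(2024, 1, 1, 2, 17),    # send time is an assumption
    details_url="http://blahblah", ack_url="http://ackme",
    alert_name="CPU_CRIT_PROD", polling_engine="Poller7",
)
message = render_alert(example)
print(message)
```

Keeping every field in one record makes it harder to ship an alert that only says “Your system is down.”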
Jewish Roots:
My Father Was a Wandering Aramean
▧ The story
… before the story
…… before the story
▧ Context matters!
▧ History matters!
▧ Not only “why did I get this alert”
▧ But “why do these alerts exist at all?”
Question #2: Why DIDN’T I Get That Alert?
▧ It was designed like that
○ Alert windows
○ Problem duration
○ Ticket not reset
○ Mute/unmanage/shut-up
○ Parent-child
▧ Change Un-Control
○ Credential changed
○ Network changed
○ Custom Property
○ Element removed
○ Physical to Virtual
▧ Monitoring Failed
○ Polling stopped
○ Agent stopped
○ Data throttled
○ Database is out of sync
○ New code/image missing tracing
○ Monitoring “supply chain” failed (email)
○ Event correlation rules
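Several of the “designed like that” reasons above can be checked mechanically. This is a rough sketch, assuming a hypothetical `AlertRule` record and a `why_no_alert` helper; it covers only the alert window, problem duration, mute, and parent-child cases, and assumes the window does not wrap past midnight.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta


@dataclass
class AlertRule:
    """Illustrative rule settings; field names are assumptions."""
    window_start: time           # alerts only fire inside this daily window
    window_end: time             # assumed to be later than window_start
    min_duration: timedelta      # problem must persist at least this long
    muted: bool = False          # device is muted or unmanaged
    parent_down: bool = False    # parent-child suppression is active


def why_no_alert(rule: AlertRule, problem_start: datetime, now: datetime) -> list:
    """Return the by-design reasons an alert would not fire right now."""
    reasons = []
    if not (rule.window_start <= now.time() <= rule.window_end):
        reasons.append("outside the alert window")
    if now - problem_start < rule.min_duration:
        reasons.append("problem has not lasted long enough")
    if rule.muted:
        reasons.append("device is muted or unmanaged")
    if rule.parent_down:
        reasons.append("suppressed because the parent is down")
    return reasons


rule = AlertRule(time(8, 0), time(18, 0), timedelta(minutes=15))
reasons = why_no_alert(
    rule,
    problem_start=datetime(2024, 1, 1, 2, 10),
    now=datetime(2024, 1, 1, 2, 16),
)
print(reasons)
```

An empty list means none of these by-design reasons apply, so the next places to look are change control and the monitoring system itself.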
What’s the Teretz?
▧ Understand (and communicate) exceptions
▧ Save your receipts
▧ Save other people’s receipts too, if you can
▧ Monitor your monitoring
▧ Test your notification delivery infrastructure
▧ Have validation steps ready
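One way to “monitor your monitoring” is to watch for pollers or agents that have gone quiet. The sketch below assumes each poller records a last-seen timestamp somewhere you can read it; the `stale_pollers` function, the five-minute limit, and the name Poller8 are all illustrative.

```python
from datetime import datetime, timedelta


def stale_pollers(last_seen, now, max_age=timedelta(minutes=5)):
    """Return the names of pollers whose last heartbeat is older than max_age."""
    return sorted(name for name, seen in last_seen.items() if now - seen > max_age)


now = datetime(2024, 1, 1, 2, 16)
heartbeats = {
    "Poller7": now - timedelta(minutes=1),   # reported recently
    "Poller8": now - timedelta(minutes=42),  # hypothetical poller that went quiet
}
stale = stale_pollers(heartbeats, now)
print(stale)
```

The same pattern applies to the notification path: send a test message on a schedule and treat a missing delivery as an alert in its own right.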
Question #3: What’s monitored on my system(s)?
Alerting ≠ Monitoring
What’s the Teretz?
▧ One size fits… some?
▧ Skillcheck: SQL
▧ Skillcheck: Wireshark
▧ Look at the screens
Jewish Roots:
Burning Hail and Black Swans
▧ Why do we remember the plagues?
○ Visceral, unexpected, unique
▧ Let’s talk about “black swans”
▧ The plagues as black swan events
Question #4: What COULD alert for my systems?
▧ What *IS* an alert?
○ Emergency
○ Interruption
○ Unplanned Work
▧ What does alerting NEED to be?
○ Timely
○ Meaningful
○ Actionable
▧ Why does this matter?
○ # of systems
○ # of alerts that can trigger for those systems
○ # of staff hours to address those alerts
○ # of alerts that could trigger simultaneously
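The four counts above multiply together quickly. Here is a back-of-the-envelope sketch, assuming a hypothetical `alert_load` helper; every input number is made up for illustration and should be replaced with figures from your own environment.

```python
def alert_load(systems, alerts_per_system, percent_firing, minutes_per_alert):
    """Estimate possible alerts, expected alerts, and staff hours for one period."""
    # Upper bound: every alert on every system fires at the same time.
    possible = systems * alerts_per_system
    # Expected count if only a given percentage of them actually fire.
    expected = possible * percent_firing // 100
    staff_hours = expected * minutes_per_alert / 60
    return {"possible": possible, "expected": expected, "staff_hours": staff_hours}


# Illustrative inputs: 200 systems, 12 alert rules each,
# 5% firing in the period, 30 minutes of work per alert.
load = alert_load(systems=200, alerts_per_system=12,
                  percent_firing=5, minutes_per_alert=30)
print(load)
```

Even with these modest inputs, the possible count is the figure that matters on a bad day, because it bounds how many alerts could trigger simultaneously.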
What’s the Teretz?
▧ This can be a VERY difficult question to answer
▧ But its difficulty is in proportion to its importance
▧ Speaks to potential impact to the company, workload, interruptions.
Jewish Roots:
Are You Ready For the Hard Questions?
▧ Scholar, Skeptic, Simple, & Silent
▧ Meet each user where they are
▧ Let’s talk about the Skeptic (“the wicked son”)
▧ Listen past the snark for the question
Question #5: What Do You Monitor “Standard”?
Wait, I thought you said FOUR questions!
Jewish Roots:
Four or Five cups?
▧ Symbolism of wine as joy
▧ We need to remember to pause for joyful moments
▧ Despite rigorous Talmudic analysis, there are still questions without clear answers.
▧ BUT… that doesn’t mean we disengage.
▧ We return to these questions over and over, try new approaches.
A lot like IT problems.
OK, So That Fifth Question:
What Do You Monitor “Standard”?
▧ When you load up a box into monitoring, what do consumers automatically get?
▧ If you can’t describe this, how will anyone know what to ask for “extra”?
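Writing the standard set down, even as a short list, makes the “extra” request easy to work out. A minimal sketch follows, assuming a hypothetical `STANDARD_PROFILE` and `extras_to_request` helper; the monitor names are placeholders, not a recommended baseline.

```python
# Hypothetical "standard" profile: what every new box gets automatically.
STANDARD_PROFILE = frozenset({"availability", "cpu", "memory", "disk_space"})


def extras_to_request(wanted, standard=STANDARD_PROFILE):
    """Return the monitors a consumer wants that are not in the standard set."""
    return sorted(set(wanted) - standard)


# A consumer asks for CPU, memory, and a SQL query timer (illustrative names).
extras = extras_to_request({"cpu", "memory", "sql_query_time"})
print(extras)
```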
The Mostly Un-Necessary Summary
Being prepared for the 4 (ok 5) questions
▧ Your monitoring will be (better) prepared for the stresses it will be exposed to.
▧ You will be (better) prepared as an advocate for monitoring
▧ You’ll spend less time answering repetitive questions and more time doing the work of a monitoring engineer. (i.e.: the GOOD stuff!)
If you still have questions…
Thank You!
I’m READY
Tell me what questions you have
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 

The Four Questions (Every Monitoring Engineer gets asked), by Leon Adato

  • 1. Once upon a time…
  • 3. The Four Questions (every monitoring engineer is asked)
  • 4. Hello! ▧ Working in IT 30+ years ▧ 20+ years in monitoring ○ CASE $CompanySize ■ <=100 ■ >100 && <1000 ■ >1000 && <5000 ■ >250,000 ▧ Currently “Head Geek” at SolarWinds ○ Head Geek <> Developer ○ Head Geek != Marketing ○ Head Geek ≠ Sales ○ “Head Geek” LIKE “%Advocate%” ○ Head Geek == STORYTELLER Leon Adato
  • 5. Where To Find Me Twitter: @LeonAdato THWACK.com: AdatoLe WWWeb: www.AdatoSystems.com Podcast: TechnicallyReligious.com
  • 7. The Four Questions of Alerting Why didn’t I get an alert? Why did I get this alert? What will alert on my system? What’s being monitored on my system? adatole @LeonAdato
  • 8. The Jewish Roots of… Questions ▧ “Du fregst a gutte kashe” (Yiddish: “You ask a good question”) ▧ Nobel Laureate in Physics Dr. Isidor Rabi
  • 9. “My mother made me a scientist without ever intending to. Every other mother in Brooklyn would ask her child: ‘So? Did you learn anything today?’ But not my mother. ‘Izzy,’ she would say, ‘did you ask a good question today?’ That difference, asking good questions, made me become a scientist.”
  • 10. The Jewish Roots of… Questions ▧ “Du fregst a gutte kashe” (Yiddish: “You ask a good question”) ▧ Nobel Laureate in Physics Dr. Isidor Rabi ▧ THE Four Questions: מה נשתנה הלילה הזה מכל הלילות (Why is this night different from all other nights?)
  • 11. What’s the Teretz?* ▧ We need the same openness to questions ▧ Relish the experience of asking, of discovery ▧ We don’t work in tech because “I already know that” ▧ We work in tech because we love “I’ll find out” *Teretz = answer
  • 12. Question #1: Why did I get that alert? ☹ “Your system is down.” ☺ “CPU on the Windows device owned by Accounting named Mnth_Reporting (IP: 10.2.3.4, DNS: MonRep.MyCorp.Net) has been over 80% for more than 15 minutes. CPU at 2:16am EST is 96%. Device details: http://blahblah. Acknowledge this alert: http://ackme This message brought to you by the alert: CPU_CRIT_PROD and the polling engine Poller7”
  • 13. What’s the Teretz? ▧ Name of the system ▧ Specific component or sub-element ▧ Current statistic or status ▧ Time the event occurred ▧ Time the alert was sent ▧ Custom fields like location, owner, etc. ▧ OS type and version ▧ IP address ▧ DNS name or Sysname ▧ The threshold ▧ The duration ▧ A link to the device or metric ▧ The name of the alert ▧ The polling engine
  • 14. Jewish Roots: My Father Was a Wandering Aramean ▧ The story … before the story …… before the story ▧ Context matters! ▧ History matters! ▧ Not only “why did I get this alert” ▧ But “why do these alerts exist at all?”
  • 15. Question #2: Why DIDN’T I Get That Alert?
  • 16. Question #2: Why DIDN’T I Get That Alert? ▧ It was designed like that ○ Alert windows ○ Problem duration ○ Ticket not reset ○ Mute/unmanage/shut-up ○ Parent-child
  • 17. Question #2: Why DIDN’T I Get That Alert? ▧ Change Un-Control ○ Credential changed ○ Network changed ○ Custom Property ○ Element removed ○ Physical to Virtual
  • 18. Question #2: Why DIDN’T I Get That Alert? ▧ Monitoring Failed ○ Polling stopped ○ Agent stopped ○ Data throttled ○ Db is out of sync ○ New code/image missing tracing ○ Monitoring “supply chain” failed (email) ○ Event correlation rules
  • 19. What’s the Teretz? ▧ Understand (and communicate) exceptions ▧ Save your receipts ▧ Save other people’s receipts too, if you can ▧ Monitor your monitoring ▧ Test your notification delivery infrastructure ▧ Have validation steps ready
  • 20. Question #3: What’s monitored on my system(s)?
  • 22. Question #3: What’s monitored on my system(s)?
  • 23. What’s the Teretz? ▧ One size fits… some? ▧ Skillcheck: SQL ▧ Skillcheck: Wireshark ▧ Look at the screens
  • 24. Jewish Roots: Burning Hail and Black Swans ▧ Why do we remember the plagues? ○ Visceral, unexpected, unique ▧ Let’s talk about “black swans” ▧ The plagues as black swan events
  • 25. Question #4: What COULD alert for my systems?
  • 26. Question #4: What COULD alert for my systems? ▧ What *IS* an alert? ○ Emergency ○ Interruption ○ Unplanned Work ▧ What does alerting NEED to be? ○ Timely ○ Meaningful ○ Actionable
  • 27. Question #4: What COULD alert for my systems? ▧ Why does this matter? ○ # of systems ○ # of alerts that can trigger for those systems ○ # of staff hours to address those alerts ○ # of alerts that could trigger simultaneously
  • 28. What’s the Teretz? ▧ This can be a VERY difficult question to answer ▧ But its difficulty is in proportion to its importance ▧ Speaks to potential impact to the company, workload, interruptions.
  • 30. Jewish Roots: Are You Ready For the Hard Questions? ▧ Scholar, Skeptic, Simple, & Silent ▧ Meet each user where they are ▧ Let’s talk about the Skeptic (“the wicked son”) ▧ Listen past the snark for the question
  • 31. Question #5: What Do You Monitor “Standard”?
  • 32. Wait, I thought you said FOUR questions!
  • 33. Jewish Roots: Four or Five cups? ▧ Symbolism of wine as joy ▧ We need to remember to pause for joyful moments ▧ Despite rigorous Talmudic analysis, there are still questions without clear answers. ▧ BUT… that doesn’t mean we disengage. ▧ We return to these questions over and over, try new approaches. A lot like IT problems.
  • 34. OK, So That Fifth Question: What Do You Monitor “Standard”? ▧ When you load up a box into monitoring, what do consumers automatically get? ▧ If you can’t describe this, how will anyone know what to ask for “extra”?
  • 35. The Mostly Un-Necessary Summary Being prepared for the 4 (ok 5) questions ▧ Your monitoring will be (better) prepared for the stresses it will be exposed to. ▧ You will be (better) prepared as an advocate for monitoring ▧ You’ll spend less time answering repetitive questions and more time doing the work of a monitoring engineer (i.e., the GOOD stuff!)
  • 36. If you still have questions…
  • 37. Thank You! I’m READY. Tell me what questions you have.
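
Slides 12 and 13 list the details an alert message should carry so that it answers "Why did I get that alert?" without a follow-up question. The Python sketch below is one possible way to template such a message. It is an illustration, not the speaker's implementation or any monitoring product's API: the `AlertContext` class, its field names, and the `render_alert` function are hypothetical, and the calendar date and the "sent" time are placeholders. The device, threshold, and alert names are the sample values from slide 12.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class AlertContext:
    """Hypothetical container for the details listed on slide 13."""
    device_name: str
    owner: str
    os_type: str
    ip_address: str
    dns_name: str
    component: str
    current_value: float
    threshold: float
    duration_minutes: int
    event_time: datetime
    sent_time: datetime
    details_url: str
    alert_name: str
    polling_engine: str


def render_alert(ctx: AlertContext) -> str:
    """Build a message that explains itself: what, where, how bad, since when."""
    return (
        f"{ctx.component} on the {ctx.os_type} device owned by {ctx.owner} "
        f"named {ctx.device_name} (IP: {ctx.ip_address}, DNS: {ctx.dns_name}) "
        f"has been over {ctx.threshold:.0f}% for more than "
        f"{ctx.duration_minutes} minutes. "
        f"{ctx.component} at {ctx.event_time:%H:%M} is {ctx.current_value:.0f}%. "
        f"Alert sent at {ctx.sent_time:%H:%M}. "
        f"Device details: {ctx.details_url} "
        f"This message brought to you by the alert: {ctx.alert_name} "
        f"and the polling engine {ctx.polling_engine}"
    )


# Sample values from slide 12; the date and the sent time are placeholders.
example = AlertContext(
    device_name="Mnth_Reporting",
    owner="Accounting",
    os_type="Windows",
    ip_address="10.2.3.4",
    dns_name="MonRep.MyCorp.Net",
    component="CPU",
    current_value=96,
    threshold=80,
    duration_minutes=15,
    event_time=datetime(2024, 1, 1, 2, 16),
    sent_time=datetime(2024, 1, 1, 2, 17),
    details_url="http://blahblah",
    alert_name="CPU_CRIT_PROD",
    polling_engine="Poller7",
)

message = render_alert(example)
print(message)
```

Keeping the event time and the sent time as separate fields makes delivery delays visible, and naming the alert and polling engine in the message tells the recipient where to look when asking why the alert fired.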