Observability will not fix your Broken Monitoring, Ignite

Kris Buytaert @krisbuytaert
Observability will not fix your broken monitoring
Devopsdays Berlin, September 2022
O11y 1

This new Hype
• Docker, Docker, Docker, Docker, Docker, Docker...
• Kubernetes, Kubernetes, Kubernetes, Kubernetes, Kubernetes, Kubernetes...
• O11y, o11y, o11y, o11y, o11y, o11y, o11y...
• We’re all doing this, right?
• This is the new default, right?
O11y 2

A real-life story:
Large government agency
• Large Checkmk setup
• Lots of custom checks
• No automation, checks are created manually
• Custom CMDB, which is out of sync with reality
O11y 3

The unhappy on-call (SRE) team
• Happy with their tool, but not happy with:
• Being left out of the information loop (e.g. when a service would be decommissioned)
• No known ownership for services
• Management wants “observability”
O11y 4

Their Plan
• Start from scratch
• Move to Prometheus (<- insert shiny new tool here)
• One-year effort with a focus on the new technology stacks (k8s)
• Then migrate the old monitoring
O11y 5

Result
• The old tool is still the primary alerting tool
• Rather than moving forward, they added another tool to manage
• Now they manage two stacks
• No real observability ever happened
• 12 months later, the Prometheus stack is unmaintained
O11y 6

This is NOT an isolated case
• Encountered multiple similar cases
• Pattern:
• while true; do
• This tool stinks, let’s do this over again with a new tool.
• We implement exactly the same broken setup, but with a different tool
O11y 7

Where is the real observability?
• Often we have metrics
• But only for a week
• Often we have lost our long-term metrics
• Often we have logs
• But no derived metrics
• We are only alerting on those metrics
• We are not learning from our metrics
• We’ve regressed
O11y 8
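
One way to read the "logs but no derived metrics" point: count events per time bucket straight from the log lines. The following is a minimal Python sketch, not something shown in the talk. It assumes a hypothetical log format of `<ISO timestamp> <LEVEL> <message>`; the function name and the sample lines are illustrative only.

```python
from collections import Counter
from datetime import datetime


def errors_per_minute(lines):
    """Count ERROR lines per minute from '<ISO timestamp> <LEVEL> <message>' lines."""
    counts = Counter()
    for line in lines:
        parts = line.split(maxsplit=2)
        if len(parts) < 2 or parts[1] != "ERROR":
            continue  # skip malformed lines and non-error levels
        try:
            ts = datetime.fromisoformat(parts[0])
        except ValueError:
            continue  # skip lines whose first field is not a timestamp
        counts[ts.strftime("%Y-%m-%d %H:%M")] += 1
    return dict(counts)


# Hypothetical sample data (assumed log format, not from the talk).
sample = [
    "2022-01-01T10:00:05 INFO request served",
    "2022-01-01T10:00:17 ERROR upstream timeout",
    "2022-01-01T10:00:48 ERROR upstream timeout",
    "2022-01-01T10:01:02 ERROR disk full",
    "not a log line",
]
print(errors_per_minute(sample))
```

A per-minute error count like this can be stored and graphed over the long term, which is where learning from the data starts, rather than only alerting on it.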

What’s your goal in observability?
• We expect performance problems
• We really have performance problems
• We have chaos and want better insight into what we run
• Gartner told us.
• We need more Hipster Credits
• We just want Prometheus, Loki and Tempo
O11y 9

First Steps
• Fix your monitoring
• Create a single source of truth
• No manual monitoring configuration (automation)
• Create clear and actionable alerts
• Keep it GREEN
O11y 10
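
To make "single source of truth" plus "no manual monitoring configuration" concrete, here is a hedged Python sketch that renders scrape targets from one inventory. The inventory layout, host names, owners and ports are all hypothetical. The output follows the JSON shape that Prometheus file-based service discovery reads (a list of objects with `targets` and `labels`); any other monitoring tool would need its own renderer.

```python
import json

# Hypothetical inventory; in practice this would be exported from the
# single source of truth instead of being typed in by hand.
INVENTORY = [
    {"host": "db01.example.org", "service": "mysql", "owner": "team-data", "port": 9104},
    {"host": "db02.example.org", "service": "mysql", "owner": "team-data", "port": 9104},
    {"host": "web01.example.org", "service": "frontend", "owner": "team-web", "port": 9100},
]


def render_file_sd(inventory):
    """Group inventory entries by (service, owner) into file_sd target groups."""
    groups = {}
    for entry in inventory:
        key = (entry["service"], entry["owner"])
        groups.setdefault(key, []).append(f'{entry["host"]}:{entry["port"]}')
    return [
        {"targets": sorted(targets), "labels": {"service": service, "owner": owner}}
        for (service, owner), targets in sorted(groups.items())
    ]


print(json.dumps(render_file_sd(INVENTORY), indent=2))
```

Carrying an `owner` label on every target is one possible way to tackle the "no known ownership for services" complaint from the earlier story: every alert then arrives with a team name attached.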

Fix your metrics / logs ...
• Fix your metrics
• I bet you have regressions in shipping your metrics
• I bet your log shipping is partially broken
• I bet you have broken dashboards
O11y 11
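
A quick way to test those bets is to look at when each source last delivered data. This is a sketch under assumptions: it presumes you can obtain a last-seen timestamp per source from your metrics or log store, and the source names and the 15-minute threshold are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def stale_sources(last_seen, now, max_age=timedelta(minutes=15)):
    """Return the names of sources whose newest data point is older than max_age."""
    return sorted(name for name, seen in last_seen.items() if now - seen > max_age)


# Hypothetical last-seen timestamps; a real check would query the backend.
now = datetime(2022, 1, 1, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "web01-metrics": now - timedelta(minutes=1),
    "web01-logs": now - timedelta(hours=6),
    "db01-metrics": now - timedelta(days=3),
}
print(stale_sources(last_seen, now))
```

Anything this returns is a shipper, exporter or dashboard input that silently stopped working and should be fixed before new tooling is added on top.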

Ask
• Who wants observability?
• Devs / Management / Ops?
• What do they really want?
• Get them in one room
• Ask them what is really hurting them
• Where they need help ...
• Listen
• This sounds trivial ... yet after more than 10 years of devops, still ...
O11y 12

What is still missing?
• Probably nothing
• This might be sufficient for your use cases.
• Except if it isn’t.
• You might need traces
O11y 13

The Tooling Ecosystem
• Choose an Open source Observability Stack
• Beware of the Fauxpen Source
• Build your automated Observability Infrastructure
• Monitor it
• Pick a Project to start investigating.
• Build dashboards together with your peers.
O11y 14

Will this fit my ecosystem?
• My proprietary vendor claims it works out of the box.
• But my developers say it doesn’t.
• Trust your devs ;)
O11y 15

Pitfalls of Observability
• You will DDoS yourselves
• A PromQL query for all MySQL parameters from the MySQL exporter
• Flood your disks, kill your LTS (long-term storage)
• Trace all the things
• You will DDoS yourselves
O11y 17
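
Some back-of-envelope arithmetic shows how quickly "collect everything" floods disks. The Python sketch below is illustrative only: the host count, series per host, scrape interval, retention and bytes-per-sample figure are all assumptions, to be replaced with numbers measured on your own setup.

```python
SECONDS_PER_DAY = 86_400


def samples_per_day(hosts, series_per_host, scrape_interval_s):
    """Samples ingested per day if every series is scraped at a fixed interval."""
    return hosts * series_per_host * (SECONDS_PER_DAY // scrape_interval_s)


def storage_bytes(daily_samples, retention_days, bytes_per_sample):
    """Rough on-disk size for the retention window, ignoring indexes and overhead."""
    return daily_samples * retention_days * bytes_per_sample


# Assumed figures: 50 database hosts, 1,000 series each, scraped every 15 s,
# kept for one year at an assumed 2 bytes per sample after compression.
daily = samples_per_day(hosts=50, series_per_host=1_000, scrape_interval_s=15)
yearly = storage_bytes(daily, retention_days=365, bytes_per_sample=2)
print(f"{daily:,} samples/day, about {yearly / 1e9:.0f} GB for one year")
```

The numbers scale linearly with every factor, so doubling the series count or halving the scrape interval doubles both ingestion load and long-term storage.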

Remember
• You might not need Observability (yet)
• But you DO need to fix your monitoring
• And then you can think about o11y
• But just adopting o11y will not fix your broken culture.
O11y 18

Kris Buytaert
• I used to be a developer
• Then I became an Ops person
• Chief Trolling/Travel/Technical Officer @ Inuits.eu
• Chief Yak Shaver @ o11y.eu
• Organiser of #devopsdays, #cfgmgmtcamp, #loadays, ...
• Cofounder of all of the above
• Everything is a Freaking DNS Problem
• DNS : devops needs sushi
• @krisbuytaert on twitter/github
O11y 19

Kris Buytaert @krisbuytaert kris@inuits.eu
o11y, a subdivision of Inuits
Essensteenweg 31 2930 Brasschaat Belgium
info@o11y.eu
O11y 20
