Dynamic Infrastructure and Container Monitoring with
PROMETHEUS
Georg Öttl
Infracoders Meetup, 2017, Graz
Follow @goettl
About me
● Enterprise Software dev
● Data Science Services
● Dev / DevOps / Ops
● Developer who likes Math
Twitter: @goettl
Overview
● Monitoring
● Prometheus by example
● DevOps demo, scaling Gitblit
● Analyze Prometheus metrics like a data scientist
Monitoring
Why is monitoring a DevOps topic?
● Check functionality / performance
● Analyse behavior
● Insight into how software works
● Trend analytics / resources
You build it, you run it!
Metrics, tracing, logging?
Blog: Peter Bourgon, "Metrics, Tracing and Logging"
Well known monitoring tools
● Nagios, Check_MK
● OpenTSDB, Graphite
● InfluxDB + Kapacitor (similar to Prometheus)
● Elasticsearch + Logstash + Kibana + ...
● ...
Hard to use in a DevOps stack
Rule #1
"Spend more time working on code that analyzes the meaning
of metrics, than code that collects, moves, stores and displays
metrics", Adrian Cockcroft
Prometheus by example
Demo: app scenario scaling Gitblit
Demo: exporter / endpoint (Gitblit)
...
# TYPE jvm_memory_pool_bytes_max gauge
jvm_memory_pool_bytes_max{pool="Code Cache",} 2.5165824E8
jvm_memory_pool_bytes_max{pool="Metaspace",} -1.0
jvm_memory_pool_bytes_max{pool="Compressed Class Space",} 1.073741824E9
jvm_memory_pool_bytes_max{pool="PS Eden Space",} 1.320157184E9
jvm_memory_pool_bytes_max{pool="PS Survivor Space",} 3.670016E7
jvm_memory_pool_bytes_max{pool="PS Old Gen",} 2.793406464E9
# HELP log4j_appender_total Log4j log statements at various log levels
# TYPE log4j_appender_total counter
log4j_appender_total{level="debug",} 0.0
log4j_appender_total{level="warn",} 4.0
log4j_appender_total{level="trace",} 0.0
log4j_appender_total{level="error",} 1034.0
log4j_appender_total{level="fatal",} 0.0
log4j_appender_total{level="info",} 6049.0
...
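A minimal sketch of how a consumer might read the sample lines above, assuming only the simple `name{labels} value` form shown on this slide. `parse_sample` is a hypothetical helper, not part of any Prometheus library; it ignores timestamps and escaped label values.

```python
import re

# Matches lines such as: jvm_memory_pool_bytes_max{pool="Metaspace",} -1.0
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_sample(line):
    """Return (name, labels, value) for one sample line, or None otherwise."""
    line = line.strip()
    # "# HELP" and "# TYPE" lines are metadata, not samples.
    if not line or line.startswith('#'):
        return None
    match = SAMPLE_RE.match(line)
    if match is None:
        return None
    name, raw_labels, raw_value = match.groups()
    labels = dict(LABEL_RE.findall(raw_labels or ''))
    return name, labels, float(raw_value)


print(parse_sample('log4j_appender_total{level="error",} 1034.0'))
```

In practice the official client libraries ship their own parsers; the sketch only illustrates how little structure the endpoint output has.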
Demo: Prometheus out of the box functionality
● Scrape raw metrics
● Persist metrics
● Navigate data / PromQL
● Visualisation
Demo: Prometheus advanced vis + navigation
● Grafana dashboards
● Navigation with labels
Demo: monitoring as part of development
● Monitoring for verification of load tests
● Tests should trigger similar load to production
● DevOps is the best way to get high quality data
● Alertmanager as Assert.that
Demo: the admin part of Prometheus
● Prometheus time series database
● Integration to existing monitoring solutions
● How to scale Prometheus
● 11 integrations to container orchestrators (k8s, marathon, dns, ... )
Whitebox instrumentation in Java
How to do whitebox monitoring so far
● JSON / CSV / SQL view, ...
● JMX
● Libraries with push hooks (e.g. Datadog, ...)
Prometheus client instrumentation, example Gitblit
● Client instrumentation
● Default metrics for Log4j
● Default metrics for the JDK
● Custom metric for git garbage collection, LDAP sync
Prometheus client Metrics HTTP / Servlet
Configure the Gitblit servlet / Guice WebModule
bind(MetricsServlet.class).in(Scopes.SINGLETON);
serve("/Prometheus").with(MetricsServlet.class);
... that's it ...
Prometheus client Metrics JDK
Register default JDK Metrics
DefaultExports.initialize();
... that's it ...
Client metrics Log4j
Instrument logger / Log4j
log4j.rootCategory=INFO, S, METRICS
...
log4j.appender.METRICS = io.prometheus.client.log4j.InstrumentedAppender
log4j.appender.METRICS.Append = false
... that's it ...
Custom Metrics
import io.prometheus.client.Counter;
...
// Counts every git garbage collection that Gitblit triggers.
private final Counter garbageCollectsTotal = Counter.build()
    .name("GIT_GARBAGE_COLLECTS_TOTAL")
    .help("Number of git garbage collects issued by Gitblit for a repository")
    .register();
...
garbageCollectsTotal.inc();
... that's it ...
What did we see?
Whitebox monitoring won't work without Developers!
Analyze Prometheus Metrics Like a Data Scientist
... should I?
Don't use deep learning and data science when a straightforward
15-minute rule-based system does well.
Data science can help you detect patterns and facts in your
metrics that you can't see otherwise.
What is already available
● Great architecture to get high quality data
● Numerical data
● Apply mathematical functions on it
● Easy and fast to navigate (PromQL)
● Alert / rule model
● Chart / histogram vis with Grafana
When do I start?
Start with alerts / dashboards that already work and that you want to improve
Two ways to get data out of Prometheus
● HTTP API (poll): exploratory data analysis
● Remote API (push): streaming analysis
HTTP API - /api/v1/query_range
import requests

# Range query: one summed series per metric name and instance, one sample per minute.
response = requests.get(
    url = 'http://127.0.0.1:9090/api/v1/query_range',
    params = {
        'query': 'sum({__name__=~".+"}) by (__name__,instance)',
        'start': '1502809554',
        'end' : '1502839554',
        'step' : '1m'
    })
Response (response.json()):
{"data": {..., "resultType": "matrix",
 "result": [{
   "metric": {"method": "GET",...},
   "values": [[1500008340,"3"], ... ]},...]
}}
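For exploratory analysis the nested "matrix" response usually has to be flattened first. The following is a sketch under the assumption that the payload has exactly the shape shown above; `flatten_matrix` and the `example` payload are hypothetical and only illustrate the idea.

```python
def flatten_matrix(payload):
    """Turn a query_range 'matrix' payload into (labels, timestamp, value) rows."""
    rows = []
    for series in payload["data"]["result"]:
        labels = series["metric"]
        for timestamp, raw_value in series["values"]:
            # Sample values arrive as strings; convert them to floats.
            rows.append((labels, float(timestamp), float(raw_value)))
    return rows


# Hypothetical payload with the same shape as the response on the slide.
example = {
    "status": "success",
    "data": {
        "resultType": "matrix",
        "result": [
            {"metric": {"method": "GET"},
             "values": [[1500008340, "3"], [1500008400, "5"]]},
        ],
    },
}

rows = flatten_matrix(example)
print(rows)
```

The flat rows can then be loaded into whatever analysis tool is at hand, for example a data frame or a CSV file.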
Normalize Prometheus datatypes
● Gauges, histograms are ok
● Counters have to be processed
● Counter values only grow and never repeat, so the raw values have no statistical value
● Use e.g. a derivative function (in PromQL, e.g. rate()) to convert a counter to a gauge equivalent
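As an illustration of the counter normalization, here is a minimal sketch that turns raw counter samples into a per-second rate outside of Prometheus. `counter_to_rate` is a hypothetical helper using plain pairwise differences; it does not reproduce the extrapolation that PromQL's `rate()` performs.

```python
def counter_to_rate(samples):
    """Convert [(timestamp, counter_value), ...] into [(timestamp, per_second_rate), ...]."""
    rates = []
    for (t_prev, v_prev), (t_cur, v_cur) in zip(samples, samples[1:]):
        elapsed = t_cur - t_prev
        if elapsed <= 0:
            continue  # skip duplicate or out-of-order timestamps
        # A drop means the counter was reset (e.g. process restart),
        # so the increase is counted from zero.
        increase = v_cur - v_prev if v_cur >= v_prev else v_cur
        rates.append((t_cur, increase / elapsed))
    return rates


samples = [(0, 100.0), (60, 160.0), (120, 10.0), (180, 40.0)]
rates = counter_to_rate(samples)
print(rates)
```

The resulting series behaves like a gauge: it can go up and down, so the usual statistics (mean, variance, quantiles) become meaningful.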
Example 1
I can predict the latency of HTTP requests
● Can I use the Prometheus function predict_linear?
● Are other predictions possible?
↡↡ R Notebook predict_linear ↡↡
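The idea behind a linear prediction can be sketched in a few lines of Python: fit a straight line through the samples by least squares and extrapolate it. `predict_linear_sketch` and the sample latencies are hypothetical; predicting relative to the last sample's timestamp is an assumption of this sketch and may differ from how Prometheus anchors its own `predict_linear`.

```python
def predict_linear_sketch(samples, seconds_ahead):
    """Fit value = intercept + slope * t by least squares and extrapolate."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("timestamps must not all be equal")
    cov_tv = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    slope = cov_tv / var_t
    intercept = mean_v - slope * mean_t
    # Assumption: the prediction horizon starts at the last sample.
    target = samples[-1][0] + seconds_ahead
    return intercept + slope * target


# Hypothetical latencies in seconds, sampled once per minute.
latencies = [(0, 0.10), (60, 0.12), (120, 0.14), (180, 0.16)]
prediction = predict_linear_sketch(latencies, 600)
print(prediction)
```

A straight line is a strong assumption for latency data, which is why the question of other prediction models is worth asking.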
Histograms: monitoring for the long tail
histogram_quantile(0.99,
sum(
rate(
http_request_duration_seconds_bucket{method="GET"}[1m]
)
) by (le))
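To see what a quantile over histogram buckets means, the estimate can be reproduced by hand: find the bucket that contains the requested rank and interpolate linearly inside it. The sketch below is a simplified illustration of that idea, not Prometheus's implementation; `estimate_quantile` and the bucket counts are hypothetical, and the lowest bucket is assumed to start at 0.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    `buckets` must be sorted by upper bound and end with (math.inf, total).
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # The quantile lies in the +Inf bucket: report the highest finite bound.
                return lower_bound
            in_bucket = count - lower_count
            if in_bucket == 0:
                return lower_bound
            # Linear interpolation inside the bucket that contains the rank.
            fraction = (rank - lower_count) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Hypothetical cumulative counts per "le" bound (seconds).
buckets = [(0.1, 50), (0.5, 90), (1.0, 99), (math.inf, 100)]
p99 = estimate_quantile(0.99, buckets)
print(p99)
```

The interpolation shows why bucket boundaries matter: the estimate can never be more precise than the width of the bucket the quantile falls into.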
Outlier detection algorithms
https://github.com/twitter/AnomalyDetection
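The linked package is written in R. As a much simpler stand-in, and explicitly not the algorithm that package implements, the sketch below flags outliers with a modified z-score based on the median and the median absolute deviation (MAD). `find_outliers`, the threshold of 3.5 and the sample latencies are assumptions made for illustration.

```python
from statistics import median


def find_outliers(values, threshold=3.5):
    """Return indices whose modified z-score (median/MAD based) exceeds threshold."""
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return []  # no spread to measure against
    return [i for i, v in enumerate(values)
            if abs(0.6745 * (v - center) / mad) > threshold]


# Hypothetical request latencies in seconds with one obvious spike.
latencies = [0.11, 0.12, 0.10, 0.13, 0.11, 0.95, 0.12, 0.10]
outliers = find_outliers(latencies)
print(outliers)
```

Median-based scores are robust against the very spikes they are meant to find, which a mean and standard deviation are not; seasonal metrics need the more elaborate methods.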
Demo export from Grafana
● Demo API
● Export into CSV
Thanks for having me at the Infracoders Meetup 2017!
Questions?
Georg Öttl
Twitter: @goettl
