SlideShare una empresa de Scribd logo
1 de 16
Descargar para leer sin conexión
Monitoring Infrastructure with
Prometheus
@
- Mohd Sahnawaz, Sr. SRE
Singapore Kubernetes User Group, 4th July 2018
Currently…
➔ 600+ servers
➔ External load balancers see avg 5k+ requests per second
➔ Internal Amplification of 8x to 12x
➔ Self managed deployments:
◆ ElasticSearch (Dynamic Scaling)
◆ PostgresQL
◆ Cassandra
◆ Kafka
◆ Redis
◆ RabbitMQ
◆ And more…
➔ Uptime of 99.95
➔ Ability to handle AZ failures
Architecture
monitoring
Dashboard
Grafana
Data Store
Prometheus
Metrics Source
● Exporter
○ Node
○ Postgres
○ JVM
○ ElasticSearch
○ HAProxy
○ Hystrix
○ StatsD
○ …
● Write your own
Dashboard : Grafana
Dashboard : Grafana (cont…)
K8S Deployment view →
Hystrix Metric view →
Data source: Prometheus
➔ GCE SD configurations to populate hosts
➔ Instance metadata to cluster nodes.
Data source: Prometheus (cont… )
➔ Multiple Instances with different retention
➔ Separate dedicated instances for APM, Node Metrics, ICMP, Kubernetes
➔ Grafana connects to all of these
➔ GCE SD configurations to populate hosts
metrics source: GCE with exporters
scrape_configs:
- job_name: node
scrape_interval: 15s
scrape_timeout: 15s
gce_sd_configs:
- project: <project_name>
zone: <zone_name>
port: 9100
filter: "(name ne .*stage.*)(name ne .*test.*)"
relabel_configs:
- source_labels: [__meta_gce_instance_name]
target_label: host
- source_labels: [__meta_gce_zone]
separator: '/'
regex: '(.*)/(.*)'
replacement: '${2}'
target_label: zone
- source_labels: [__meta_gce_metadata_cluster, __meta_gce_metadata_cluster_name]
separator: ';'
regex: '(.*);(.*)'
replacement: '${1}${2}'
target_label: cluster
- source_labels: [__meta_gce_metadata_env]
target_label: env
- source_labels: [__meta_gce_metadata_component]
target_label: component
Target Labels
➔ K8S SD configurations to populate pods
metrics source: K8S API
- job_name: 'pod-metrics'
scrape_interval: 15s
scrape_timeout: 5s
kubernetes_sd_configs:
- api_server: '<api-path>'
role: node
basic_auth:
username: <user>
password: <access_token>
tls_config:
insecure_skip_verify: true
- api_server: '<api-path>'
role: pod
basic_auth:
username: <user>
password: <access_token>
tls_config:
insecure_skip_verify: true
scheme: http
relabel_configs:
- source_labels: [__meta_kubernetes_pod_name]
action: replace
target_label: kubernetes_pod_name
- source_labels: [__meta_kubernetes_pod_name]
action: replace
target_label: host
- source_labels: [__address__]
action: keep
- source_labels: [__address__]
action: replace
target_label: address
- source_labels: [__meta_kubernetes_namespace]
action: replace
target_label: kubernetes_namespace
- source_labels: [__meta_kubernetes_pod_container_name]
action: replace
target_label: kubernetes_container
- source_labels: [__meta_kubernetes_pod_container_name]
action: replace
target_label: cluster
- source_labels: [__meta_kubernetes_pod_container_port_number]
regex: 9284
action: keep
- source_labels: [__meta_kubernetes_pod_node_name]
action: replace
target_label: kubernetes_node_name
Target Labels
➔ Hystrix is great for real time monitoring.
➔ Helps in quickly identifying failures.
➔ We capture hystrix data to prometheus.
➔ Help in debugging/retrospectives
Metrics source: Hystrix
app statsd-exporter prometheus
turbine
➔ Custom metrics with Hystrix
Metrics source: Hystrix
scrape_configs:
- job_name: 'hystrix-stats'
scrape_interval: '15s'
file_sd_configs:
- files:
- 'hystrix-stats*.yml'
refresh_interval: 5m
metric_relabel_configs:
- source_labels: [__name__]
regex: '^(.*)_hystrix.*'
target_label: hcluster
replacement: '${1}'
- source_labels: [__name__]
regex: '^.*_(hystrix.*)'
target_label: __name__
replacement: '${1}'
- source_labels: [__name__]
regex: '^hystrix_([^_]+)_.*'
target_label: command
replacement: '${1}'
- source_labels: [__name__]
regex: '^hystrix_[^_]+_(.*)'
target_label: __name__
replacement: 'hystrix_${1}'
➔ Official client https://github.com/prometheus/client_golang
➔ Support 4 metric types (counter, gauge, histogram, summary)
➔ Built-in support Common gRPC metrics
➔ Exposed in http://ip:port/metrics
Metrics source: Custom
Alerting : Alertmanager
Prometheus alertmanager
slack
victorops
Alerting process →
Alert rule →
Thank you!
Q & A
We’re hiring, visit careers.carousell.com

Más contenido relacionado

La actualidad más candente

Securing Prometheus exporters using HashiCorp Vault
Securing Prometheus exporters using HashiCorp VaultSecuring Prometheus exporters using HashiCorp Vault
Securing Prometheus exporters using HashiCorp VaultBram Vogelaar
 
ReCertifying Active Directory
ReCertifying Active DirectoryReCertifying Active Directory
ReCertifying Active DirectoryWill Schroeder
 
Monitoring with Prometheus
Monitoring with PrometheusMonitoring with Prometheus
Monitoring with PrometheusShiao-An Yuan
 
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...Altinity Ltd
 
Getting Started Monitoring with Prometheus and Grafana
Getting Started Monitoring with Prometheus and GrafanaGetting Started Monitoring with Prometheus and Grafana
Getting Started Monitoring with Prometheus and GrafanaSyah Dwi Prihatmoko
 
How to monitor your micro-service with Prometheus?
How to monitor your micro-service with Prometheus?How to monitor your micro-service with Prometheus?
How to monitor your micro-service with Prometheus?Wojciech Barczyński
 
Terraform modules restructured
Terraform modules restructuredTerraform modules restructured
Terraform modules restructuredAmi Mahloof
 
Monitoring, Logging and Tracing on Kubernetes
Monitoring, Logging and Tracing on KubernetesMonitoring, Logging and Tracing on Kubernetes
Monitoring, Logging and Tracing on KubernetesMartin Etmajer
 
Prometheus design and philosophy
Prometheus design and philosophy   Prometheus design and philosophy
Prometheus design and philosophy Docker, Inc.
 
PromQL Deep Dive - The Prometheus Query Language
PromQL Deep Dive - The Prometheus Query Language PromQL Deep Dive - The Prometheus Query Language
PromQL Deep Dive - The Prometheus Query Language Weaveworks
 
How Prometheus Store the Data
How Prometheus Store the DataHow Prometheus Store the Data
How Prometheus Store the DataHao Chen
 
Monitoring With Prometheus
Monitoring With PrometheusMonitoring With Prometheus
Monitoring With PrometheusKnoldus Inc.
 
OSMC 2022 | VictoriaMetrics: scaling to 100 million metrics per second by Ali...
OSMC 2022 | VictoriaMetrics: scaling to 100 million metrics per second by Ali...OSMC 2022 | VictoriaMetrics: scaling to 100 million metrics per second by Ali...
OSMC 2022 | VictoriaMetrics: scaling to 100 million metrics per second by Ali...NETWAYS
 
Everything You Need To Know About Persistent Storage in Kubernetes
Everything You Need To Know About Persistent Storage in KubernetesEverything You Need To Know About Persistent Storage in Kubernetes
Everything You Need To Know About Persistent Storage in KubernetesThe {code} Team
 
PostgreSQL High-Availability and Geographic Locality using consul
PostgreSQL High-Availability and Geographic Locality using consulPostgreSQL High-Availability and Geographic Locality using consul
PostgreSQL High-Availability and Geographic Locality using consulSean Chittenden
 
An Introduction to Prometheus (GrafanaCon 2016)
An Introduction to Prometheus (GrafanaCon 2016)An Introduction to Prometheus (GrafanaCon 2016)
An Introduction to Prometheus (GrafanaCon 2016)Brian Brazil
 
Kubernetes - Security Journey
Kubernetes - Security JourneyKubernetes - Security Journey
Kubernetes - Security JourneyJerry Jalava
 
Monitoring with prometheus
Monitoring with prometheusMonitoring with prometheus
Monitoring with prometheusKasper Nissen
 
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!confluent
 

La actualidad más candente (20)

Securing Prometheus exporters using HashiCorp Vault
Securing Prometheus exporters using HashiCorp VaultSecuring Prometheus exporters using HashiCorp Vault
Securing Prometheus exporters using HashiCorp Vault
 
ReCertifying Active Directory
ReCertifying Active DirectoryReCertifying Active Directory
ReCertifying Active Directory
 
Monitoring with Prometheus
Monitoring with PrometheusMonitoring with Prometheus
Monitoring with Prometheus
 
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
 
Getting Started Monitoring with Prometheus and Grafana
Getting Started Monitoring with Prometheus and GrafanaGetting Started Monitoring with Prometheus and Grafana
Getting Started Monitoring with Prometheus and Grafana
 
How to monitor your micro-service with Prometheus?
How to monitor your micro-service with Prometheus?How to monitor your micro-service with Prometheus?
How to monitor your micro-service with Prometheus?
 
Prometheus and Grafana
Prometheus and GrafanaPrometheus and Grafana
Prometheus and Grafana
 
Terraform modules restructured
Terraform modules restructuredTerraform modules restructured
Terraform modules restructured
 
Monitoring, Logging and Tracing on Kubernetes
Monitoring, Logging and Tracing on KubernetesMonitoring, Logging and Tracing on Kubernetes
Monitoring, Logging and Tracing on Kubernetes
 
Prometheus design and philosophy
Prometheus design and philosophy   Prometheus design and philosophy
Prometheus design and philosophy
 
PromQL Deep Dive - The Prometheus Query Language
PromQL Deep Dive - The Prometheus Query Language PromQL Deep Dive - The Prometheus Query Language
PromQL Deep Dive - The Prometheus Query Language
 
How Prometheus Store the Data
How Prometheus Store the DataHow Prometheus Store the Data
How Prometheus Store the Data
 
Monitoring With Prometheus
Monitoring With PrometheusMonitoring With Prometheus
Monitoring With Prometheus
 
OSMC 2022 | VictoriaMetrics: scaling to 100 million metrics per second by Ali...
OSMC 2022 | VictoriaMetrics: scaling to 100 million metrics per second by Ali...OSMC 2022 | VictoriaMetrics: scaling to 100 million metrics per second by Ali...
OSMC 2022 | VictoriaMetrics: scaling to 100 million metrics per second by Ali...
 
Everything You Need To Know About Persistent Storage in Kubernetes
Everything You Need To Know About Persistent Storage in KubernetesEverything You Need To Know About Persistent Storage in Kubernetes
Everything You Need To Know About Persistent Storage in Kubernetes
 
PostgreSQL High-Availability and Geographic Locality using consul
PostgreSQL High-Availability and Geographic Locality using consulPostgreSQL High-Availability and Geographic Locality using consul
PostgreSQL High-Availability and Geographic Locality using consul
 
An Introduction to Prometheus (GrafanaCon 2016)
An Introduction to Prometheus (GrafanaCon 2016)An Introduction to Prometheus (GrafanaCon 2016)
An Introduction to Prometheus (GrafanaCon 2016)
 
Kubernetes - Security Journey
Kubernetes - Security JourneyKubernetes - Security Journey
Kubernetes - Security Journey
 
Monitoring with prometheus
Monitoring with prometheusMonitoring with prometheus
Monitoring with prometheus
 
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!
 

Similar a Monitoring infrastructure with prometheus

UEMB200: Next Generation of Endpoint Management Architecture and Discovery Se...
UEMB200: Next Generation of Endpoint Management Architecture and Discovery Se...UEMB200: Next Generation of Endpoint Management Architecture and Discovery Se...
UEMB200: Next Generation of Endpoint Management Architecture and Discovery Se...Ivanti
 
Orchestration Tool Roundup - Arthur Berezin & Trammell Scruggs
Orchestration Tool Roundup - Arthur Berezin & Trammell ScruggsOrchestration Tool Roundup - Arthur Berezin & Trammell Scruggs
Orchestration Tool Roundup - Arthur Berezin & Trammell ScruggsCloud Native Day Tel Aviv
 
Facebook的缓存系统
Facebook的缓存系统Facebook的缓存系统
Facebook的缓存系统yiditushe
 
DevOpsDaysRiga 2018: Andrew Martin - Continuous Kubernetes Security
DevOpsDaysRiga 2018: Andrew Martin - Continuous Kubernetes Security DevOpsDaysRiga 2018: Andrew Martin - Continuous Kubernetes Security
DevOpsDaysRiga 2018: Andrew Martin - Continuous Kubernetes Security DevOpsDays Riga
 
4069180 Caching Performance Lessons From Facebook
4069180 Caching Performance Lessons From Facebook4069180 Caching Performance Lessons From Facebook
4069180 Caching Performance Lessons From Facebookguoqing75
 
Deploying windows containers with kubernetes
Deploying windows containers with kubernetesDeploying windows containers with kubernetes
Deploying windows containers with kubernetesBen Hall
 
Solving anything in VCL
Solving anything in VCLSolving anything in VCL
Solving anything in VCLFastly
 
KubeCon EU 2016: Templatized Application Configuration on OpenShift and Kuber...
KubeCon EU 2016: Templatized Application Configuration on OpenShift and Kuber...KubeCon EU 2016: Templatized Application Configuration on OpenShift and Kuber...
KubeCon EU 2016: Templatized Application Configuration on OpenShift and Kuber...KubeAcademy
 
Time series denver an introduction to prometheus
Time series denver   an introduction to prometheusTime series denver   an introduction to prometheus
Time series denver an introduction to prometheusBob Cotton
 
High Availability Content Caching with NGINX
High Availability Content Caching with NGINXHigh Availability Content Caching with NGINX
High Availability Content Caching with NGINXKevin Jones
 
Extending kubernetes
Extending kubernetesExtending kubernetes
Extending kubernetesGigi Sayfan
 
KubeCon Prometheus Salon -- Kubernetes metrics deep dive
KubeCon Prometheus Salon -- Kubernetes metrics deep diveKubeCon Prometheus Salon -- Kubernetes metrics deep dive
KubeCon Prometheus Salon -- Kubernetes metrics deep diveBob Cotton
 
使用 Prometheus 監控 Kubernetes Cluster
使用 Prometheus 監控 Kubernetes Cluster 使用 Prometheus 監控 Kubernetes Cluster
使用 Prometheus 監控 Kubernetes Cluster inwin stack
 
OSMC 2011 | Case Study - Icinga at Hyves.nl by Jeffrey Lensen
OSMC 2011 | Case Study - Icinga at Hyves.nl by Jeffrey LensenOSMC 2011 | Case Study - Icinga at Hyves.nl by Jeffrey Lensen
OSMC 2011 | Case Study - Icinga at Hyves.nl by Jeffrey LensenNETWAYS
 
Achieving compliance With MongoDB Security
Achieving compliance With MongoDB Security Achieving compliance With MongoDB Security
Achieving compliance With MongoDB Security Mydbops
 
Spring Cloud: API gateway upgrade & configuration in the cloud
Spring Cloud: API gateway upgrade & configuration in the cloudSpring Cloud: API gateway upgrade & configuration in the cloud
Spring Cloud: API gateway upgrade & configuration in the cloudOrkhan Gasimov
 
StackMate - CloudFormation for CloudStack
StackMate - CloudFormation for CloudStackStackMate - CloudFormation for CloudStack
StackMate - CloudFormation for CloudStackChiradeep Vittal
 
Introduction to kubernetes
Introduction to kubernetesIntroduction to kubernetes
Introduction to kubernetesRishabh Indoria
 
Deploying on Kubernetes - An intro
Deploying on Kubernetes - An introDeploying on Kubernetes - An intro
Deploying on Kubernetes - An introAndré Cruz
 
Using Apache as an Application Server
Using Apache as an Application ServerUsing Apache as an Application Server
Using Apache as an Application ServerPhil Windley
 

Similar a Monitoring infrastructure with prometheus (20)

UEMB200: Next Generation of Endpoint Management Architecture and Discovery Se...
UEMB200: Next Generation of Endpoint Management Architecture and Discovery Se...UEMB200: Next Generation of Endpoint Management Architecture and Discovery Se...
UEMB200: Next Generation of Endpoint Management Architecture and Discovery Se...
 
Orchestration Tool Roundup - Arthur Berezin & Trammell Scruggs
Orchestration Tool Roundup - Arthur Berezin & Trammell ScruggsOrchestration Tool Roundup - Arthur Berezin & Trammell Scruggs
Orchestration Tool Roundup - Arthur Berezin & Trammell Scruggs
 
Facebook的缓存系统
Facebook的缓存系统Facebook的缓存系统
Facebook的缓存系统
 
DevOpsDaysRiga 2018: Andrew Martin - Continuous Kubernetes Security
DevOpsDaysRiga 2018: Andrew Martin - Continuous Kubernetes Security DevOpsDaysRiga 2018: Andrew Martin - Continuous Kubernetes Security
DevOpsDaysRiga 2018: Andrew Martin - Continuous Kubernetes Security
 
4069180 Caching Performance Lessons From Facebook
4069180 Caching Performance Lessons From Facebook4069180 Caching Performance Lessons From Facebook
4069180 Caching Performance Lessons From Facebook
 
Deploying windows containers with kubernetes
Deploying windows containers with kubernetesDeploying windows containers with kubernetes
Deploying windows containers with kubernetes
 
Solving anything in VCL
Solving anything in VCLSolving anything in VCL
Solving anything in VCL
 
KubeCon EU 2016: Templatized Application Configuration on OpenShift and Kuber...
KubeCon EU 2016: Templatized Application Configuration on OpenShift and Kuber...KubeCon EU 2016: Templatized Application Configuration on OpenShift and Kuber...
KubeCon EU 2016: Templatized Application Configuration on OpenShift and Kuber...
 
Time series denver an introduction to prometheus
Time series denver   an introduction to prometheusTime series denver   an introduction to prometheus
Time series denver an introduction to prometheus
 
High Availability Content Caching with NGINX
High Availability Content Caching with NGINXHigh Availability Content Caching with NGINX
High Availability Content Caching with NGINX
 
Extending kubernetes
Extending kubernetesExtending kubernetes
Extending kubernetes
 
KubeCon Prometheus Salon -- Kubernetes metrics deep dive
KubeCon Prometheus Salon -- Kubernetes metrics deep diveKubeCon Prometheus Salon -- Kubernetes metrics deep dive
KubeCon Prometheus Salon -- Kubernetes metrics deep dive
 
使用 Prometheus 監控 Kubernetes Cluster
使用 Prometheus 監控 Kubernetes Cluster 使用 Prometheus 監控 Kubernetes Cluster
使用 Prometheus 監控 Kubernetes Cluster
 
OSMC 2011 | Case Study - Icinga at Hyves.nl by Jeffrey Lensen
OSMC 2011 | Case Study - Icinga at Hyves.nl by Jeffrey LensenOSMC 2011 | Case Study - Icinga at Hyves.nl by Jeffrey Lensen
OSMC 2011 | Case Study - Icinga at Hyves.nl by Jeffrey Lensen
 
Achieving compliance With MongoDB Security
Achieving compliance With MongoDB Security Achieving compliance With MongoDB Security
Achieving compliance With MongoDB Security
 
Spring Cloud: API gateway upgrade & configuration in the cloud
Spring Cloud: API gateway upgrade & configuration in the cloudSpring Cloud: API gateway upgrade & configuration in the cloud
Spring Cloud: API gateway upgrade & configuration in the cloud
 
StackMate - CloudFormation for CloudStack
StackMate - CloudFormation for CloudStackStackMate - CloudFormation for CloudStack
StackMate - CloudFormation for CloudStack
 
Introduction to kubernetes
Introduction to kubernetesIntroduction to kubernetes
Introduction to kubernetes
 
Deploying on Kubernetes - An intro
Deploying on Kubernetes - An introDeploying on Kubernetes - An intro
Deploying on Kubernetes - An intro
 
Using Apache as an Application Server
Using Apache as an Application ServerUsing Apache as an Application Server
Using Apache as an Application Server
 

Último

Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)simmis5
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...Call Girls in Nagpur High Profile
 
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Bookingroncy bisnoi
 
Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01KreezheaRecto
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performancesivaprakash250
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapRishantSharmaFr
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...roncy bisnoi
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Bookingroncy bisnoi
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfJiananWang21
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxfenichawla
 
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...tanu pandey
 
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...SUHANI PANDEY
 
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdfankushspencer015
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptMsecMca
 

Último (20)

Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
 
Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01
 
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
 
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
 
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdf
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
 
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...
 
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
 
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.ppt
 

Monitoring infrastructure with prometheus

  • 1. Monitoring Infrastructure with Prometheus @ - Mohd Sahnawaz, Sr. SRE Singapore Kubernetes User Group, 4th July 2018
  • 2. Currently… ➔ 600+ servers ➔ External load balancers see avg 5k+ requests per second ➔ Internal Amplification of 8x to 12x ➔ Self managed deployments: ◆ ElasticSearch (Dynamic Scaling) ◆ PostgresQL ◆ Cassandra ◆ Kafka ◆ Redis ◆ RabbitMQ ◆ And more… ➔ Uptime of 99.95 ➔ Ability to handle AZ failures
  • 4.
  • 5. monitoring Dashboard Grafana Data Store Prometheus Metrics Source ● Exporter ○ Node ○ Postgres ○ JVM ○ ElasticSearch ○ HAProxy ○ Hystrix ○ StatsD ○ … ● Write your own
  • 7. Dashboard : Grafana (cont…) K8S Deployment view → Hystrix Metric view →
  • 9. ➔ GCE SD configurations to populate hosts ➔ Instance metadata to cluster nodes. Data source: Prometheus (cont… ) ➔ Multiple Instances with different retention ➔ Separate dedicated instances for APM, Node Metrics, ICMP, Kubernetes ➔ Grafana connects to all of these
  • 10. ➔ GCE SD configurations to populate hosts metrics source: GCE with exporters scrape_configs: - job_name: node scrape_interval: 15s scrape_timeout: 15s gce_sd_configs: - project: <project_name> zone: <zone_name> port: 9100 filter: "(name ne .*stage.*)(name ne .*test.*)" relabel_configs: - source_labels: [__meta_gce_instance_name] target_label: host - source_labels: [__meta_gce_zone] separator: '/' regex: '(.*)/(.*)' replacement: '${2}' target_label: zone - source_labels: [__meta_gce_metadata_cluster, __meta_gce_metadata_cluster_name] separator: ';' regex: '(.*);(.*)' replacement: '${1}${2}' target_label: cluster - source_labels: [__meta_gce_metadata_env] target_label: env - source_labels: [__meta_gce_metadata_component] target_label: component Target Labels
  • 11. ➔ K8S SD configurations to populate pods metrics source: K8S API - job_name: 'pod-metrics' scrape_interval: 15s scrape_timeout: 5s kubernetes_sd_configs: - api_server: '<api-path>' role: node basic_auth: username: <user> password: <access_token> tls_config: insecure_skip_verify: true - api_server: '<api-path>' role: pod basic_auth: username: <user> password: <access_token> tls_config: insecure_skip_verify: true scheme: http relabel_configs: - source_labels: [__meta_kubernetes_pod_name] action: replace target_label: kubernetes_pod_name - source_labels: [__meta_kubernetes_pod_name] action: replace target_label: host - source_labels: [__address__] action: keep - source_labels: [__address__] action: replace target_label: address - source_labels: [__meta_kubernetes_namespace] action: replace target_label: kubernetes_namespace - source_labels: [__meta_kubernetes_pod_container_name] action: replace target_label: kubernetes_container - source_labels: [__meta_kubernetes_pod_container_name] action: replace target_label: cluster - source_labels: [__meta_kubernetes_pod_container_port_number] regex: 9284 action: keep - source_labels: [__meta_kubernetes_pod_node_name] action: replace target_label: kubernetes_node_name Target Labels
  • 12. ➔ Hystrix is great for real time monitoring. ➔ Helps in quickly identifying failures. ➔ We capture hystrix data to prometheus. ➔ Help in debugging/retrospectives Metrics source: Hystrix app statsd-exporter prometheus turbine
  • 13. ➔ Custom metrics with Hystrix Metrics source: Hystrix scrape_configs: - job_name: 'hystrix-stats' scrape_interval: '15s' file_sd_configs: - files: - 'hystrix-stats*.yml' refresh_interval: 5m metric_relabel_configs: - source_labels: [__name__] regex: '^(.*)_hystrix.*' target_label: hcluster replacement: '${1}' - source_labels: [__name__] regex: '^.*_(hystrix.*)' target_label: __name__ replacement: '${1}' - source_labels: [__name__] regex: '^hystrix_([^_]+)_.*' target_label: command replacement: '${1}' - source_labels: [__name__] regex: '^hystrix_[^_]+_(.*)' target_label: __name__ replacement: 'hystrix_${1}'
  • 14. ➔ Official client https://github.com/prometheus/client_golang ➔ Support 4 metric types (counter, gauge, histogram, summary) ➔ Built-in support Common gRPC metrics ➔ Exposed in http://ip:port/metrics Metrics source: Custom
  • 15. Alerting : Alertmanager Prometheus alertmanager slack victorops Alerting process → Alert rule →
  • 16. Thank you! Q & A We’re hiring, visit careers.carousell.com