[Open Source Consulting] Exploring and Setting Up Prometheus Monitoring

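Our company previously used Zabbix for monitoring.
As we moved toward container monitoring, a change became necessary, which naturally led us to consider monitoring with Prometheus.
Youngju Lee gave a tech session on the topic, and we are posting the presentation slides here.
The deck has five parts and includes setup instructions.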
01. Prometheus?
02. Usage
03. Alertmanager
04. Cluster
05. Performance

Published in: Software

  1. Open Source Consulting: Korea's leading open source specialist. Private/Public Cloud | Data Center to Cloud | Atlassian. H. www.osci.kr T. 02-516-0711 F. 02-516-0722. 5F, Nara Kium Samseong-dong A Building, 32 Teheran-ro 83-gil, Gangnam-gu, Seoul. Copyright 2019 Open Source Consulting Inc. All rights reserved.
  2. Youngju Lee. Prometheus. 2019.04.17
  3. Contents: 01. Prometheus? 02. Usage 03. Alertmanager 04. Cluster 05. Performance
  4. 01. Prometheus?
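  5. 01. Prometheus? • Prometheus?
     ⚫ Started in 2012 at SoundCloud by a handful of developers.
     ⚫ Became the second project to join the CNCF (Cloud Native Computing Foundation) in 2016.
     ⚫ Supports fast querying through its own query language, PromQL.
     ⚫ Rose to prominence as it became widely used for monitoring Kubernetes.
     ⚫ Designed to ingest millions of samples per second.
     ⚫ Performs considerably better than earlier monitoring systems.
     ⚫ Can monitor almost any platform, including OpenStack, AWS, Azure, and GCE.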
  6. 01. Prometheus? • Prometheus?
  7. 01. Prometheus? • Monitoring?
     • Alerting
       ⚫ Notifying a person when something goes wrong.
       ⚫ E-mail, Slack, ...
     • Debugging
       ⚫ Identifying the cause of a problem.
     • Trending
       ⚫ Forecasting usage and feeding it into planning.
  8. 01. Prometheus? • Categories of Monitoring
     • Profiling
       ⚫ tcpdump ...
     • Tracing
       ⚫ OpenZipkin, Jaeger ...
     • Logging
       ⚫ Elasticsearch, Graylog
     • Metric
       ⚫ Prometheus, Zabbix
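  9. 01. Prometheus? • Prometheus Architecture
     Annotations on the architecture diagram:
     ⚫ Service discovery: finds targets and registers them automatically.
     ⚫ Pushgateway: an indirect implementation of the push model (applications push their metrics here).
     ⚫ Exporters: convert metrics into a format Prometheus can understand.
     ⚫ Alertmanager: sends notifications to people.
     ⚫ Dashboard: draws the graphs.
     ⚫ Prometheus Federation: a Prometheus server can also pull metrics from other Prometheus servers.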
  10. 02. Usage
  11. 02. Usage • Running Prometheus
      Binary download:
          [root@yj26-ovstest3 prometheus]# wget https://github.com/prometheus/prometheus/releases/download/v2.9.1/prometheus-2.9.1.linux-amd64.tar.gz
          --2019-04-18 13:22:50-- https://github.com/prometheus/prometheus/releases/download/v2.9.1/prometheus-2.9.1...
          ...
      Decompression:
          [root@yj26-ovstest3 prometheus]# tar xvzf prometheus-2.9.1.linux-amd64.tar.gz
          prometheus-2.9.1.linux-amd64/
          prometheus-2.9.1.linux-amd64/consoles/
          ...
      Default configuration, in which Prometheus scrapes itself (self monitoring):
          [root@yj26-ovstest3 prometheus]# cd prometheus-2.9.1.linux-amd64/
          [root@yj26-ovstest3 prometheus-2.9.1.linux-amd64]# grep -Eiv '^$|^#|^[[:space:]]*#' prometheus.yml
          global:
            scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
            evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
          alerting:
            alertmanagers:
            - static_configs:
              - targets:
          rule_files:
          scrape_configs:
            - job_name: 'prometheus'
              static_configs:
              - targets: ['localhost:9090']
          [root@yj26-ovstest3 prometheus-2.9.1.linux-amd64]#
  12. 02. Usage • Running Prometheus
      Annotations: Prometheus start! Listen address!
  13. 02. Usage • Running Node-exporter
          [root@yj26-ovstest3 temp]# wget https://github.com/prometheus/node_exporter/releases/download/v0.17.0/node_exporter-0.17.0.linux-amd64.tar.gz
          [root@yj26-ovstest3 temp]# tar xvzf node_exporter-0.17.0.linux-amd64.tar.gz
          [root@yj26-ovstest3 temp]# cd node_exporter-0.17.0.linux-amd64/
          [root@yj26-ovstest3 node_exporter-0.17.0.linux-amd64]# ./node_exporter
          ...
          INFO[0000] - uname source="node_exporter.go:97"
          INFO[0000] - vmstat source="node_exporter.go:97"
          INFO[0000] - xfs source="node_exporter.go:97"
          INFO[0000] - zfs source="node_exporter.go:97"
          INFO[0000] Listening on :9100 source="node_exporter.go:111"
      Annotation: the default port is 9100.
  14. 02. Usage • Running Node-exporter
      Add the node exporter as a scrape target:
          [root@yj26-ovstest3 prometheus-2.9.1.linux-amd64]# grep -Eiv '^$|^#|[[:space:]]*#' prometheus.yml
          global:
          alerting:
            alertmanagers:
            - static_configs:
              - targets:
          rule_files:
          scrape_configs:
            - job_name: 'prometheus'
              static_configs:
              - targets: ['localhost:9090']
            - job_name: 'node3'
              static_configs:
              - targets: ['localhost:9100']
      Send signal 1 (SIGHUP) to reload the configuration:
          [root@yj26-ovstest3 prometheus-2.9.1.linux-amd64]# ps aux |grep -i prometheus
          root 12286 0.0 2.2 157280 41652 pts/1 Sl+ 13:41 0:04 ./prometheus
          root 12520 0.0 0.0 116812 1032 pts/3 S+ 15:25 0:00 grep --color=auto -i prometheus
          [root@yj26-ovstest3 prometheus-2.9.1.linux-amd64]# kill -SIGHUP 12286
      The log confirms that the reload succeeded:
          ...
          ... caller=main.go:724 msg="Loading configuration file" filename=prometheus.yml
          ... caller=main.go:751 msg="Completed loading of configuration file" filename=prometheus.yml
  15. 02. Usage • Scraping
          [root@yj26-ovstest3 ~]# curl localhost:9090/metrics |head -n 20
            % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                           Dload  Upload   Total   Spent    Left  Speed
          100 41814    0 41814    0     0  5023k      0 --:--:-- --:--:-- --:--:-- 5833k
          # HELP go_gc_duration_seconds A summary of the GC invocation durations.
          # TYPE go_gc_duration_seconds summary
          go_gc_duration_seconds{quantile="0"} 1.2143e-05
          go_gc_duration_seconds{quantile="0.25"} 2.9441e-05
          go_gc_duration_seconds{quantile="0.5"} 9.6832e-05
          go_gc_duration_seconds{quantile="0.75"} 0.000199094
          go_gc_duration_seconds{quantile="1"} 0.000424251
          go_gc_duration_seconds_sum 0.000761761
          go_gc_duration_seconds_count 5
          # HELP go_goroutines Number of goroutines that currently exist.
          # TYPE go_goroutines gauge
          go_goroutines 38
          # HELP go_info Information about the Go environment.
          # TYPE go_info gauge
          go_info{version="go1.12.4"} 1
          # HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
          # TYPE go_memstats_alloc_bytes gauge
          go_memstats_alloc_bytes 1.385236e+07
          # HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
          # TYPE go_memstats_alloc_bytes_total counter
          [root@yj26-ovstest3 ~]#
      Annotations: metric name; label; time series; metric type (the # TYPE line); description of the metric (the # HELP line).
      The cardinality of go_gc_duration_seconds is 7: five quantile series plus _sum and _count.
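The cardinality count on slide 15 can be checked mechanically. The following is a minimal sketch, not part of the original deck: `cardinality_by_family` is a hypothetical helper, and it assumes sample lines carry no timestamps and no spaces inside label values.

```python
from collections import Counter

# A shortened copy of the /metrics output shown on slide 15.
SAMPLE = """\
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 1.2143e-05
go_gc_duration_seconds{quantile="0.25"} 2.9441e-05
go_gc_duration_seconds{quantile="0.5"} 9.6832e-05
go_gc_duration_seconds{quantile="0.75"} 0.000199094
go_gc_duration_seconds{quantile="1"} 0.000424251
go_gc_duration_seconds_sum 0.000761761
go_gc_duration_seconds_count 5
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 38
"""


def cardinality_by_family(text: str) -> Counter:
    """Count time series per metric family in exposition-format text.

    Hypothetical helper: assumes no timestamps and no spaces in label values.
    """
    family_types = {}  # metric family name -> type taken from "# TYPE" lines
    counts = Counter()
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("# TYPE "):
            _, _, name, metric_type = line.split(None, 3)
            family_types[name] = metric_type
            continue
        if line.startswith("#"):
            continue  # HELP lines and other comments
        series = line.rsplit(" ", 1)[0]   # drop the sample value
        name = series.split("{", 1)[0]    # drop the label set
        family = name
        # _sum, _count and _bucket series belong to their summary/histogram family.
        for suffix in ("_sum", "_count", "_bucket"):
            if name.endswith(suffix):
                base = name[: -len(suffix)]
                if family_types.get(base) in ("summary", "histogram"):
                    family = base
        counts[family] += 1
    return counts


if __name__ == "__main__":
    for family, n in cardinality_by_family(SAMPLE).items():
        print(family, n)
```

Each distinct combination of metric name and labels is one time series, which is why a single summary already contributes seven.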
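  16. 02. Usage • PromQL
      Annotations on the example query: internal function; time series name; selector; range; sample; instant vector; access URL; metric name.
      The query returns the per-second average rate of change, over 3 minutes, of the total bytes received on eth0 for every node except kvm2. This serves as the network utilization.
  17. 02. Usage • PromQL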
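Slide 16 describes a per-second average rate over a 3-minute range. The query itself is only in the slide image, so its exact text is unknown; a query of roughly the shape `rate(<received-bytes counter>{device="eth0", instance!~"kvm2.*"}[3m])` would fit the description, with the metric and label names being assumptions. The sketch below shows the arithmetic behind such a rate on raw counter samples. `simple_rate` is a hypothetical function: it handles counter resets but, unlike PromQL's `rate()`, does not extrapolate to the window boundaries.

```python
def simple_rate(samples, window_seconds, now):
    """Per-second average increase of a counter over a trailing window.

    samples: list of (timestamp_seconds, counter_value), ascending by time.
    Simplified illustration: counter resets are handled, but there is no
    extrapolation to the window edges as in PromQL's rate().
    """
    window = [(t, v) for t, v in samples if now - window_seconds <= t <= now]
    if len(window) < 2:
        return None  # not enough samples to compute a rate
    increase = 0.0
    for (_, prev), (_, cur) in zip(window, window[1:]):
        # A drop in a counter means the process restarted; count from zero.
        increase += (cur - prev) if cur >= prev else cur
    elapsed = window[-1][0] - window[0][0]
    return increase / elapsed


if __name__ == "__main__":
    # Hypothetical eth0 receive counter scraped every 15 s, growing 100 bytes/s.
    rx_bytes = [(t, 100.0 * t) for t in range(0, 181, 15)]
    print(simple_rate(rx_bytes, window_seconds=180, now=180))
```

Dividing by elapsed time is what turns an ever-growing byte counter into a bytes-per-second figure that can be graphed as utilization.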
  18. 02. Usage • Grafana
  19. 03. Alertmanager
  20. 03. Alertmanager • Alertmanager Architecture
  21. 03. Alertmanager • Running Alertmanager
          [root@yj26-ovstest3 alertmanager]# wget https://github.com/prometheus/alertmanager/releases/download/v0.16.2/alertmanager-0.16.2.linux-amd64.tar.gz
          [root@yj26-ovstest3 alertmanager]# cd alertmanager-0.16.2.linux-amd64/
          [root@yj26-ovstest3 alertmanager-0.16.2.linux-amd64]# grep -Eiv '^$|^#|^[[:space:]]*#' alertmanager.yml
          route:
            group_by: [Alertname]
            receiver: email-me
          receivers:
          - name: email-me
            email_configs:
            - to: leeyj7141@gmail.com
              from: leeyj7141@gmail.com
              smarthost: smtp.gmail.com:587
              auth_username: "leeyj7141@gmail.com"
              auth_identity: "leeyj7141@gmail.com"
              auth_password: "xxxxxxxxxxxxxxxxxx"
          [root@yj26-ovstest3 alertmanager-0.16.2.linux-amd64]# ./alertmanager
          ... caller=main.go:177 msg="Starting Alertmanager" version="(version=0.16.2, branch=HEAD, revision=308b7620642dc147794e6686a3f94d1b6fc8ef4d
          ... caller=main.go:178 build_context="(go=go1.11.6, user=root@1e9a48272b38, date=20190405-12:27:40)"
          ... caller=cluster.go:161 component=cluster msg="setting advertise address explicitly" addr=10.26.1.13 port=9094
          ... caller=cluster.go:632 component=cluster msg="Waiting for gossip to settle..." interval=2s
          ... caller=main.go:334 msg="Loading configuration file" file=alertmanager.yml
          ... caller=main.go:428 msg=Listening address=:9093
          ... caller=cluster.go:657 component=cluster msg="gossip not settled" polls=0 before=0 now=1 elapsed=2.000210783s
          ... caller=cluster.go:649 component=cluster msg="gossip settled; proceeding" elapsed=10.001438416s
      Annotation: auth_password is a Google App password.
  22. 03. Alertmanager • Running Alertmanager
      Point Prometheus at the Alertmanager and add a rule file:
          [root@yj26-ovstest3 ~]# grep -Eiv '^$|^#|^[[:space:]]*#' prometheus/prometheus-2.9.1.linux-amd64/prometheus.yml
          global:
            scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
            evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
          alerting:
            alertmanagers:
            - static_configs:
              - targets:
                - localhost:9093
          rule_files:
            - rules.yml
          scrape_configs:
            - job_name: 'prometheus'
              static_configs:
              - targets: ['localhost:9090']
            - job_name: 'node3'
              static_configs:
              - targets: ['localhost:9100']
      The rule alerts on the node exporter that was added earlier:
          [root@yj26-ovstest3 ~]# grep -Eiv '^$|^#|^[[:space:]]*#' ~/prometheus/prometheus-2.9.1.linux-amd64/rules.yml
          groups:
          - name: example
            rules:
            - alert: InstanceDown
              expr: up{instance="localhost:9100",job="node3"} == 0
              for: 1m
          [root@yj26-ovstest3 ~]#
          [root@yj26-ovstest3 ~]# ps aux |grep -i prometheus
          root 12286 0.0 2.3 157424 44836 pts/1 Sl+ 13:41 0:10 ./prometheus
          [root@yj26-ovstest3 ~]# kill -1 12286
      Annotations: Alertmanager location (localhost:9093); the node exporter added earlier (localhost:9100).
  23. 03. Alertmanager • Running Alertmanager
          [root@yj26-ovstest3 ~]# pkill node_exporter
          [root@yj26-ovstest3 ~]# ps aux |grep -i node
          root 12838 0.0 0.0 116812 1028 pts/2 S+ 17:34 0:00 grep --color=auto -i node
          [root@yj26-ovstest3 ~]#
  24. 03. Alertmanager • Running Alertmanager
  25. 03. Alertmanager • Running Alertmanager
  26. 03. Alertmanager • Running Alertmanager
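The `InstanceDown` rule on slide 22 uses `for: 1m`, so stopping the node exporter on slide 23 does not notify anyone immediately. The sketch below is a simplified model of that behaviour, assuming the 15-second evaluation interval from the configuration; `AlertState` is a hypothetical class for illustration, not Prometheus code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertState:
    """Simplified model of an alerting rule with a `for` duration."""

    active_since: Optional[float] = None  # when the condition first held

    def evaluate(self, now: float, condition: bool, for_seconds: float) -> str:
        if not condition:
            self.active_since = None  # condition cleared: start over
            return "inactive"
        if self.active_since is None:
            self.active_since = now
        if now - self.active_since >= for_seconds:
            return "firing"   # held long enough: sent to Alertmanager
        return "pending"      # condition true, but not yet for 1m


if __name__ == "__main__":
    # `up` for localhost:9100, evaluated every 15 s; the exporter dies at t=30.
    up_values = [1, 1, 0, 0, 0, 0, 0, 0]
    alert = AlertState()
    for i, up in enumerate(up_values):
        t = i * 15
        print(t, alert.evaluate(t, condition=(up == 0), for_seconds=60))
```

The pending phase filters out short blips, at the cost of delaying the notification by the `for` duration.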
  27. 04. Cluster
  28. 04. Cluster • What if the monitoring server dies?
      Add another Prometheus server that performs the same role.
  29. 04. Cluster • What if Alertmanager dies? What if the gossip network is cut?
      You receive each notification twice, which is better than receiving none.
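  30. 05. Performance
  31. 05. Performance • Hardware
      ⚫ One sample takes about 1.3 bytes of storage once compressed.
      ⚫ With the default 15-day retention and 100,000 samples stored per second, budget about 240 GB of storage. (At exactly 1.3 bytes per sample the total is about 168 GB, so this figure implies allowing closer to 2 bytes per sample.)
      ⚫ Processing about 100,000 samples per second takes about 0.25 of a CPU core.
      ⚫ Allowing for queries, recording rules, and Go garbage collection, add 1 more core.
      ⚫ 1.25 CPU cores is enough.
      ⚫ About 8 GB of memory is enough for about 100,000 samples per second.
      ⚫ Prometheus receives scrapes compressed, so each sample uses about 20 bytes of network bandwidth.
      ⚫ Processing 100,000 samples per second uses about 16 Mbps of network bandwidth.
      Annotation: one node exporter exposes about 3,000 time series. 100 nodes scraped once every 15 seconds (the interval in the default prometheus.yml) = 20,000 samples/s.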
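The sizing figures on slide 31 are simple multiplications, sketched below so they can be rerun with other inputs. The function names are hypothetical, the per-sample costs are the slide's rules of thumb rather than guarantees, and decimal units (1 GB = 10^9 bytes) are assumed.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def ingest_rate(targets, series_per_target, scrape_interval_seconds):
    """Samples per second produced by scraping `targets` identical targets."""
    return targets * series_per_target / scrape_interval_seconds


def estimate(samples_per_second, retention_days=15,
             bytes_per_sample=1.3, wire_bytes_per_sample=20):
    """Rough storage and bandwidth estimate from the slide's rules of thumb."""
    storage_bytes = samples_per_second * bytes_per_sample * retention_days * SECONDS_PER_DAY
    bandwidth_bits = samples_per_second * wire_bytes_per_sample * 8
    return {
        "storage_gb": storage_bytes / 1e9,        # decimal gigabytes
        "bandwidth_mbps": bandwidth_bits / 1e6,   # megabits per second
    }


if __name__ == "__main__":
    print(ingest_rate(targets=100, series_per_target=3000, scrape_interval_seconds=15))
    print(estimate(100_000))                      # 1.3 bytes per sample
    print(estimate(100_000, bytes_per_sample=2))  # more conservative budget
```

With the slide's inputs this gives 20,000 samples/s for 100 nodes and 16 Mbps at 100,000 samples/s. Storage comes to about 168 GB at 1.3 bytes per sample and about 259 GB at 2 bytes per sample.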
  32. 05. Performance • Reducing Cardinality
      The slide lists metrics in descending order of cardinality.
  33. 05. Performance • Recording rule
          [root@yj26-ovstest1 prometheus-2.8.1.linux-amd64]# grep -Eiv '^$|^#|[[:space:]]*#' rules.yml
          groups:
          - name: node
            rules:
            - record: job:node_cpu_seconds_total:rate3m
              expr: >
                100 - (avg by (instance) (rate(node_cpu_seconds_total{job="node",mode="idle"}[3m])) * 100)
          [root@yj26-ovstest1 prometheus-2.8.1.linux-amd64]#
      Annotation: the expression is the CPU utilization formula.
  34. 05. Performance • Recording rule
  35. 05. Performance • Recording rule
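The recording rule on slide 33 precomputes CPU utilization as 100 minus the average idle percentage across an instance's cores. A minimal sketch of that arithmetic follows; `cpu_usage_percent` is a hypothetical function, and its input stands in for the per-core result of `rate(node_cpu_seconds_total{mode="idle"}[3m])`, which is the fraction of each second a core spent idle.

```python
def cpu_usage_percent(idle_rates_per_core):
    """CPU utilization of one instance, in percent.

    idle_rates_per_core: for each core, idle seconds accumulated per second
    (a value between 0.0 and 1.0), as rate() over the idle counter would give.
    """
    avg_idle = sum(idle_rates_per_core) / len(idle_rates_per_core)
    return 100 - avg_idle * 100


if __name__ == "__main__":
    # Hypothetical 4-core instance: cores idle 75%, 50%, 75% and 100% of the time.
    print(cpu_usage_percent([0.75, 0.5, 0.75, 1.0]))
```

Storing this result as a recording rule means dashboards read one precomputed series per instance instead of re-aggregating every core on each refresh.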
  36. 05. Performance • What if there are too many targets?
  37. 05. Performance • What if there are too many targets?
      Horizontal sharding!
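Horizontal sharding means each Prometheus server scrapes only its own slice of the targets. The sketch below illustrates the idea with a stable hash of the target address. Prometheus's relabeling offers a `hashmod` action for this purpose, but the hash used here is a stand-in and is not claimed to reproduce Prometheus's assignments; the function names and target addresses are hypothetical.

```python
import hashlib


def shard_for(target: str, shard_count: int) -> int:
    """Map a target address to a shard index with a stable hash."""
    digest = hashlib.md5(target.encode("utf-8")).digest()
    return int.from_bytes(digest[-8:], "big") % shard_count


def targets_for_shard(targets, shard_index, shard_count):
    """Targets that the Prometheus server with this shard index would keep."""
    return [t for t in targets if shard_for(t, shard_count) == shard_index]


if __name__ == "__main__":
    all_targets = [f"node{i}:9100" for i in range(10)]  # hypothetical addresses
    for shard in range(3):
        print(shard, targets_for_shard(all_targets, shard, 3))
```

Because the hash depends only on the target address, every server can decide independently which targets it owns, and each target ends up on exactly one shard.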
  38. Thank you.
