Setting up Monitoring System for Alluxio with Prometheus and Grafana in 10 minutes
Pan Liu @ Tencent
2021.06.24
• Pan Liu
• Tencent
• Presto on Alluxio
Contents
• Part 1. How Alluxio Metrics System Works
• Part 2. How to Implement a custom Alluxio Sink
• Part 3. How to Set Up Monitoring of Alluxio in 10 mins
Part 1. How Alluxio Metrics System Works
Alluxio Metrics System
• Framework
• Types of metrics
• Metric names
• Flow of metrics from worker to master
Alluxio Metrics System
• Framework
[Diagram: Source1, Source2, … → Metrics system → Sink1, Sink2, …]
Alluxio Metrics System
• Two Types of Metrics:
• Cluster Metrics: Aggregated from Workers and Clients
• Process Metrics: Collected by Each Alluxio Process
[Diagram: Master (Cluster Metrics, Process Metrics); Workers and Clients]
Alluxio Metrics System
• Metrics Name:
• Master: Master.[metricName].[tag1].[tag2]...
• Non-Master: [processType].[metricName].[tag1].[tag2]...[hostName]
Master.GetFileInfoOps
Worker.OpenExistingFile.User:user.UFS:hdfs:%2F%2F9%2E135%2E88%2E92:9000%2F.UFS_TYPE:hdfs.worker_host
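The naming scheme above can be taken apart mechanically. The following is a minimal sketch, not Alluxio code: it assumes components are separated by ".", tags have the form key:value, literal dots inside values are percent-encoded (as in the UFS tag above), and the last component of a non-Master name is the host name. The function name parse_metric_name is hypothetical.

```python
from urllib.parse import unquote


def parse_metric_name(name):
    """Split a metric name laid out as on the slide into its parts.

    Assumptions (not taken from Alluxio source): "." separates components,
    tags look like "key:value", and non-Master names end with the host name.
    """
    parts = name.split(".")
    process_type, metric_name, rest = parts[0], parts[1], parts[2:]

    host = None
    if process_type != "Master" and rest:
        host = rest.pop()  # trailing component is the host name

    tags = {}
    for tag in rest:
        key, _, value = tag.partition(":")  # split on the first ":" only
        tags[key] = unquote(value)          # "%2F" -> "/", "%2E" -> "."

    return {"process": process_type, "metric": metric_name,
            "tags": tags, "host": host}


worker_metric = parse_metric_name(
    "Worker.OpenExistingFile.User:user."
    "UFS:hdfs:%2F%2F9%2E135%2E88%2E92:9000%2F.UFS_TYPE:hdfs.worker_host"
)
master_metric = parse_metric_name("Master.GetFileInfoOps")
```

For the Worker example this yields the tags User, UFS (decoded to hdfs://9.135.88.92:9000/) and UFS_TYPE, with worker_host as the host. A host name that itself contains "." would break this simple split.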
Alluxio Metrics System
[Diagram: metrics flow from WorkerProcess to MasterProcess. WorkerProcess: BlockWorker, BlockMasterSync, BlockMasterClient, RpcClient (calls: heartbeat, heartbeat, blockHeartbeat). MasterProcess: RpcServer, BlockMaster, MetricMaster, MetricsStore (calls: blockHeartbeat, workerHeartbeat, processWorkerMetric, putWorkerMetrics).]
Part 2. How to Implement a custom Sink of Alluxio
Alluxio Sinks
• Passive vs. Active
• Passive: metrics are served on request (PULL)
• Active: metrics are reported periodically (PUSH)
Alluxio Sinks
Active: Report periodically, e.g. ./logs/master.out
Passive: Scraped from the server, e.g. Master_host:port/metrics/json
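The difference between the two sink styles can be sketched in a few lines. This is an illustration only, not Alluxio's Java implementation: MetricsRegistry, ActiveSink and PassiveSink are hypothetical names, and the periodic scheduler that would drive the active sink is left out so the example stays deterministic.

```python
import json


class MetricsRegistry:
    """Hypothetical in-memory registry standing in for the metrics system."""

    def __init__(self):
        self._counters = {}

    def inc(self, name, amount=1):
        self._counters[name] = self._counters.get(name, 0) + amount

    def snapshot(self):
        return dict(self._counters)


class ActiveSink:
    """Active (PUSH): a scheduler (not shown) calls report() every period_s seconds."""

    def __init__(self, registry, write, period_s=10):
        self.registry = registry
        self.write = write        # e.g. print, or a log file's write method
        self.period_s = period_s

    def report(self):
        for name, value in sorted(self.registry.snapshot().items()):
            self.write(f"{name} = {value}")


class PassiveSink:
    """Passive (PULL): nothing is produced until a client asks for it."""

    def __init__(self, registry):
        self.registry = registry

    def handle_request(self):
        return json.dumps(self.registry.snapshot(), sort_keys=True)


registry = MetricsRegistry()
registry.inc("Master.GetFileInfoOps", 3)

pushed = []
ActiveSink(registry, pushed.append).report()      # one scheduled tick
pulled = PassiveSink(registry).handle_request()   # one scrape request
```

The active sink decides when data leaves the process; the passive sink leaves that decision to whoever scrapes it, which is the model Prometheus expects.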
Active Sink: ConsoleSink
…
…
Passive Sink : PrometheusMetricsServlet
• getHandler() called by master or worker
• Works as a service
• Metrics are available only when requested
E.g. MasterProcess
Sink Extension
• Passive or Active?
• XmlSink as an example of the Active type: prints metrics in XML format to a specified path.
• Implement the Sink interface
• Construct an XmlReporter to report metrics in XML format
• Configure conf/metrics.properties to enable XmlSink
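The reporting step of such an active sink can be sketched as follows. This is a language-neutral illustration in Python, not the XmlSink or XmlReporter from the talk (whose Java code is not reproduced here). The element and attribute names (metrics, metric, name, value) and both function names are assumptions.

```python
import xml.etree.ElementTree as ET


def render_metrics_xml(metrics):
    """Render a {name: value} mapping as a small XML document (assumed layout)."""
    root = ET.Element("metrics")
    for name in sorted(metrics):
        ET.SubElement(root, "metric", name=name, value=str(metrics[name]))
    return ET.tostring(root, encoding="unicode")


def report_to_file(metrics, path):
    """What one periodic report of an active sink might do: overwrite the file at path."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(render_metrics_xml(metrics))


xml_text = render_metrics_xml({"Master.GetFileInfoOps": 3})
```

In a real sink, a scheduler would call report_to_file once per configured period, with the path and period read from the sink's properties.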
Part 3. How to Set Up Monitoring for Alluxio in 10 minutes
Alluxio Web UI Monitoring
Web UI metrics page
Convenient but doesn’t work well sometimes…
Alluxio Monitoring with Prometheus and Grafana
How does it work?
1. Prometheus scrapes metrics from the Alluxio servers and transforms them into time series data
2. The Grafana server gets the metrics using PromQL
3. The Grafana web UI displays the metrics in dashboards
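Step 2 can be reproduced by hand against the Prometheus HTTP API, which serves instant queries at /api/v1/query. The sketch below only builds the request URL and unpacks a response of the documented shape, so it runs without a live server. The metric name Master_GetFileInfoOps (dots replaced by underscores) and the instance label are assumptions about how the metric appears in Prometheus, and both function names are hypothetical.

```python
from urllib.parse import urlencode


def build_query_url(prometheus_base, promql):
    """Build the URL of an instant PromQL query."""
    query = urlencode({"query": promql})
    return f"{prometheus_base.rstrip('/')}/api/v1/query?{query}"


def extract_samples(response):
    """Return (labels, value) pairs from an instant-vector query response."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error')}")
    if response["data"]["resultType"] != "vector":
        raise ValueError("this sketch only handles instant vectors")
    return [
        (item["metric"], float(item["value"][1]))  # value is [timestamp, "string"]
        for item in response["data"]["result"]
    ]


url = build_query_url("http://master:9090/", "Master_GetFileInfoOps")

# Hand-written sample in the response format of the Prometheus HTTP API.
sample_response = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {"__name__": "Master_GetFileInfoOps",
                           "instance": "master:19999"},
                "value": [1624500000.0, "42"],
            }
        ],
    },
}
samples = extract_samples(sample_response)
```

Grafana issues the same kind of query through its Prometheus data source. Fetching the URL with any HTTP client and passing the decoded JSON to extract_samples gives the values a dashboard panel would plot.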
What do we need to do?
1. Install and start the Prometheus and Grafana servers
2. Add Alluxio jobs to Prometheus
3. Download and import the Grafana dashboard
4. Modify the variables of the dashboard template
Alluxio Monitoring with Prometheus and Grafana
Versions:
- Prometheus Version: 2.22.2
- Grafana Version: 7.5.6
- Alluxio Version: 2.5.0-3
Services:
- Prometheus Server @ master:9090
- Grafana Server @ worker:3000
- Alluxio Master @ master
- Alluxio Worker @ worker
Files:
- prometheus.yml
- targets/local/masters/master_01.yml
- targets/local/workers/worker_01.yml
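The contents of these files were shown only as screenshots, so the following is a hedged reconstruction rather than the speaker's configuration. It assumes file-based service discovery (file_sd_configs) pointing at the target files, the /metrics/prometheus/ path from the Alluxio documentation, and the default Alluxio web ports (19999 for the master, 30000 for workers). The job names, glob patterns and function names are illustrative.

```python
def render_prometheus_config(jobs):
    """Render a minimal scrape_configs section, one file_sd job per entry."""
    lines = ["scrape_configs:"]
    for job_name, target_glob in jobs:
        lines += [
            f"  - job_name: '{job_name}'",
            "    metrics_path: '/metrics/prometheus/'",
            "    file_sd_configs:",
            "      - files:",
            f"          - '{target_glob}'",
        ]
    return "\n".join(lines) + "\n"


def render_target_file(hosts_and_ports):
    """Render one file_sd target file listing host:port endpoints."""
    lines = ["- targets:"]
    for host, port in hosts_and_ports:
        lines.append(f"    - '{host}:{port}'")
    return "\n".join(lines) + "\n"


prometheus_yml = render_prometheus_config([
    ("alluxio master", "targets/local/masters/*.yml"),
    ("alluxio worker", "targets/local/workers/*.yml"),
])
master_01_yml = render_target_file([("master", 19999)])  # assumed default master web port
worker_01_yml = render_target_file([("worker", 30000)])  # assumed default worker web port
```

Keeping targets in separate files means a new worker can be added by dropping in another target file, without editing prometheus.yml.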
Alluxio Monitoring with Prometheus and Grafana
• Alluxio IO Key Metrics
• Read Local & Read Remote & Alive workers …
• Storage
• Space Used & UFS Space Used …
• Workers Blocks
• Cached Blocks & Evicted Blocks …
• Logical Operations
• Mount Operations & File Pinned …
• Alluxio Metadata Operations
• Block Heartbeat Cost & Get Status Cost …
• AsyncCache Blocks & Operations
• AsyncCacheSucceededBlocks & AsyncCacheFailedBlocks …
• Master JVM Memory
• Master Heap Memory & Total Memory …
Alluxio Monitoring with Prometheus and Grafana
• Alluxio docs:
https://docs.alluxio.io/os/user/edge/en/operation/Metrics-System.html#grafana-web-ui-with-prometheus
• Grafana dashboards:
https://grafana.com/grafana/dashboards/13467
Thanks!
panyliu@tencent.com

 
Anypoint Code Builder - Munich MuleSoft Meetup - 16th May 2024
Anypoint Code Builder - Munich MuleSoft Meetup - 16th May 2024Anypoint Code Builder - Munich MuleSoft Meetup - 16th May 2024
Anypoint Code Builder - Munich MuleSoft Meetup - 16th May 2024MulesoftMunichMeetup
 
[GeeCON2024] How I learned to stop worrying and love the dark silicon apocalypse
[GeeCON2024] How I learned to stop worrying and love the dark silicon apocalypse[GeeCON2024] How I learned to stop worrying and love the dark silicon apocalypse
[GeeCON2024] How I learned to stop worrying and love the dark silicon apocalypseTomasz Kowalczewski
 
Novo Nordisk: When Knowledge Graphs meet LLMs
Novo Nordisk: When Knowledge Graphs meet LLMsNovo Nordisk: When Knowledge Graphs meet LLMs
Novo Nordisk: When Knowledge Graphs meet LLMsNeo4j
 
Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024
Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024
Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024SimonedeGijt
 
Food Delivery Business App Development Guide 2024
Food Delivery Business App Development Guide 2024Food Delivery Business App Development Guide 2024
Food Delivery Business App Development Guide 2024Chirag Panchal
 
Evolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI EraEvolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI Eraconfluent
 

Último (20)

BusinessGPT - Security and Governance for Generative AI
BusinessGPT  - Security and Governance for Generative AIBusinessGPT  - Security and Governance for Generative AI
BusinessGPT - Security and Governance for Generative AI
 
GraphSummit Milan - Neo4j: The Art of the Possible with Graph
GraphSummit Milan - Neo4j: The Art of the Possible with GraphGraphSummit Milan - Neo4j: The Art of the Possible with Graph
GraphSummit Milan - Neo4j: The Art of the Possible with Graph
 
UNI DI NAPOLI FEDERICO II - Il ruolo dei grafi nell'AI Conversazionale Ibrida
UNI DI NAPOLI FEDERICO II - Il ruolo dei grafi nell'AI Conversazionale IbridaUNI DI NAPOLI FEDERICO II - Il ruolo dei grafi nell'AI Conversazionale Ibrida
UNI DI NAPOLI FEDERICO II - Il ruolo dei grafi nell'AI Conversazionale Ibrida
 
Test Automation Design Patterns_ A Comprehensive Guide.pdf
Test Automation Design Patterns_ A Comprehensive Guide.pdfTest Automation Design Patterns_ A Comprehensive Guide.pdf
Test Automation Design Patterns_ A Comprehensive Guide.pdf
 
Team Transformation Tactics for Holistic Testing and Quality (NewCrafts Paris...
Team Transformation Tactics for Holistic Testing and Quality (NewCrafts Paris...Team Transformation Tactics for Holistic Testing and Quality (NewCrafts Paris...
Team Transformation Tactics for Holistic Testing and Quality (NewCrafts Paris...
 
Encryption Recap: A Refresher on Key Concepts
Encryption Recap: A Refresher on Key ConceptsEncryption Recap: A Refresher on Key Concepts
Encryption Recap: A Refresher on Key Concepts
 
Workshop - Architecting Innovative Graph Applications- GraphSummit Milan
Workshop -  Architecting Innovative Graph Applications- GraphSummit MilanWorkshop -  Architecting Innovative Graph Applications- GraphSummit Milan
Workshop - Architecting Innovative Graph Applications- GraphSummit Milan
 
Software Engineering - Introduction + Process Models + Requirements Engineering
Software Engineering - Introduction + Process Models + Requirements EngineeringSoftware Engineering - Introduction + Process Models + Requirements Engineering
Software Engineering - Introduction + Process Models + Requirements Engineering
 
The mythical technical debt. (Brooke, please, forgive me)
The mythical technical debt. (Brooke, please, forgive me)The mythical technical debt. (Brooke, please, forgive me)
The mythical technical debt. (Brooke, please, forgive me)
 
CERVED e Neo4j su una nuvola, migrazione ed evoluzione di un grafo mission cr...
CERVED e Neo4j su una nuvola, migrazione ed evoluzione di un grafo mission cr...CERVED e Neo4j su una nuvola, migrazione ed evoluzione di un grafo mission cr...
CERVED e Neo4j su una nuvola, migrazione ed evoluzione di un grafo mission cr...
 
Anypoint Code Builder - Munich MuleSoft Meetup - 16th May 2024
Anypoint Code Builder - Munich MuleSoft Meetup - 16th May 2024Anypoint Code Builder - Munich MuleSoft Meetup - 16th May 2024
Anypoint Code Builder - Munich MuleSoft Meetup - 16th May 2024
 
[GeeCON2024] How I learned to stop worrying and love the dark silicon apocalypse
[GeeCON2024] How I learned to stop worrying and love the dark silicon apocalypse[GeeCON2024] How I learned to stop worrying and love the dark silicon apocalypse
[GeeCON2024] How I learned to stop worrying and love the dark silicon apocalypse
 
Novo Nordisk: When Knowledge Graphs meet LLMs
Novo Nordisk: When Knowledge Graphs meet LLMsNovo Nordisk: When Knowledge Graphs meet LLMs
Novo Nordisk: When Knowledge Graphs meet LLMs
 
Abortion Pill Prices Jozini ](+27832195400*)[ 🏥 Women's Abortion Clinic in Jo...
Abortion Pill Prices Jozini ](+27832195400*)[ 🏥 Women's Abortion Clinic in Jo...Abortion Pill Prices Jozini ](+27832195400*)[ 🏥 Women's Abortion Clinic in Jo...
Abortion Pill Prices Jozini ](+27832195400*)[ 🏥 Women's Abortion Clinic in Jo...
 
Abortion Pill Prices Mthatha (@](+27832195400*)[ 🏥 Women's Abortion Clinic In...
Abortion Pill Prices Mthatha (@](+27832195400*)[ 🏥 Women's Abortion Clinic In...Abortion Pill Prices Mthatha (@](+27832195400*)[ 🏥 Women's Abortion Clinic In...
Abortion Pill Prices Mthatha (@](+27832195400*)[ 🏥 Women's Abortion Clinic In...
 
Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024
Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024
Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024
 
Abortion Clinic In Pongola ](+27832195400*)[ 🏥 Safe Abortion Pills In Pongola...
Abortion Clinic In Pongola ](+27832195400*)[ 🏥 Safe Abortion Pills In Pongola...Abortion Clinic In Pongola ](+27832195400*)[ 🏥 Safe Abortion Pills In Pongola...
Abortion Clinic In Pongola ](+27832195400*)[ 🏥 Safe Abortion Pills In Pongola...
 
Abortion Clinic In Johannesburg ](+27832195400*)[ 🏥 Safe Abortion Pills in Jo...
Abortion Clinic In Johannesburg ](+27832195400*)[ 🏥 Safe Abortion Pills in Jo...Abortion Clinic In Johannesburg ](+27832195400*)[ 🏥 Safe Abortion Pills in Jo...
Abortion Clinic In Johannesburg ](+27832195400*)[ 🏥 Safe Abortion Pills in Jo...
 
Food Delivery Business App Development Guide 2024
Food Delivery Business App Development Guide 2024Food Delivery Business App Development Guide 2024
Food Delivery Business App Development Guide 2024
 
Evolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI EraEvolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI Era
 

Setting up monitoring system for Alluxio with Prometheus and Grafana in 10 minutes

  • 1. Setting up Monitoring System for Alluxio with Prometheus and Grafana in 10 minutes Pan Liu @ Tencent 2021.06.24
  • 2. • Pan Liu • Tencent • Presto on Alluxio
  • 3. Contents • Part 1. How Alluxio Metrics System Works • Part 2. How to Implement a custom Alluxio Sink • Part 3. How to Set Up Monitoring of Alluxio in 10 mins
  • 4. Part 1. How Alluxio Metrics System Works
  • 5. Alluxio Metrics System • Framework • Types of metrics • Name metrics • Flow of metrics from worker to master
  • 6. Alluxio Metrics System • Framework: sources (Source1, Source2, …) feed the metrics system, which reports to sinks (Sink1, Sink2, …)
  • 7. Alluxio Metrics System • Two Types of Metrics: • Cluster Metrics: Aggregated from Workers and Clients • Process Metrics: Collected by Each Alluxio Process (diagram: Master with Cluster Metrics and Process Metrics; Workers; Clients)
  • 8. Alluxio Metrics System • Metrics Name: • Master: Master.[metricName].[tag1].[tag2]... • Non-Master: [processType].[metricName].[tag1].[tag2]...[hostName] • Examples: Master.GetFileInfoOps and Worker.OpenExistingFile.User:user.UFS:hdfs:%2F%2F9%2E135%2E88%2E92:9000%2F.UFS_TYPE:hdfs.worker_host
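
The naming scheme on this slide can be illustrated with a small parser. This is only a sketch: `ParsedMetricName` is a hypothetical helper, not an Alluxio class, and it assumes that components are separated by `.`, that tags have the form `key:value`, that dots inside tag values are percent-encoded (as `%2E` in the example above), and that the last component of a non-master name is the host name. It needs Java 10 or later for `URLDecoder.decode(String, Charset)`.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of Alluxio): splits a full metric name into its parts. */
final class ParsedMetricName {
  final String processType;                                // e.g. "Master" or "Worker"
  final String metricName;                                 // e.g. "GetFileInfoOps"
  final Map<String, String> tags = new LinkedHashMap<>();  // tag key -> decoded value
  final String hostName;                                   // null for master metrics

  private ParsedMetricName(String processType, String metricName, String hostName) {
    this.processType = processType;
    this.metricName = metricName;
    this.hostName = hostName;
  }

  static ParsedMetricName parse(String fullName) {
    String[] parts = fullName.split("\\.");
    boolean isMaster = parts[0].equals("Master");
    // Master names need at least type + metric; others also carry a host name.
    int minParts = isMaster ? 2 : 3;
    if (parts.length < minParts) {
      throw new IllegalArgumentException("Too few components: " + fullName);
    }
    int tagsEnd = isMaster ? parts.length : parts.length - 1;
    String host = isMaster ? null : parts[parts.length - 1];
    ParsedMetricName parsed = new ParsedMetricName(parts[0], parts[1], host);
    for (int i = 2; i < tagsEnd; i++) {
      // Split each tag at its first ':' so values such as "hdfs:%2F%2F..." stay whole.
      int colon = parts[i].indexOf(':');
      String key = colon < 0 ? parts[i] : parts[i].substring(0, colon);
      String rawValue = colon < 0 ? "" : parts[i].substring(colon + 1);
      parsed.tags.put(key, URLDecoder.decode(rawValue, StandardCharsets.UTF_8));
    }
    return parsed;
  }

  public static void main(String[] args) {
    ParsedMetricName w = parse(
        "Worker.OpenExistingFile.User:user.UFS:hdfs:%2F%2F9%2E135%2E88%2E92:9000%2F"
            + ".UFS_TYPE:hdfs.worker_host");
    System.out.println(w.processType + " / " + w.metricName + " / " + w.tags + " / " + w.hostName);
  }
}
```

Percent-encoding the UFS address is what keeps the `.` separator unambiguous: the IP address and path contain no literal dots by the time they appear in the metric name.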
  • 9. Alluxio Metrics System [Diagram: flow of metrics from worker to master. WorkerProcess: BlockWorker, BlockMasterSync (heartbeat), BlockMasterClient (heartbeat), RpcClient (blockHeartbeat). MasterProcess: RpcServer (blockHeartbeat), BlockMaster (workerHeartbeat, processWorkerMetric), MetricMaster, MetricsStore (putWorkerMetrics).]
  • 10. Part 2. How to Implement a custom Sink of Alluxio
  • 11. Alluxio Sinks • Passive vs. Active • Passive: the service is polled (PULL) • Active: reports periodically (PUSH)
  • 12. Alluxio Sinks • Active: reports periodically, e.g. to ./logs/master.out • Passive: scraped from the server, e.g. Master_host:port/metrics/json
  • 14. Passive Sink: PrometheusMetricsServlet • getHandler() is called by the master or worker • Works as a service • Metrics are available only when requested • E.g. MasterProcess
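
The difference between the two sink styles can be sketched with plain JDK classes. Everything here is hypothetical and for illustration only: `SinkStylesSketch`, `startActive`, and `handleRequest` are made-up names, and the metric source is a simple map rather than Alluxio's metric registry. The active variant owns a timer and pushes a report every period; the passive variant produces output only when a caller (for example a Prometheus scrape) asks for it.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical illustration of active (PUSH) versus passive (PULL) sinks. */
final class SinkStylesSketch {
  /** Renders a snapshot as "name value" lines, sorted by metric name. */
  static String render(Map<String, Long> snapshot) {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, Long> e : new TreeMap<>(snapshot).entrySet()) {
      sb.append(e.getKey()).append(' ').append(e.getValue()).append('\n');
    }
    return sb.toString();
  }

  /** Active: the sink owns a timer and reports every period, whether or not anyone reads it. */
  static ScheduledExecutorService startActive(
      Supplier<Map<String, Long>> source, Consumer<String> out, long periodMillis) {
    ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
    timer.scheduleAtFixedRate(
        () -> out.accept(render(source.get())), 0, periodMillis, TimeUnit.MILLISECONDS);
    return timer;
  }

  /** Passive: nothing is rendered until a request arrives. */
  static String handleRequest(Supplier<Map<String, Long>> source) {
    return render(source.get());
  }

  public static void main(String[] args) throws InterruptedException {
    Map<String, Long> metrics = new ConcurrentHashMap<>();
    metrics.put("Master.GetFileInfoOps", 3L);

    // Active: prints a report roughly every 100 ms until the timer is stopped.
    ScheduledExecutorService timer = startActive(() -> metrics, System.out::print, 100);
    Thread.sleep(250);
    timer.shutdownNow();

    // Passive: a single report, produced on demand.
    System.out.print(handleRequest(() -> metrics));
  }
}
```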
  • 15. Sink Extension • Passive or Active? • XmlSink as an example of the Active type: prints metrics in XML format to a specified path. • Implement the Sink interface • Construct an XmlReporter to report metrics in XML format • Configure conf/metrics.properties to enable XmlSink
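
The steps on this slide can be sketched as follows. This is a self-contained approximation, not the code from the talk: the `Sink` interface below is a local stand-in that mirrors the `start()` / `stop()` / `report()` lifecycle of Alluxio's sink interface as I understand it, the metric source is a plain map instead of a metric registry, and the XML layout (`<metrics>` with `<metric name=… value=…/>` children) is an assumption. The slide's XmlReporter is folded into `toXml` for brevity.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Local stand-in for the sink lifecycle (assumed to be start/stop/report). */
interface Sink {
  void start();
  void stop();
  void report();
}

/** Hypothetical active sink: periodically writes all metrics as XML to a file. */
final class XmlSink implements Sink {
  private final Path output;
  private final long periodSeconds;
  private final Supplier<Map<String, Long>> source;
  private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();

  XmlSink(Path output, long periodSeconds, Supplier<Map<String, Long>> source) {
    this.output = output;
    this.periodSeconds = periodSeconds;
    this.source = source;
  }

  /** Escapes the characters that are unsafe inside an XML attribute value. */
  static String escape(String s) {
    return s.replace("&", "&amp;").replace("<", "&lt;")
        .replace(">", "&gt;").replace("\"", "&quot;");
  }

  /** Renders a snapshot as XML, one element per metric, sorted by name. */
  static String toXml(Map<String, Long> snapshot) {
    StringBuilder sb = new StringBuilder("<metrics>\n");
    for (Map.Entry<String, Long> e : new TreeMap<>(snapshot).entrySet()) {
      sb.append("  <metric name=\"").append(escape(e.getKey()))
          .append("\" value=\"").append(e.getValue()).append("\"/>\n");
    }
    return sb.append("</metrics>\n").toString();
  }

  @Override
  public void start() {
    // Active sink: report on a fixed schedule without waiting for a request.
    timer.scheduleAtFixedRate(this::report, periodSeconds, periodSeconds, TimeUnit.SECONDS);
  }

  @Override
  public void stop() {
    timer.shutdown();
  }

  @Override
  public void report() {
    try {
      Files.write(output, toXml(source.get()).getBytes(StandardCharsets.UTF_8));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Alluxio's built-in sinks are enabled in conf/metrics.properties with entries of the form `sink.<name>.class=<fully qualified class>` plus sink-specific options such as a period, so a custom XmlSink would presumably be registered the same way; the exact keys for this sink are not given in the transcript.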
  • 16. Part 3. How to Set Up Monitoring for Alluxio in 10 minutes
  • 17. Alluxio Web UI Monitoring Web UI metrics page Convenient but doesn’t work well sometimes…
  • 18. Alluxio Monitoring with Prometheus and Grafana • How it works: 1. Prometheus scrapes metrics from the Alluxio servers and transforms them into time series data 2. The Grafana server gets the metrics using PromQL 3. The Grafana web UI displays the metrics in dashboards • What we need to do: 1. Install and start the Prometheus and Grafana servers 2. Add Alluxio jobs to Prometheus 3. Download and import the Grafana dashboard 4. Modify the variables of the dashboard template
  • 19. Alluxio Monitoring with Prometheus and Grafana • Versions: Prometheus 2.22.2, Grafana 7.5.6, Alluxio 2.5.0-3 • Services: Prometheus Server @ master:9090, Grafana Server @ worker:3000, Alluxio Master @ master, Alluxio Worker @ worker • Config files: prometheus.yml, targets/local/masters/master_01.yml, targets/local/workers/worker_01.yml
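
The contents of the three config files are not preserved in this transcript. A plausible sketch, assuming file-based service discovery (suggested by the per-host target files), the `/metrics/prometheus/` path described in the Alluxio documentation, and the default Alluxio web ports (19999 for the master, 30000 for the worker), might look like the fragment below. The job names, scrape interval, and ports are assumptions to adapt to the actual cluster.

```yaml
# ---- file: prometheus.yml (sketch) ----
global:
  scrape_interval: 15s            # assumption; not stated in the slides

scrape_configs:
  - job_name: "alluxio master"    # hypothetical job name
    metrics_path: "/metrics/prometheus/"
    file_sd_configs:
      - files:
          - "targets/local/masters/*.yml"
  - job_name: "alluxio worker"    # hypothetical job name
    metrics_path: "/metrics/prometheus/"
    file_sd_configs:
      - files:
          - "targets/local/workers/*.yml"

# ---- file: targets/local/masters/master_01.yml (separate file) ----
# - targets:
#     - "master:19999"            # Alluxio master web port, assuming the default

# ---- file: targets/local/workers/worker_01.yml (separate file) ----
# - targets:
#     - "worker:30000"            # Alluxio worker web port, assuming the default
```

Keeping the targets in separate files means hosts can be added or removed without editing prometheus.yml itself, since Prometheus re-reads file-based discovery targets when those files change.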
  • 20.
  • 21. Alluxio Monitoring with Prometheus and Grafana
  • 22. Alluxio Monitoring with Prometheus and Grafana • Alluxio IO Key Metrics • Read Local & Read Remote & Alive Workers … • Storage • Space Used & UFS Space Used … • Workers Blocks • Cached Blocks & Evicted Blocks … • Logical Operations • Mount Operations & File Pinned … • Alluxio Metadata Operations • Block Heartbeat Cost & Get Status Cost … • AsyncCache Blocks & Operations • AsyncCacheSucceededBlocks & AsyncCacheFailedBlocks … • Master JVM Memory • Master Heap Memory & Total Memory …
  • 23. Alluxio Monitoring with Prometheus and Grafana • Alluxio docs: https://docs.alluxio.io/os/user/edge/en/operation/Metrics-System.html#grafana-web-ui-with-prometheus • Grafana dashboards: https://grafana.com/grafana/dashboards/13467