Argus Production Monitoring at Salesforce
Service Health & Observability at Scale
Tom Valine
Director, Infrastructure Engineering
tvaline@salesforce.com
in/tvaline
Bhinav Sura
Software Engineer, Infrastructure Engineering
bhinav.sura@salesforce.com
in/bhinavsura
What is Argus?
● Time Series Data & Events
● Inbuilt Service Protection
● Alerting
● Flexible Dashboarding
● Full REST API
● High Throughput
● Low Latency
● Horizontally Scalable
● In Use By
○ Capacity Planning
○ Search
○ Feature Teams
○ Site Reliability
○ Customer Success
But Why Another Monitoring System?
● Technology changes frequently!
● Insulate our customers
● Performance
● Trust
● Programmatic access for everything
● Multi-tenancy
● Correlation with non-timeseries data
● Highly dimensional
I’ve seen this somewhere before...
Metrics
● Transforms
● Namespace
● Scope
● Name
● Tags
● Aggregator
● Downsampler
Events
● Namespace
● Scope
● Name
● Tags
● Type
● User
SCALE(-2d:-1d:dva:argus:freemem{host=*}:min:1d-min, $1e-6)
● TRANSFORM: SCALE
● START: -2d
● END: -1d
● NAMESPACE: dva
● SCOPE: argus
● METRIC: freemem
● TAGS: {host=*}
● AGG: min
● DS: 1d-min
● PARAMS: $1e-6
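The labelled expression above can be split into its fields mechanically. The following is a minimal Python sketch, assuming the field order shown on the slide (start, end, namespace, scope, metric, tags, aggregator, downsampler) and at most one wrapping transform. The names `MetricQuery`, `parse_metric_expression`, and the regular expressions are illustrative and are not taken from the Argus code base.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Illustrative patterns only: they follow the field order labelled on the slide.
_EXPR = re.compile(
    r"^(?P<start>[^:]+):(?P<end>[^:]+):(?P<namespace>[^:]+):(?P<scope>[^:]+):"
    r"(?P<metric>[^:{]+)(?:\{(?P<tags>[^}]*)\})?:(?P<agg>[^:]+):(?P<ds>[^:]+)$"
)
_TRANSFORM = re.compile(r"^(?P<name>[A-Z_]+)\((?P<body>.*)\)$")


@dataclass
class MetricQuery:
    start: str
    end: str
    namespace: str
    scope: str
    metric: str
    tags: Dict[str, str]
    aggregator: str
    downsampler: str
    transform: Optional[str] = None
    params: List[str] = field(default_factory=list)


def _split_args(body: str) -> List[str]:
    """Split on commas that are not inside a {tag} block."""
    parts, depth, current = [], 0, []
    for ch in body:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


def parse_metric_expression(text: str) -> MetricQuery:
    text = text.strip()
    transform, params, inner = None, [], text
    wrapped = _TRANSFORM.match(text)
    if wrapped:
        transform = wrapped["name"]
        # First argument is the metric expression, the rest are transform parameters.
        inner, *params = _split_args(wrapped["body"])
    fields = _EXPR.match(inner)
    if fields is None:
        raise ValueError(f"not a metric expression: {inner!r}")
    tags = dict(p.split("=", 1) for p in (fields["tags"] or "").split(",") if p)
    return MetricQuery(
        fields["start"], fields["end"], fields["namespace"], fields["scope"],
        fields["metric"], tags, fields["agg"], fields["ds"], transform, params,
    )
```

Calling `parse_metric_expression` on the slide's example yields a `MetricQuery` whose `transform` is `SCALE` and whose `tags` map `host` to `*`. Nested transforms are deliberately out of scope for this sketch.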
Events
-2d:-1d:dva:argus:release{host=*}:major:admin
● START: -2d
● END: -1d
● NAMESPACE: dva
● SCOPE: argus
● NAME: release
● TAGS: {host=*}
● TYPE: major
● USER: admin
● First Class Data
● Decoupled from Time Series
● Multiple Events Per Timestamp
● Event Categories
● Identifiable per User
● Overlay on Any Time Series
Alerting
● CRON Format
● Alert on Missing Data
● Single Ended & Range Comparisons
● Inertia
● Cooldown
● Multiple Triggers
● Multiple Notifications
○ Audit
○ Email
○ GOC++
○ Salesforce Chatter
○ PagerDuty
● Event Backannotation
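Inertia and cooldown are easiest to see in a small example. The Python sketch below assumes that inertia means "the condition must hold continuously for this long before the trigger fires" and that cooldown means "after a notification, stay silent for this long". Those readings, and the names `Trigger`, `Notifier`, and `fires`, are assumptions made for illustration and are not the Argus implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Trigger:
    threshold: float   # single-ended comparison: value > threshold
    inertia_s: int     # seconds the condition must hold before firing


@dataclass
class Notifier:
    cooldown_s: int
    cooldown_until: float = 0.0

    def notify(self, now: float) -> bool:
        """Return True if a notification is sent, False if still cooling down."""
        if now < self.cooldown_until:
            return False
        self.cooldown_until = now + self.cooldown_s
        return True


def fires(trigger: Trigger, datapoints: List[Tuple[float, float]], now: float) -> bool:
    """datapoints: (timestamp, value) pairs sorted by timestamp."""
    streak_start = None
    for ts, value in datapoints:
        if ts > now:
            break
        if value > trigger.threshold:
            if streak_start is None:
                streak_start = ts      # condition just became true
        else:
            streak_start = None        # condition broken, restart the streak
    return streak_start is not None and now - streak_start >= trigger.inertia_s
```

With a threshold of 90 and an inertia of 120 seconds, a series that crosses the threshold at t=60 does not fire at t=120 but does at t=180. A `Notifier` with a 300-second cooldown then suppresses repeat notifications until t=480.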
Warden
● Policy Driven Suspension Mechanism
● Per User
● Application & Subsystem
● Progressively Punitive
● Indefinite Suspension Supported
● Customizable
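A progressively punitive, per-user suspension policy might look roughly like the Python sketch below. The escalation scheme (a list of suspension lengths, then indefinite suspension) and the names `Policy` and `Warden.record_usage` are hypothetical and chosen only to illustrate the bullets above. The actual Warden policies are not described in the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, str]  # (user, subsystem)


@dataclass
class Policy:
    subsystem: str
    limit: float
    # Suspension length in seconds for the 1st, 2nd, ... infraction.
    # Any infraction beyond the end of the list is suspended indefinitely.
    suspension_levels: List[int] = field(default_factory=lambda: [600, 3600, 36000])


class Warden:
    def __init__(self) -> None:
        self.infractions: Dict[Key, int] = {}
        # None as a value means "suspended indefinitely".
        self.suspended_until: Dict[Key, Optional[float]] = {}

    def record_usage(self, user: str, policy: Policy, value: float, now: float) -> None:
        if value <= policy.limit:
            return
        key = (user, policy.subsystem)
        count = self.infractions.get(key, 0) + 1
        self.infractions[key] = count
        if count > len(policy.suspension_levels):
            self.suspended_until[key] = None
        else:
            self.suspended_until[key] = now + policy.suspension_levels[count - 1]

    def is_suspended(self, user: str, subsystem: str, now: float) -> bool:
        key = (user, subsystem)
        if key not in self.suspended_until:
            return False
        until = self.suspended_until[key]
        return until is None or now < until
```

Keying the state on `(user, subsystem)` keeps a suspension in one subsystem from blocking the same user elsewhere, which is one plausible reading of "Per User" and "Application & Subsystem".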
Dashboarding
● Maintaining dashboards is a horrible business to be in
● Empower the users, get out of their way
● Markup based
● Custom tags for visualization elements
● HTML for everything else
REST
● API First
● All functionality exposed
via services
● Decoupled UI
● Authenticated
○ Login
○ Do stuff
○ Logout
● Get out of User's Way!
○ Orchestra Client
○ ArgusPoke
○ Dashboard Creation Tool
How does it work?
METRICS ANNOTATION USER ENTITY
ALERTS MAIL SCHEDULING MONITORING
WEB SERVICES
AUTH ORM MQ TSDB
WEB UI CUSTOM APPS OTHER CLIENTS
DASHBOARD MANAGEMENT WARDEN NAMESPACE
SCHEMA WILDCARDING CACHING INTERLOCK
Okay, but how does it REALLY work?
[Diagram: repeated UI / WS / CORE blocks and CL / CORE blocks around a single MESSAGE BUS and an HBASE/TSDB/RDBMS/CACHING layer]
Cool, how will it evolve going forward?
[Diagram: repeated UI / WS / CORE blocks, CORE blocks, and CL blocks, with multiple MESSAGE BUS and HBASE/TSDB/RDBMS/CACHE instances joined by ROUTE/FORK/JOIN+M/R layers]
Alert Evaluation Data Flows
Message Queue:
1. Scheduling Service updates the alert schedule every 10 minutes.
2. Scheduler submits scheduled jobs to the queue.
3. Minimum interval of 1 minute.
Alert Client:
1. Dequeues from the alert queue.
2. Query ranges adjusted for scheduling latency.
3. Triggers evaluated.
4. Notifications sent.
5. Cooldowns updated.
[Diagram labels: ALERT DATA STORE, SCHEDULING SERVICE, ALERT CACHE, ARGUS WS, alert queue (ALERT 8713, ..., ALERT 4141, ALERT 9810)]
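The alert client steps listed above can be sketched in Python as follows. This sketch assumes that "adjusted for scheduling latency" means anchoring the relative query window at the time the job was scheduled rather than the time it was dequeued. The names `AlertJob`, `absolute_range`, and `run_alert_client` are hypothetical, and the trigger is reduced to a single threshold comparison.

```python
import queue
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class AlertJob:
    alert_id: int
    scheduled_at: int   # epoch seconds at which the job was meant to run
    window_s: int       # relative query window, e.g. 300 for "-5m"


def absolute_range(job: AlertJob, dequeued_at: int) -> Tuple[int, int]:
    """Shift the window back by the time the job spent waiting in the queue."""
    latency = dequeued_at - job.scheduled_at
    end = dequeued_at - latency
    return end - job.window_s, end


def run_alert_client(
    jobs: "queue.Queue[AlertJob]",
    fetch: Callable[[int, int, int], List[float]],
    threshold: float,
    cooldown_s: int,
    cooldowns: Dict[int, int],
    clock: Callable[[], int],
) -> List[Tuple[int, int, int]]:
    notifications = []
    while True:
        try:
            job = jobs.get_nowait()                         # 1. dequeue
        except queue.Empty:
            return notifications
        start, end = absolute_range(job, clock())           # 2. adjust range
        values = fetch(job.alert_id, start, end)
        triggered = any(v > threshold for v in values)      # 3. evaluate trigger
        if triggered and clock() >= cooldowns.get(job.alert_id, 0):
            notifications.append((job.alert_id, start, end))    # 4. notify
            cooldowns[job.alert_id] = clock() + cooldown_s       # 5. update cooldown
```

Passing `clock` and `fetch` in as callables keeps the loop testable without a real queue or time series store.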
Metric & Event Data Flows
Message Queue:
1. Writes are asynchronous with a high degree of parallelism.
2. Queue used as a shock absorber. Tolerant to lower level failures/downtime.
3. Kafka for scalability. One topic each for metrics and annotations. Number of partitions on the order of hundreds.
ArgusMetricsQueue:
1. Consumed by 2 types of clients: MetricCommit and SchemaCommit
2. MetricCommit client commits the actual time series data to persistent storage (using OpenTSDB or Phoenix).
3. SchemaCommit client only uses the metric metadata to create metric schema records and commits them to HBase (using AsyncHBase).
[Diagram labels: ARGUS WS, METRIC SERVICE, queued METRIC messages, TIMESERIES STORE, SCHEMA STORE]
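The split between the two consumer types can be illustrated without Kafka. In the Python sketch below a plain list stands in for the metrics topic, and both clients read every message, which is roughly what two independent consumer groups would see. The message layout and the function names are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def metric_commit(batch: List[dict], timeseries_store: Dict[tuple, dict]) -> None:
    """Persist the datapoints themselves, keyed by series identity."""
    for m in batch:
        series = (m["scope"], m["metric"], tuple(sorted(m["tags"].items())))
        timeseries_store[series].update(m["datapoints"])


def schema_commit(batch: List[dict], schema_store: Set[Tuple]) -> None:
    """Persist only metadata: one record per (namespace, scope, metric, tagk, tagv)."""
    for m in batch:
        pairs = list(m["tags"].items()) or [(None, None)]
        for tagk, tagv in pairs:
            schema_store.add((m.get("namespace"), m["scope"], m["metric"], tagk, tagv))


def drain(topic: List[dict], timeseries_store: Dict[tuple, dict],
          schema_store: Set[Tuple]) -> None:
    # Each client type sees the whole topic and commits to its own store.
    metric_commit(topic, timeseries_store)
    schema_commit(topic, schema_store)
```

Because the schema store is a set of metadata tuples, writing the same series repeatedly adds datapoints but no new schema records.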
TSDB Service Implementation - OpenTSDB
● Uses HBase underneath
● RowKey: <metric_uid><timestamp><tagk1><tagv1>[...<tagkn><tagvn>].
● Stores actual time series values on hourly boundaries (all values within an hour are stored in the same cell)
● Pros:
○ Extremely fast when you query using the complete metric name.
○ 5M datapoints/min write throughput per write daemon.
● Cons:
○ Tag Count - Total number of tags per metric is limited to 8
○ Tag Cardinality - As the product of tag values across all tag keys increases, performance decreases drastically
○ UID Exhaustion - 16M UIDs each for metric, tagk and tagv names by default. Once these are exhausted, no new metrics, tagk or tagv can be created.
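The row key layout and the 16M UID limit above both follow from fixed-width UIDs. The Python sketch below assumes the OpenTSDB defaults of 3-byte UIDs and a 4-byte timestamp rounded down to the hour, with tag pairs ordered by tag key UID. The helper names `uid_bytes` and `row_key` are illustrative.

```python
import struct
from typing import Iterable, Tuple

UID_WIDTH = 3                          # bytes per UID (assumed default)
MAX_UID = 2 ** (8 * UID_WIDTH) - 1     # 16,777,215, i.e. about 16M ids per UID type


def uid_bytes(uid: int) -> bytes:
    if not 0 < uid <= MAX_UID:
        raise ValueError("UID out of range: the UID space is exhausted or the id is invalid")
    return uid.to_bytes(UID_WIDTH, "big")


def row_key(metric_uid: int, timestamp: int, tag_uids: Iterable[Tuple[int, int]]) -> bytes:
    """Build <metric_uid><base_hour><tagk1><tagv1>...<tagkn><tagvn>."""
    base_hour = timestamp - (timestamp % 3600)    # all points in the hour share a row
    key = uid_bytes(metric_uid) + struct.pack(">I", base_hour)
    for tagk, tagv in sorted(tag_uids):           # stable order regardless of input order
        key += uid_bytes(tagk) + uid_bytes(tagv)
    return key
```

Since the metric UID leads the key, a query that names the complete metric becomes a contiguous range scan, which matches the "extremely fast" case listed under Pros.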
TSDB Service Implementation - Phoenix
● Uses HBase underneath
● RowKey: <metric_uid><timestamp><tagv1>[...<tagvn>].
● Metric modeled as Phoenix VIEW
○ Schema is introspectable and managed outside of data
○ Supports secondary indexes on value and/or tag(s)
● Parallelizes query and pushes computation to server
○ Server-side aggregation conserves network bandwidth
○ Allows SKIP_SCAN filter optimization for minimizing data scanned
○ Leverages ROW_TIMESTAMP optimization for filtering HFiles
● Performance on par or better than OpenTSDB
● Ad hoc SQL query capability
○ Join against other Phoenix tables
● Longer term leverage Drillix (Phoenix + Drill)
○ Cross cluster queries
○ Joins to other non HBase data sources
Schema Service Motivation
● Discover Metrics
○ Which metrics exist within a scope?
○ For a given <scope, metric> combination, which tags exist?
○ Given a metric, which scopes contain this metric?
○ What are all the tag values that exist for a given tag key?
● Support Wildcard Queries
○ Non-wildcard query
■ -1h:system.myDatacenter.myPod:Cpu.perc:avg:1m-avg
○ Wildcard query
■ -1h:system.myDatacenter.*:Cpu.perc:avg:1m-avg
■ -1h:system.myDatacenter.myPod:Cpu*:avg:1m-avg
■ -1h:system.myDatacenter.myPod:Cpu.perc{device=*app*}:avg:1m-avg
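Expanding a wildcard query amounts to matching the pattern against known schema records and emitting one concrete query per match. The Python sketch below assumes `*` behaves like a shell-style glob and uses the standard library's `fnmatch`, which also gives special meaning to `?` and `[...]`. The function name `expand` and the in-memory list of (scope, metric) pairs are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Tuple


def expand(
    scope_pattern: str,
    metric_pattern: str,
    known: Iterable[Tuple[str, str]],
    start: str = "-1h",
    agg: str = "avg",
    ds: str = "1m-avg",
) -> List[str]:
    """Return one non-wildcard query per (scope, metric) pair matching the patterns."""
    matches = sorted(
        {(s, m) for s, m in known
         if fnmatchcase(s, scope_pattern) and fnmatchcase(m, metric_pattern)}
    )
    return [f"{start}:{s}:{m}:{agg}:{ds}" for s, m in matches]
```

For example, `expand("system.myDatacenter.*", "Cpu.perc", known)` returns one query per pod under `system.myDatacenter` that reports `Cpu.perc`.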
Schema Service Implementation
● AsyncHBase Schema Service:
○ Uses HBase underneath
○ SchemaRecord: namespace, scope, metricname, tagk, tagv. No data points.
○ Each record indexed in 2 ways in 2 different tables.
○ MetricIndexed schema table:
■ RowKey: <metricname><scope><namespace><tagk><tagv>
○ ScopeIndexed schema table:
■ RowKey: <scope><metricname><namespace><tagk><tagv>
○ Decide what table to use based on the type of query.
○ Pros:
■ Efficient retrieval for schema records for most types of queries
○ Cons:
■ Storage duplication
● DiscoveryService:
○ Uses SchemaService internally
○ Ability to filter records by type
■ E.g., filter all unique scopes that match *myScope*
○ Expand Wildcard query and return a collection of non-wildcard queries
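The two index tables store the same record under two key orders, so a lookup can scan whichever table lets it start from a literal prefix. The Python sketch below is one possible selection rule, assuming a NUL separator between fields and a "longest literal prefix wins" heuristic. Both are assumptions, as are the names `row_keys` and `choose_table`. The slides do not state how the real service decides.

```python
from typing import Dict, Tuple

SEP = "\x00"  # hypothetical field separator


def row_keys(record: Tuple[str, str, str, str, str]) -> Dict[str, str]:
    """record = (namespace, scope, metricname, tagk, tagv)."""
    namespace, scope, metric, tagk, tagv = record
    return {
        "metric_indexed": SEP.join([metric, scope, namespace, tagk, tagv]),
        "scope_indexed": SEP.join([scope, metric, namespace, tagk, tagv]),
    }


def literal_prefix(pattern: str) -> str:
    """Part of the pattern before the first wildcard."""
    return pattern.split("*", 1)[0]


def choose_table(scope_pattern: str, metric_pattern: str) -> Tuple[str, str]:
    """Return (table name, scan prefix) for the table with the longer literal prefix."""
    scope_prefix = literal_prefix(scope_pattern)
    metric_prefix = literal_prefix(metric_pattern)
    if len(metric_prefix) > len(scope_prefix):
        return "metric_indexed", metric_prefix
    return "scope_indexed", scope_prefix
```

A query such as "which scopes contain metric Cpu.perc" has no literal scope prefix, so this rule sends it to the metric-indexed table. That is the kind of query a scope-first key alone would force into a full scan, and it is why the design accepts the storage duplication.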
Caching
● CachedTSDB Service:
○ Uses RedisCache service and the configured TSDBService implementation (OpenTSDB or PhoenixTSDB)
○ Query Level Caching (caches synthetic data)
○ Caches data for queries spanning more than the last 24 hours.
○ Data is cached by fracturing it on day boundaries.
■ E.g., a query spanning 5 days is stored under 5 cache keys.
○ Support for partial hits
○ Cache expiry time of an hour (can be increased by running a separate cache update process)
● CachedDiscovery Service:
○ Uses RedisCache service and the configured DiscoveryService implementation
○ Cache queries already expanded
○ Cache expiry time of a day
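Fracturing cached data on day boundaries is what makes partial hits possible: each day is stored under its own key, so only the missing days need to be fetched. The Python sketch below uses a plain dict in place of Redis and omits expiry. The key format and the names `day_keys` and `cached_query` are assumptions made for illustration.

```python
from typing import Callable, Dict, List

DAY = 86_400  # seconds


def day_keys(expression: str, start: int, end: int) -> List[str]:
    """One cache key per UTC day touched by the half-open range [start, end)."""
    first_day = start - start % DAY
    return [f"{expression}:{day}" for day in range(first_day, end, DAY)]


def cached_query(
    expression: str,
    start: int,
    end: int,
    cache: Dict[str, Dict[int, float]],
    fetch: Callable[[str, int, int], Dict[int, float]],
) -> Dict[int, float]:
    result: Dict[int, float] = {}
    for key in day_keys(expression, start, end):
        day = int(key.rsplit(":", 1)[1])
        if key not in cache:
            # Partial hit: only days missing from the cache go to the TSDB.
            cache[key] = fetch(expression, day, day + DAY)
        # Trim whole-day entries back to the requested range.
        result.update({t: v for t, v in cache[key].items() if start <= t < end})
    return result
```

A 5-day query produces 5 keys. If 3 of those days were cached by an earlier query, only 2 calls reach the underlying `fetch`.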
Developed By
● Anand Subramanian
● Bhinav Sura
● Tom Valine
● Jigna Bhatt
● Ruofan Zhang
● Dilip Devaraj
● Raj Sarkapally
● Kiran Gowdru
More Information
https://github.com/SalesforceEng/Argus
Thank you

Más contenido relacionado

La actualidad más candente

HBaseConAsia2018 Keynote1: Apache HBase Project Status
HBaseConAsia2018 Keynote1: Apache HBase Project StatusHBaseConAsia2018 Keynote1: Apache HBase Project Status
HBaseConAsia2018 Keynote1: Apache HBase Project StatusMichael Stack
 
Tales from Taming the Long Tail
Tales from Taming the Long TailTales from Taming the Long Tail
Tales from Taming the Long TailHBaseCon
 
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWSHBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWSHBaseCon
 
Imply at Apache Druid Meetup in London 1-15-20
Imply at Apache Druid Meetup in London 1-15-20Imply at Apache Druid Meetup in London 1-15-20
Imply at Apache Druid Meetup in London 1-15-20Jelena Zanko
 
HBaseCon 2013: ETL for Apache HBase
HBaseCon 2013: ETL for Apache HBaseHBaseCon 2013: ETL for Apache HBase
HBaseCon 2013: ETL for Apache HBaseCloudera, Inc.
 
HBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and Cloud
HBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and CloudHBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and Cloud
HBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and CloudMichael Stack
 
Cassandra Tuning - above and beyond
Cassandra Tuning - above and beyondCassandra Tuning - above and beyond
Cassandra Tuning - above and beyondMatija Gobec
 
Voldemort on Solid State Drives
Voldemort on Solid State DrivesVoldemort on Solid State Drives
Voldemort on Solid State DrivesVinoth Chandar
 
HBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBaseHBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBaseHBaseCon
 
HBaseCon2017 Splice Machine as a Service: Multi-tenant HBase using DCOS (Meso...
HBaseCon2017 Splice Machine as a Service: Multi-tenant HBase using DCOS (Meso...HBaseCon2017 Splice Machine as a Service: Multi-tenant HBase using DCOS (Meso...
HBaseCon2017 Splice Machine as a Service: Multi-tenant HBase using DCOS (Meso...HBaseCon
 
HBaseConAsia2018: Track2-5: JanusGraph-Distributed graph database with HBase
HBaseConAsia2018: Track2-5: JanusGraph-Distributed graph database with HBaseHBaseConAsia2018: Track2-5: JanusGraph-Distributed graph database with HBase
HBaseConAsia2018: Track2-5: JanusGraph-Distributed graph database with HBaseMichael Stack
 
HBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBase
HBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBaseHBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBase
HBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBaseCloudera, Inc.
 
HBaseCon 2013: Near Real Time Indexing for eBay Search
HBaseCon 2013: Near Real Time Indexing for eBay SearchHBaseCon 2013: Near Real Time Indexing for eBay Search
HBaseCon 2013: Near Real Time Indexing for eBay SearchCloudera, Inc.
 
Hadoop Networking at Datasift
Hadoop Networking at DatasiftHadoop Networking at Datasift
Hadoop Networking at Datasifthuguk
 
Amazon RedShift - Ianni Vamvadelis
Amazon RedShift - Ianni VamvadelisAmazon RedShift - Ianni Vamvadelis
Amazon RedShift - Ianni Vamvadelishuguk
 
Foundations of streaming SQL: stream & table theory
Foundations of streaming SQL: stream & table theoryFoundations of streaming SQL: stream & table theory
Foundations of streaming SQL: stream & table theoryDataWorks Summit
 
HBaseCon 2013: Apache HBase Operations at Pinterest
HBaseCon 2013: Apache HBase Operations at PinterestHBaseCon 2013: Apache HBase Operations at Pinterest
HBaseCon 2013: Apache HBase Operations at PinterestCloudera, Inc.
 
Building Data Pipelines with SMACK: Designing Storage Strategies for Scale an...
Building Data Pipelines with SMACK: Designing Storage Strategies for Scale an...Building Data Pipelines with SMACK: Designing Storage Strategies for Scale an...
Building Data Pipelines with SMACK: Designing Storage Strategies for Scale an...DataStax
 

La actualidad más candente (20)

HBaseConAsia2018 Keynote1: Apache HBase Project Status
HBaseConAsia2018 Keynote1: Apache HBase Project StatusHBaseConAsia2018 Keynote1: Apache HBase Project Status
HBaseConAsia2018 Keynote1: Apache HBase Project Status
 
Tales from Taming the Long Tail
Tales from Taming the Long TailTales from Taming the Long Tail
Tales from Taming the Long Tail
 
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWSHBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
 
Imply at Apache Druid Meetup in London 1-15-20
Imply at Apache Druid Meetup in London 1-15-20Imply at Apache Druid Meetup in London 1-15-20
Imply at Apache Druid Meetup in London 1-15-20
 
HBaseCon 2013: ETL for Apache HBase
HBaseCon 2013: ETL for Apache HBaseHBaseCon 2013: ETL for Apache HBase
HBaseCon 2013: ETL for Apache HBase
 
Scaling HDFS at Xiaomi
Scaling HDFS at XiaomiScaling HDFS at Xiaomi
Scaling HDFS at Xiaomi
 
HBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and Cloud
HBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and CloudHBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and Cloud
HBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and Cloud
 
Cassandra Tuning - above and beyond
Cassandra Tuning - above and beyondCassandra Tuning - above and beyond
Cassandra Tuning - above and beyond
 
Voldemort on Solid State Drives
Voldemort on Solid State DrivesVoldemort on Solid State Drives
Voldemort on Solid State Drives
 
HBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBaseHBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBase
 
HBaseCon2017 Splice Machine as a Service: Multi-tenant HBase using DCOS (Meso...
HBaseCon2017 Splice Machine as a Service: Multi-tenant HBase using DCOS (Meso...HBaseCon2017 Splice Machine as a Service: Multi-tenant HBase using DCOS (Meso...
HBaseCon2017 Splice Machine as a Service: Multi-tenant HBase using DCOS (Meso...
 
HBaseConAsia2018: Track2-5: JanusGraph-Distributed graph database with HBase
HBaseConAsia2018: Track2-5: JanusGraph-Distributed graph database with HBaseHBaseConAsia2018: Track2-5: JanusGraph-Distributed graph database with HBase
HBaseConAsia2018: Track2-5: JanusGraph-Distributed graph database with HBase
 
HBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBase
HBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBaseHBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBase
HBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBase
 
Google mesa
Google mesaGoogle mesa
Google mesa
 
HBaseCon 2013: Near Real Time Indexing for eBay Search
HBaseCon 2013: Near Real Time Indexing for eBay SearchHBaseCon 2013: Near Real Time Indexing for eBay Search
HBaseCon 2013: Near Real Time Indexing for eBay Search
 
Hadoop Networking at Datasift
Hadoop Networking at DatasiftHadoop Networking at Datasift
Hadoop Networking at Datasift
 
Amazon RedShift - Ianni Vamvadelis
Amazon RedShift - Ianni VamvadelisAmazon RedShift - Ianni Vamvadelis
Amazon RedShift - Ianni Vamvadelis
 
Foundations of streaming SQL: stream & table theory
Foundations of streaming SQL: stream & table theoryFoundations of streaming SQL: stream & table theory
Foundations of streaming SQL: stream & table theory
 
HBaseCon 2013: Apache HBase Operations at Pinterest
HBaseCon 2013: Apache HBase Operations at PinterestHBaseCon 2013: Apache HBase Operations at Pinterest
HBaseCon 2013: Apache HBase Operations at Pinterest
 
Building Data Pipelines with SMACK: Designing Storage Strategies for Scale an...
Building Data Pipelines with SMACK: Designing Storage Strategies for Scale an...Building Data Pipelines with SMACK: Designing Storage Strategies for Scale an...
Building Data Pipelines with SMACK: Designing Storage Strategies for Scale an...
 

Destacado

Apache HBase - Just the Basics
Apache HBase - Just the BasicsApache HBase - Just the Basics
Apache HBase - Just the BasicsHBaseCon
 
Date-tiered Compaction Policy for Time-series Data
Date-tiered Compaction Policy for Time-series DataDate-tiered Compaction Policy for Time-series Data
Date-tiered Compaction Policy for Time-series DataHBaseCon
 
Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase HBaseCon
 
OpenTSDB 2.0
OpenTSDB 2.0OpenTSDB 2.0
OpenTSDB 2.0HBaseCon
 
HBaseCon 2015: HBase @ Flipboard
HBaseCon 2015: HBase @ FlipboardHBaseCon 2015: HBase @ Flipboard
HBaseCon 2015: HBase @ FlipboardHBaseCon
 
Breaking the Sound Barrier with Persistent Memory
Breaking the Sound Barrier with Persistent Memory Breaking the Sound Barrier with Persistent Memory
Breaking the Sound Barrier with Persistent Memory HBaseCon
 
HBaseCon 2015: HBase at Scale in an Online and High-Demand Environment
HBaseCon 2015: HBase at Scale in an Online and  High-Demand EnvironmentHBaseCon 2015: HBase at Scale in an Online and  High-Demand Environment
HBaseCon 2015: HBase at Scale in an Online and High-Demand EnvironmentHBaseCon
 
Apache HBase Improvements and Practices at Xiaomi
Apache HBase Improvements and Practices at XiaomiApache HBase Improvements and Practices at Xiaomi
Apache HBase Improvements and Practices at XiaomiHBaseCon
 
Keynote: The Future of Apache HBase
Keynote: The Future of Apache HBaseKeynote: The Future of Apache HBase
Keynote: The Future of Apache HBaseHBaseCon
 
Apache HBase, Accelerated: In-Memory Flush and Compaction
Apache HBase, Accelerated: In-Memory Flush and Compaction Apache HBase, Accelerated: In-Memory Flush and Compaction
Apache HBase, Accelerated: In-Memory Flush and Compaction HBaseCon
 
Keynote: Apache HBase at Yahoo! Scale
Keynote: Apache HBase at Yahoo! ScaleKeynote: Apache HBase at Yahoo! Scale
Keynote: Apache HBase at Yahoo! ScaleHBaseCon
 
Apache HBase at Airbnb
Apache HBase at Airbnb Apache HBase at Airbnb
Apache HBase at Airbnb HBaseCon
 
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBase
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBaseHBaseCon 2015: S2Graph - A Large-scale Graph Database with HBase
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBaseHBaseCon
 
Improvements to Apache HBase and Its Applications in Alibaba Search
Improvements to Apache HBase and Its Applications in Alibaba Search Improvements to Apache HBase and Its Applications in Alibaba Search
Improvements to Apache HBase and Its Applications in Alibaba Search HBaseCon
 
Apache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New FeaturesApache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New FeaturesHBaseCon
 
Optimizing Apache HBase for Cloud Storage in Microsoft Azure HDInsight
Optimizing Apache HBase for Cloud Storage in Microsoft Azure HDInsightOptimizing Apache HBase for Cloud Storage in Microsoft Azure HDInsight
Optimizing Apache HBase for Cloud Storage in Microsoft Azure HDInsightHBaseCon
 

Destacado (16)

Apache HBase - Just the Basics
Apache HBase - Just the BasicsApache HBase - Just the Basics
Apache HBase - Just the Basics
 
Date-tiered Compaction Policy for Time-series Data
Date-tiered Compaction Policy for Time-series DataDate-tiered Compaction Policy for Time-series Data
Date-tiered Compaction Policy for Time-series Data
 
Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase
 
OpenTSDB 2.0
OpenTSDB 2.0OpenTSDB 2.0
OpenTSDB 2.0
 
HBaseCon 2015: HBase @ Flipboard
HBaseCon 2015: HBase @ FlipboardHBaseCon 2015: HBase @ Flipboard
HBaseCon 2015: HBase @ Flipboard
 
Breaking the Sound Barrier with Persistent Memory
Breaking the Sound Barrier with Persistent Memory Breaking the Sound Barrier with Persistent Memory
Breaking the Sound Barrier with Persistent Memory
 
HBaseCon 2015: HBase at Scale in an Online and High-Demand Environment
HBaseCon 2015: HBase at Scale in an Online and  High-Demand EnvironmentHBaseCon 2015: HBase at Scale in an Online and  High-Demand Environment
HBaseCon 2015: HBase at Scale in an Online and High-Demand Environment
 
Apache HBase Improvements and Practices at Xiaomi
Apache HBase Improvements and Practices at XiaomiApache HBase Improvements and Practices at Xiaomi
Apache HBase Improvements and Practices at Xiaomi
 
Keynote: The Future of Apache HBase
Keynote: The Future of Apache HBaseKeynote: The Future of Apache HBase
Keynote: The Future of Apache HBase
 
Apache HBase, Accelerated: In-Memory Flush and Compaction
Apache HBase, Accelerated: In-Memory Flush and Compaction Apache HBase, Accelerated: In-Memory Flush and Compaction
Apache HBase, Accelerated: In-Memory Flush and Compaction
 
Keynote: Apache HBase at Yahoo! Scale
Keynote: Apache HBase at Yahoo! ScaleKeynote: Apache HBase at Yahoo! Scale
Keynote: Apache HBase at Yahoo! Scale
 
Apache HBase at Airbnb
Apache HBase at Airbnb Apache HBase at Airbnb
Apache HBase at Airbnb
 
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBase
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBaseHBaseCon 2015: S2Graph - A Large-scale Graph Database with HBase
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBase
 
Improvements to Apache HBase and Its Applications in Alibaba Search
Improvements to Apache HBase and Its Applications in Alibaba Search Improvements to Apache HBase and Its Applications in Alibaba Search
Improvements to Apache HBase and Its Applications in Alibaba Search
 
Apache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New FeaturesApache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New Features
 
Optimizing Apache HBase for Cloud Storage in Microsoft Azure HDInsight
Optimizing Apache HBase for Cloud Storage in Microsoft Azure HDInsightOptimizing Apache HBase for Cloud Storage in Microsoft Azure HDInsight
Optimizing Apache HBase for Cloud Storage in Microsoft Azure HDInsight
 

Similar a Argus Production Monitoring at Salesforce

MariaDB Paris Workshop 2023 - Performance Optimization
MariaDB Paris Workshop 2023 - Performance OptimizationMariaDB Paris Workshop 2023 - Performance Optimization
MariaDB Paris Workshop 2023 - Performance OptimizationMariaDB plc
 
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...NETWAYS
 
Aerospike Hybrid Memory Architecture
Aerospike Hybrid Memory ArchitectureAerospike Hybrid Memory Architecture
Aerospike Hybrid Memory ArchitectureAerospike, Inc.
 
Introducing TiDB [Delivered: 09/27/18 at NYC SQL Meetup]
Introducing TiDB [Delivered: 09/27/18 at NYC SQL Meetup]Introducing TiDB [Delivered: 09/27/18 at NYC SQL Meetup]
Introducing TiDB [Delivered: 09/27/18 at NYC SQL Meetup]Kevin Xu
 
SAS Institute on Changing All Four Tires While Driving an AdTech Engine at Fu...
SAS Institute on Changing All Four Tires While Driving an AdTech Engine at Fu...SAS Institute on Changing All Four Tires While Driving an AdTech Engine at Fu...
SAS Institute on Changing All Four Tires While Driving an AdTech Engine at Fu...ScyllaDB
 
Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase HBaseCon
 
OSMC 2023 | What’s new with Grafana Labs’s Open Source Observability stack by...
OSMC 2023 | What’s new with Grafana Labs’s Open Source Observability stack by...OSMC 2023 | What’s new with Grafana Labs’s Open Source Observability stack by...
OSMC 2023 | What’s new with Grafana Labs’s Open Source Observability stack by...NETWAYS
 
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...DataStax
 
FOSDEM 2019: M3, Prometheus and Graphite with metrics and monitoring in an in...
FOSDEM 2019: M3, Prometheus and Graphite with metrics and monitoring in an in...FOSDEM 2019: M3, Prometheus and Graphite with metrics and monitoring in an in...
FOSDEM 2019: M3, Prometheus and Graphite with metrics and monitoring in an in...Rob Skillington
 
OpenTSDB for monitoring @ Criteo
OpenTSDB for monitoring @ CriteoOpenTSDB for monitoring @ Criteo
OpenTSDB for monitoring @ CriteoNathaniel Braun
 
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1Ruslan Meshenberg
 
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevAltinity Ltd
 
Introducing the ultimate MariaDB cloud, SkySQL
Introducing the ultimate MariaDB cloud, SkySQLIntroducing the ultimate MariaDB cloud, SkySQL
Introducing the ultimate MariaDB cloud, SkySQLMariaDB plc
 
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...DataStax
 
Big data should be simple
Big data should be simpleBig data should be simple
Big data should be simpleDori Waldman
 
Les fonctionnalites mariadb
Les fonctionnalites mariadbLes fonctionnalites mariadb
Les fonctionnalites mariadblemugfr
 
When Apache Spark Meets TiDB with Xiaoyu Ma
When Apache Spark Meets TiDB with Xiaoyu MaWhen Apache Spark Meets TiDB with Xiaoyu Ma
When Apache Spark Meets TiDB with Xiaoyu MaDatabricks
 
PostgreSQL and Redis - talk at pgcon 2013
PostgreSQL and Redis - talk at pgcon 2013PostgreSQL and Redis - talk at pgcon 2013
PostgreSQL and Redis - talk at pgcon 2013Andrew Dunstan
 
What to expect from MariaDB Platform X5, part 1
What to expect from MariaDB Platform X5, part 1What to expect from MariaDB Platform X5, part 1
What to expect from MariaDB Platform X5, part 1MariaDB plc
 
Big data Argentina meetup 2020-09: Intro to presto on docker
Big data Argentina meetup 2020-09: Intro to presto on dockerBig data Argentina meetup 2020-09: Intro to presto on docker
Big data Argentina meetup 2020-09: Intro to presto on dockerFederico Palladoro
 

Similar a Argus Production Monitoring at Salesforce (20)

MariaDB Paris Workshop 2023 - Performance Optimization
MariaDB Paris Workshop 2023 - Performance OptimizationMariaDB Paris Workshop 2023 - Performance Optimization
MariaDB Paris Workshop 2023 - Performance Optimization
 
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
 
Aerospike Hybrid Memory Architecture
Aerospike Hybrid Memory ArchitectureAerospike Hybrid Memory Architecture
Aerospike Hybrid Memory Architecture
 
Introducing TiDB [Delivered: 09/27/18 at NYC SQL Meetup]
Introducing TiDB [Delivered: 09/27/18 at NYC SQL Meetup]Introducing TiDB [Delivered: 09/27/18 at NYC SQL Meetup]
Introducing TiDB [Delivered: 09/27/18 at NYC SQL Meetup]
 
SAS Institute on Changing All Four Tires While Driving an AdTech Engine at Fu...
SAS Institute on Changing All Four Tires While Driving an AdTech Engine at Fu...SAS Institute on Changing All Four Tires While Driving an AdTech Engine at Fu...
SAS Institute on Changing All Four Tires While Driving an AdTech Engine at Fu...
 
Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase
 
OSMC 2023 | What’s new with Grafana Labs’s Open Source Observability stack by...
OSMC 2023 | What’s new with Grafana Labs’s Open Source Observability stack by...OSMC 2023 | What’s new with Grafana Labs’s Open Source Observability stack by...
OSMC 2023 | What’s new with Grafana Labs’s Open Source Observability stack by...
 
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
 
FOSDEM 2019: M3, Prometheus and Graphite with metrics and monitoring in an in...
FOSDEM 2019: M3, Prometheus and Graphite with metrics and monitoring in an in...FOSDEM 2019: M3, Prometheus and Graphite with metrics and monitoring in an in...
FOSDEM 2019: M3, Prometheus and Graphite with metrics and monitoring in an in...
 
OpenTSDB for monitoring @ Criteo
OpenTSDB for monitoring @ CriteoOpenTSDB for monitoring @ Criteo
OpenTSDB for monitoring @ Criteo
 
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1
 
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
 
Introducing the ultimate MariaDB cloud, SkySQL
Introducing the ultimate MariaDB cloud, SkySQLIntroducing the ultimate MariaDB cloud, SkySQL
Introducing the ultimate MariaDB cloud, SkySQL
 
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
 
Big data should be simple
Big data should be simpleBig data should be simple
Big data should be simple
 
Les fonctionnalites mariadb
Les fonctionnalites mariadbLes fonctionnalites mariadb
Les fonctionnalites mariadb
 
When Apache Spark Meets TiDB with Xiaoyu Ma
When Apache Spark Meets TiDB with Xiaoyu MaWhen Apache Spark Meets TiDB with Xiaoyu Ma
When Apache Spark Meets TiDB with Xiaoyu Ma
 
PostgreSQL and Redis - talk at pgcon 2013
PostgreSQL and Redis - talk at pgcon 2013PostgreSQL and Redis - talk at pgcon 2013
PostgreSQL and Redis - talk at pgcon 2013
 
What to expect from MariaDB Platform X5, part 1
What to expect from MariaDB Platform X5, part 1What to expect from MariaDB Platform X5, part 1
What to expect from MariaDB Platform X5, part 1
 
Big data Argentina meetup 2020-09: Intro to presto on docker
Big data Argentina meetup 2020-09: Intro to presto on dockerBig data Argentina meetup 2020-09: Intro to presto on docker
Big data Argentina meetup 2020-09: Intro to presto on docker
 

Argus Production Monitoring at Salesforce

  • 1. Argus Production Monitoring At Salesforce Service Health & Observability at Scale Tom Valine Director, Infrastructure Engineering tvaline@salesforce.com in/tvaline Bhinav Sura Software Engineer, Infrastructure Engineering bhinav.sura@salesforce.com in/bhinavsura
  • 2. What is Argus? ● Time Series Data & Events ● Inbuilt Service Protection ● Alerting ● Flexible Dashboarding ● Full REST API ● High Throughput ● Low Latency ● Horizontally Scalable ● In Use By ○ Capacity Planning ○ Search ○ Feature Teams ○ Site Reliability ○ Customer Success
  • 3. But Why Another Monitoring System? ● Technology changes frequently! ● Insulate our customers ● Performance ● Trust ● Programmatic access for everything ● Multi-tenancy ● Correlation with non-timeseries data ● Highly dimensional
  • 4. I’ve seen this somewhere before... Metrics ● Transforms ● Namespace ● Scope ● Name ● Tags ● Aggregator ● Downsampler Events ● Namespace ● Scope ● Name ● Tags ● Type ● User SCALE(-2d:-1d:dva:argus:freemem{host=*}:min:1d-min, $1e-6) TRANSFORM START END NAMESPACE SCOPE METRIC TAGS AGG DS PARAMS -2d:-1d:dva:argus:release{host=*}:major:admin START END NAMESPACE SCOPE NAME TAGS TYPE USER
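As a rough illustration of the metric expression shown on this slide, the sketch below splits an expression into the fields the slide labels (START, END, NAMESPACE, SCOPE, METRIC, TAGS, AGG, DS). The field order is taken from the order in which the labels appear on the slide, which is an assumption; the regular expression and the `MetricExpression` and `parse_metric_expression` names are hypothetical and are not the Argus grammar or API.

```python
import re
from typing import Dict, NamedTuple, Optional


class MetricExpression(NamedTuple):
    """Hypothetical container for the fields labeled on the slide."""
    start: str
    end: str
    namespace: str
    scope: str
    metric: str
    tags: Dict[str, str]
    aggregator: str
    downsampler: Optional[str]


# Assumed field order: start:end:namespace:scope:metric{tags}:agg[:downsampler]
_PATTERN = re.compile(
    r"^(?P<start>[^:]+):(?P<end>[^:]+):(?P<namespace>[^:]+):(?P<scope>[^:]+):"
    r"(?P<metric>[^:{]+)(?:\{(?P<tags>[^}]*)\})?"
    r":(?P<agg>[^:]+)(?::(?P<ds>[^:]+))?$"
)


def parse_metric_expression(expr: str) -> MetricExpression:
    """Split an expression into its labeled parts, or raise ValueError."""
    match = _PATTERN.match(expr)
    if match is None:
        raise ValueError(f"unrecognized metric expression: {expr!r}")
    tags: Dict[str, str] = {}
    if match.group("tags"):
        # Tags are comma-separated key=value pairs, e.g. host=*
        for pair in match.group("tags").split(","):
            key, _, value = pair.partition("=")
            tags[key.strip()] = value.strip()
    return MetricExpression(
        start=match.group("start"),
        end=match.group("end"),
        namespace=match.group("namespace"),
        scope=match.group("scope"),
        metric=match.group("metric"),
        tags=tags,
        aggregator=match.group("agg"),
        downsampler=match.group("ds"),
    )


parsed = parse_metric_expression("-2d:-1d:dva:argus:freemem{host=*}:min:1d-min")
```

A transform such as SCALE(...) would wrap an expression like this one; handling transforms and their parameters is left out of the sketch.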
  • 5. ● First Class Data ● Decoupled from Time Series ● Multiple Events Per Timestamp ● Event Categories ● Identifiable per User ● Overlay on Any Time Series Events
  • 6. Alerting ● CRON Format ● Alert on Missing Data ● Single Ended & Range Comparisons ● Inertia ● Cooldown ● Multiple Triggers ● Multiple Notifications ○ Audit ○ Email ○ GOC++ ○ Salesforce Chatter ○ PagerDuty ● Event Backannotation
  • 7. Warden ● Policy Driven Suspension Mechanism ● Per User ● Application & Subsystem ● Progressively Punitive ● Indefinite Suspension Supported ● Customizable
  • 8. Dashboarding ● Maintaining dashboards is a horrible business to be in ● Empower the users, get out of their way ● Markup based ● Custom tags for visualization elements ● HTML for everything else
  • 9. REST ● API First ● All functionality exposed via services ● Decoupled UI ● Authenticated ○ Login ○ Do stuff ○ Logout ● Get out of User's Way! ○ Orchestra Client ○ ArgusPoke ○ Dashboard Creation Tool
  • 10. How does it work? METRICS ANNOTATION USER ENTITY ALERTS MAIL SCHEDULING MONITORING WEB SERVICES AUTH ORM MQ TSDB WEB UI CUSTOM APPS OTHER CLIENTS DASHBOARD MANAGEMENT WARDEN NAMESPACE SCHEMA WILDCARDING CACHING INTERLOCK
  • 11. Okay, but how does it REALLY work? [Diagram: repeated node stacks labeled UI / WS / CORE and CL / CORE, connected through a MESSAGE BUS to HBASE/TSDB/RDBMS/CACHING]
  • 12. Cool, how will it evolve going forward? [Diagram: repeated node stacks labeled UI / WS / CORE and CL / CORE, spread across multiple MESSAGE BUS instances and multiple HBASE/TSDB/RDBMS/CACHE instances, with ROUTE/FORK/JOIN+M/R layers between them]
  • 13. Alert Evaluation Data Flows
    Message Queue:
      1. Scheduling Service updates alert schedule every 10 minutes.
      2. Scheduler submits scheduled jobs to queue.
      3. Minimum interval of 1 minute.
    Alert Client:
      1. Dequeues from alert queue.
      2. Query ranges adjusted for scheduling latency.
      3. Triggers evaluated.
      4. Notifications sent.
      5. Cooldowns updated.
    [Diagram: ALERT DATA STORE, SCHEDULING SERVICE, ALERT CACHE, ARGUS WS, ALERT 8713 ... ALERT 4141 ALERT 9810]
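The Alert Client steps above can be sketched roughly as follows. This is a minimal illustration, not the Argus implementation: the `AlertJob` fields, the single greater-than trigger, and the reading of "adjusted for scheduling latency" as "anchor the query window on the scheduled time rather than the dequeue time" are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class AlertJob:
    """Hypothetical job record as it might sit on the alert queue."""
    alert_id: int
    scheduled_at: int       # epoch seconds when the job was meant to run
    window_seconds: int     # how far back the alert expression looks
    threshold: float        # single greater-than trigger, for illustration
    cooldown_seconds: int
    cooldown_until: int = 0


def query_range(job: AlertJob) -> Tuple[int, int]:
    """Anchor the window on the scheduled time so queue delay does not shift it."""
    end = job.scheduled_at
    return end - job.window_seconds, end


def evaluate(job: AlertJob, datapoints: Dict[int, float], now: int) -> bool:
    """Return True when a notification should be sent at time `now`."""
    start, end = query_range(job)
    in_window = [value for ts, value in datapoints.items() if start <= ts <= end]
    fired = any(value > job.threshold for value in in_window)
    if fired and now >= job.cooldown_until:
        # Notification goes out; start the cooldown so repeats are suppressed.
        job.cooldown_until = now + job.cooldown_seconds
        return True
    return False


job = AlertJob(alert_id=8713, scheduled_at=1000, window_seconds=300,
               threshold=90.0, cooldown_seconds=600)
points = {800: 95.0, 1020: 99.0}   # the second point arrived after the scheduled time
first = evaluate(job, points, now=1030)
second = evaluate(job, points, now=1090)
```

In this sketch the job is dequeued 30 seconds late, yet the window still covers 700 to 1000, and the second evaluation is suppressed because the cooldown has not elapsed.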
  • 14. Metric & Event Data Flows
    Message Queue:
      1. Writes are asynchronous with a high degree of parallelism.
      2. Queue used as a shock absorber. Tolerant to lower level failures/downtime.
      3. Kafka for scalability. One topic each for metrics and annotations. Number of partitions on the order of 100s.
    ArgusMetricsQueue:
      1. Consumed by 2 types of clients: MetricCommit and SchemaCommit.
      2. MetricCommit client commits the actual time series data to persistent storage (using OTSDB or Phoenix).
      3. SchemaCommit client only uses the metric metadata to create metric schema records and commits them to HBase (using AsyncHBase).
    [Diagram: TIMESERIES STORE ARGUS WS METRIC ... METRIC METRIC METRIC SERVICE SCHEMA STORE]
  • 15. TSDB Service Implementation - OpenTSDB
    ● Uses HBase underneath
    ● RowKey: <metric_uid><timestamp><tagk1><tagv1>[...<tagkn><tagvn>]
    ● Stores actual time series values on hourly boundaries (all values within an hour are stored in the same cell)
    ● Pros:
      ○ Extremely fast when you query using the complete metric name.
      ○ 5M datapoints/min write throughput per write daemon.
    ● Cons:
      ○ Tag Cardinality - Total number of tags per metric is limited to 8.
      ○ Tag Cardinality - As the product of tag values across all tag keys increases, performance decreases drastically.
      ○ UID Exhaustion - 16M UIDs each for metric, tagk and tagv names by default. Once these are exhausted, no new metrics, tagk or tagv can be created.
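To make the row key layout and the 16M UID limit concrete, here is a small sketch that assembles a key in the shape given on the slide. It assumes OpenTSDB's default widths (3-byte UIDs and a 4-byte base timestamp rounded down to the hour) and ignores optional features such as key salting; the function names are illustrative and the UID values are made up.

```python
import struct
from typing import Dict

UID_WIDTH = 3                       # bytes per UID (assumed OpenTSDB default)
MAX_UIDS = 2 ** (8 * UID_WIDTH)     # 16,777,216: the "16M" limit on the slide
HOUR = 3600


def uid_bytes(uid: int) -> bytes:
    """Encode a UID as fixed-width big-endian bytes, rejecting exhausted ranges."""
    if not 0 <= uid < MAX_UIDS:
        raise ValueError(f"UID {uid} does not fit in {UID_WIDTH} bytes")
    return uid.to_bytes(UID_WIDTH, "big")


def row_key(metric_uid: int, timestamp: int, tag_uids: Dict[int, int]) -> bytes:
    """Build <metric_uid><base_timestamp><tagk1><tagv1>... for one datapoint."""
    base = timestamp - timestamp % HOUR          # all points in an hour share a row
    key = uid_bytes(metric_uid) + struct.pack(">I", base)
    for tagk, tagv in sorted(tag_uids.items()):  # tag pairs ordered by tag key UID
        key += uid_bytes(tagk) + uid_bytes(tagv)
    return key


key_a = row_key(1, 1500000123, {2: 5})
key_b = row_key(1, 1500000124, {2: 5})   # one second later, same hour
key_c = row_key(1, 1500001200, {2: 5})   # start of the next hour
```

Because the metric UID leads the key, a query that names the full metric maps to a contiguous range of rows, which matches the "extremely fast" case listed under Pros. Each additional tag pair adds six bytes to every key under these widths.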
  • 16. TSDB Service Implementation - Phoenix
    ● Uses HBase underneath
    ● RowKey: <metric_uid><timestamp><tagv1>[...<tagvn>]
    ● Metric modeled as Phoenix VIEW
      ○ Schema is introspectable and managed outside of data
      ○ Supports secondary indexes on value and/or tag(s)
    ● Parallelizes query and pushes computation to server
      ○ Server-side aggregation conserves network bandwidth
      ○ Allows SKIP_SCAN filter optimization for minimizing data scanned
      ○ Leverages ROW_TIMESTAMP optimization for filtering HFiles
    ● Performance on par with or better than OpenTSDB
    ● Ad hoc SQL query capability
      ○ Join against other Phoenix tables
    ● Longer term, leverage Drillix (Phoenix + Drill)
      ○ Cross cluster queries
      ○ Joins to other non-HBase data sources
  • 17. Schema Service Motivation
    ● Discover Metrics
      ○ Which metrics exist within a scope?
      ○ For a given <scope, metric> combination, which tags exist?
      ○ Given a metric, which scopes contain this metric?
      ○ What are all the tag values that exist for a given tag key?
    ● Support Wildcard Queries
      ○ Non-wildcard query
        ■ -1h:system.myDatacenter.myPod:Cpu.perc:avg:1m-avg
      ○ Wildcard query
        ■ -1h:system.myDatacenter.*:Cpu.perc:avg:1m-avg
        ■ -1h:system.myDatacenter.myPod:Cpu*:avg:1m-avg
        ■ -1h:system.myDatacenter.myPod:Cpu.perc{device=*app*}:avg:1m-avg
  • 18. Schema Service Implementation
    ● AsyncHBase Schema Service:
      ○ Uses HBase underneath
      ○ SchemaRecord: namespace, scope, metricname, tagk, tagv. No data points.
      ○ Each record indexed in 2 ways in 2 different tables.
      ○ MetricIndexed schema table:
        ■ RowKey: <metricname><scope><namespace><tagk><tagv>
      ○ ScopeIndexed schema table:
        ■ RowKey: <scope><metricname><namespace><tagk><tagv>
      ○ Decide what table to use based on the type of query.
      ○ Pros:
        ■ Efficient retrieval of schema records for most types of queries
      ○ Cons:
        ■ Storage duplication
    ● DiscoveryService:
      ○ Uses SchemaService internally
      ○ Ability to filter records by type
        ■ E.g. filter all unique scopes that match *myScope*
      ○ Expand wildcard query and return a collection of non-wildcard queries
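The wildcard expansion and the choice between the two index tables might look roughly like the sketch below. It is only an illustration under stated assumptions: schema records are reduced to in-memory (scope, metric) pairs, matching uses Python's `fnmatchcase` (which also treats `?` and `[...]` as special, unlike the `*`-only examples on the slides), and the `pick_table` heuristic is a guess at what "based on the type of query" could mean. None of the names come from Argus.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Tuple


def pick_table(scope: str, metric: str) -> str:
    """Prefer the index whose leading row-key component is a literal (assumed rule)."""
    if "*" not in scope:
        return "ScopeIndexed"     # row key starts with <scope>
    if "*" not in metric:
        return "MetricIndexed"    # row key starts with <metricname>
    return "ScopeIndexed"         # both wildcarded: arbitrary default


def expand_wildcard(start: str, scope: str, metric: str, suffix: str,
                    records: Iterable[Tuple[str, str]]) -> List[str]:
    """Turn one wildcard query into the sorted list of concrete queries it covers."""
    matches = sorted({
        (s, m) for s, m in records
        if fnmatchcase(s, scope) and fnmatchcase(m, metric)
    })
    return [f"{start}:{s}:{m}:{suffix}" for s, m in matches]


# Hypothetical schema records: (scope, metricname) pairs with no data points.
RECORDS = [
    ("system.myDatacenter.myPod", "Cpu.perc"),
    ("system.myDatacenter.otherPod", "Cpu.perc"),
    ("system.otherDc.myPod", "Cpu.perc"),
    ("system.myDatacenter.myPod", "Mem.free"),
]

by_scope = expand_wildcard("-1h", "system.myDatacenter.*", "Cpu.perc",
                           "avg:1m-avg", RECORDS)
by_metric = expand_wildcard("-1h", "system.myDatacenter.myPod", "Cpu*",
                            "avg:1m-avg", RECORDS)
```

A real lookup would scan the chosen HBase table by row-key prefix rather than filter a list in memory; the point here is only that each wildcard query resolves to a set of plain queries before it reaches the TSDB.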
  • 19. Caching
    ● CachedTSDB Service:
      ○ Uses RedisCache service and the configured TSDBService implementation (OpenTSDB or PhoenixTSDB)
      ○ Query Level Caching (caches synthetic data)
      ○ Caches data spanning a window of more than last 24 hours.
      ○ Data is cached by fracturing it on day boundary.
        ■ E.g. a query spanning 5 days is stored using 5 keys on the cache.
      ○ Support for partial hits
      ○ Cache expiry time of an hour (can be increased by running a separate Cache update process)
    ● CachedDiscovery Service:
      ○ Uses RedisCache service and the configured DiscoveryService implementation
      ○ Cache queries already expanded
      ○ Cache expiry time of a day
  • 20. Developed By ● Anand Subramanian ● Bhinav Sura ● Tom Valine ● Jigna Bhatt ● Ruofan Zhang ● Dilip Devaraj ● Raj Sarkapally ● Kiran Gowdru More Information https://github.com/SalesforceEng/Argus