"Monitoring as Software Validation"
Measure anything,
Measure everything

Serena Lorenzini
serena@biodec.com

Incontro DevOps Italia
Bologna, 21 Feb. 2014
Monitoring:
If it moves... you can track it!
Monitor everything
● Network
● Machine
● Application

Why?
● Learn from your infrastructure
● Anticipate failure
● Speed up changes
Metrics and Events
Metric: Time + Name + Value
Event: Time + Name

It can be anything
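
The two definitions above can be sketched as plain records. This is only an illustration: the class names, field names and sample values are assumptions, not part of Graphite or StatsD.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Something that happened: Time + Name."""
    timestamp: float  # seconds since the Unix epoch
    name: str


@dataclass(frozen=True)
class Metric:
    """Something that was measured: Time + Name + Value."""
    timestamp: float
    name: str
    value: float


# Hypothetical samples: a deploy is an event, a CPU reading is a metric.
event = Event(timestamp=1392940800.0, name="deploy.finished")
metric = Metric(timestamp=1392940800.0, name="servers.web01.cpu", value=0.42)
```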
Graphite

An all-in-one solution for
storing and visualizing real-time
time-series data

Key features:
Efficient storage and ultra-fast retrieval.
Easy!!
http://graphite.wikidot.com/
Graphite components

Graphite Web
The front-end of Graphite. It provides a dashboard for retrieval and visualization of our metrics and a powerful plotting API.

Carbon
The core of Graphite. Carbon listens for incoming data, aggregates it and tries to store it on disk as quickly as possible using Whisper.

Whisper
The data storage. An efficient time-series database.
Organization of your data
Everything in Graphite has a path with components delimited by dots.
Paths reflect the organization of the data:
servers.hostname.metric
applications.appname.metric
Pushing in your data:
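
One way to push a value is Carbon's plaintext protocol: one line per data point, made of the metric path, the value and a Unix timestamp, sent to the plaintext listener (port 2003 by default). The sketch below assumes a Carbon instance on localhost; the metric path and helper names are made up for illustration.

```python
import socket
import time


def format_plaintext(path, value, timestamp):
    """Build one plaintext-protocol line: path, value, timestamp, newline."""
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(path, value, host="localhost", port=2003, timestamp=None):
    """Open a TCP connection to Carbon and send a single data point."""
    timestamp = time.time() if timestamp is None else timestamp
    line = format_plaintext(path, value, timestamp)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


# Formatting only; send_metric(...) needs a reachable Carbon daemon.
line = format_plaintext("servers.web01.load", 0.75, 1392940800)
```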
Carbon configuration (and limitations)
Carbon listens for data (1) and aggregates it (2).
Both behaviors are set by changing the appropriate variables in the configuration files.
1) How often will your data be collected? Each metric needs a retention policy: for a timespan X, I want to store my data at intervals of Y (seconds, minutes, hours, days, and so on).
What happens if I send two values for the same metric within the same interval? Carbon retains only the last one!
2) How do your metrics aggregate? Carbon needs specific keywords that name the function used to aggregate the data (e.g., "min", "max", "sum", ...).
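
The two behaviors can be imitated with a toy model. This is not Carbon's code, only a sketch of the idea: retentions are written as precision:duration (for example 10s:6h, the form used in Graphite's retention settings), values falling in the same interval overwrite each other, and a keyword selects the aggregation function. All function names here are assumptions.

```python
# Seconds per unit letter, as used in retention definitions such as "10s:6h".
UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def parse_retention(definition):
    """Turn 'precision:duration' into (seconds per point, number of points)."""
    precision, duration = definition.split(":")
    step = int(precision[:-1]) * UNITS[precision[-1]]
    span = int(duration[:-1]) * UNITS[duration[-1]]
    return step, span // step


def store(points, step):
    """Bucket (timestamp, value) pairs by interval; the last write wins."""
    slots = {}
    for timestamp, value in points:
        slots[timestamp - timestamp % step] = value
    return slots


def aggregate(values, method):
    """Apply the aggregation function named by a keyword."""
    functions = {
        "min": min,
        "max": max,
        "sum": sum,
        "average": lambda v: sum(v) / len(v),
    }
    return functions[method](values)


step, count = parse_retention("10s:6h")
# 100 and 103 fall in the same 10-second interval, so only 5.0 survives.
slots = store([(100, 1.0), (103, 5.0), (110, 2.0)], step)
```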
Fast and flexible monitoring: StatsD
StatsD
Front-end application for Graphite (by Etsy)
Buffers metrics locally
Aggregates the data for us
Periodically flushes data to Graphite
Client libraries available in many languages
Send any metric you like

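import statsd

# Address of the StatsD daemon and a prefix prepended to every metric name
HOST = 'hostname.server.com'
PORT = 8181
PREFIX = 'myprefix'

def initialize_client(host, port, prefix):
    """Create a StatsD client bound to the given daemon and prefix."""
    return statsd.StatsClient(host, port, prefix)

def send_data(data_name, value, client):
    """Report the current value of a gauge."""
    client.gauge(data_name, value)

client = initialize_client(HOST, PORT, PREFIX)
# ... application code ...
send_data('Energy', 1000, client)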

https://github.com/etsy/statsd/
Data Types in StatsD
Graphite usually stores the most recent data at 1-minute resolution, so when you look at a graph you are typically seeing, for each stat, the average value over that minute.

Type      Definition                     Example
Counters  Per-second rates               Page views
Timers    Event duration                 Page latency
Gauges    Values                         How many views do you have
Sets      Unique values passed to a key  Number of registered users accessing your website
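
On the wire, each of the four types is a short text line of the form name:value|type, sent to the StatsD daemon over UDP (port 8125 by default). A minimal sketch of that format, with hypothetical metric names and helper functions:

```python
import socket

# Type suffixes used by the StatsD line format.
TYPE_CODES = {"counter": "c", "timer": "ms", "gauge": "g", "set": "s"}


def format_stat(name, value, kind):
    """Build one StatsD line, e.g. 'page.views:1|c'."""
    return f"{name}:{value}|{TYPE_CODES[kind]}"


def send_stat(line, host="localhost", port=8125):
    """Fire-and-forget: send one line to the StatsD daemon over UDP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(line.encode("ascii"), (host, port))
    finally:
        sock.close()


lines = [
    format_stat("page.views", 1, "counter"),          # one more page view
    format_stat("page.latency", 320, "timer"),        # duration in milliseconds
    format_stat("queue.size", 42, "gauge"),           # current value
    format_stat("users.active", "user-1234", "set"),  # counted once per flush
]
```

Client libraries such as the one used earlier wrap this format, so application code rarely builds the lines by hand.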
Fast and flexible monitoring: CollectD
CollectD
A unix daemon that gathers system statistics
Plugin to send metrics to Carbon
Very useful for system metrics
Application-level statistics: StatsD
e.g. the number of times a function is called

System-level statistics: CollectD
e.g. the memory usage

We can combine them in a dashboard!
Case study:
“Company A”
A project that was not test-friendly...
...the design phase was almost skipped!
We were asked to translate an existing (Matlab!) application into Python.
Metrics Driven Development!
Task: exploring a space of solutions to find the best one
Method: simulated annealing (probability vs. random number)
Metrics Driven Development!
Track the evolution of the process instead of parsing a (boring) log file, in order to (1) correlate the consequences of having P(x) > random number and (2) visually inspect how the P(x) values change in real time during the simulation.
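
A minimal sketch of the idea, assuming a textbook simulated annealing loop rather than the actual project code: at each step the acceptance probability P is compared with a random number, and both P and the outcome are handed to a record callback that could forward them to StatsD. The function, parameter and metric names are hypothetical.

```python
import math
import random


def anneal(cost, neighbour, start, temperature=1.0, cooling=0.95, steps=200,
           record=lambda name, value: None, rng=None):
    """Minimise cost() by simulated annealing, reporting P at every step."""
    rng = rng or random.Random()
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    for _ in range(steps):
        candidate = neighbour(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Improvements are always accepted; worse moves with probability P.
        p = 1.0 if delta <= 0 else math.exp(-delta / temperature)
        accepted = p > rng.random()
        record("annealing.acceptance_probability", p)
        record("annealing.accepted", int(accepted))
        if accepted:
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temperature *= cooling  # cool down so worse moves become rarer
    return best, best_cost


# Collect the metrics in a list here; a StatsD gauge call would go in its place.
samples = []
best, best_cost = anneal(
    cost=lambda x: (x - 3) ** 2,
    neighbour=lambda x, rng: x + rng.uniform(-1, 1),
    start=0.0,
    record=lambda name, value: samples.append((name, value)),
    rng=random.Random(0),
)
```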
Case study:
“Company B”
A project where multiple applications have to interact in order to manage the processing of a huge number of pictures every day
Monitor to...
1) see the asynchronous activation of the applications
2) identify a regular pattern
3) CHECK FOR CHANGES IN THAT PATTERN!
Monitor your system (CPU, RAM...) and applications together to see if the hardware suits their requirements or not.
E.g. picture upload time vs. packets received/transmitted vs. memory free/used, and so on...
How is the application behaving?
Database queries per second?
Async tasks currently in queue?
Images resized and stored?
Error and warning rates?
These applications are running on several hosts, and their metrics all end up in the same place.
You can monitor many different servers by looking at the same dashboard.
Testing and Monitoring
"Measure twice, cut once"
"Cut it quickly into several pieces and see which fits best (now!)"

You can do both!
Testing: done just once, during development
Monitoring: keeps working once the application is released
Tests are logical properties of our application. Metrics are not. But metrics let you see what is going on once the application or system is in production.

Failure is inevitable (rather than "not accepted") and detectable!
Monitoring between Dev and Ops
● Provides information
● Frequent communication
● Some shared decision making
Free!
Wait... I don't like the Graphite web interface!
No problem!
The world of interfaces is continuously evolving
About 56,100 results
You can't optimize what you can't measure
so monitor and...

Optimize anything,
Optimize everything
Thank you for your attention!

 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 

Monitoring as Software Validation

  • 1. "Monitoring as Software Validation" Measure anything, Measure everything Serena Lorenzini serena@biodec.com Incontro DevOps Italia Bologna, 21 Feb. 2014
  • 2. Monitoring: If it moves... you can track it! Monitor everything: network, machine, application. Why? ● Learn from your infrastructure ● Anticipate failure ● Speed up changes
  • 3. Metrics and Events. Metric: Time + Name + Value. Event: Time + Name. It can be anything.
  • 5. Graphite: an all-in-one solution for storing and visualizing real-time time-series data. Key features: efficient storage and ultra-fast retrieval. Easy!! http://graphite.wikidot.com/
  • 6. Graphite components. Graphite Web: the front-end of Graphite. It provides a dashboard for retrieving and visualizing our metrics and a powerful plotting API. Carbon: the core of Graphite. Carbon listens for data in a specific format, aggregates it, and tries to store it on disk as quickly as possible using Whisper. Whisper: the data storage, an efficient time-series database.
  • 7. Organization of your data. Everything in Graphite has a path with components delimited by dots. Paths reflect the organization of the data: servers.hostname.metric, applications.appname.metric
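The dotted-path convention from slide 7 can be sketched in Python. The helper name `metric_path` and the rule of replacing unusual characters (including dots inside a component, which would otherwise create extra path levels) with underscores are assumptions made for illustration; they are not part of Graphite.

```python
import re

def metric_path(*components):
    """Join components into a dotted Graphite-style path.

    Assumption: any character other than letters, digits, '_' and '-'
    inside a component is replaced with '_', so a hostname such as
    'web01.example.com' stays a single path level.
    """
    cleaned = []
    for part in components:
        part = re.sub(r"[^A-Za-z0-9_-]", "_", str(part).strip())
        if not part:
            raise ValueError("empty path component")
        cleaned.append(part)
    return ".".join(cleaned)

# Hypothetical host and application names.
server_metric = metric_path("servers", "web01.example.com", "load")
app_metric = metric_path("applications", "resizer", "images_stored")
print(server_metric)
print(app_metric)
```

Keeping the hierarchy consistent (servers first, then host, then metric) is what makes the metrics easy to browse and group in a dashboard later.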
  • 8. Pushing in your data: Carbon configuration (and limitations). Carbon listens for data (1) and aggregates it (2). Both behaviors are set by changing the appropriate variables in the configuration files (storage-schemas.conf and storage-aggregation.conf). 1) How often will your data be collected? The retention must be set explicitly: for a timespan X, I want to store my data at intervals of Y (seconds/hours/days/months). What happens if I send two values for the same metric within the same interval? Carbon retains only the last one! 2) How do your metrics aggregate? Aggregation functions are selected with specific keywords (e.g., “min”, “max”, “sum”...).
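The retention behavior on slide 8 can be illustrated with a small simulation. Carbon's plaintext protocol accepts lines of the form `path value timestamp`. Everything else here is a sketch: `parse_retention` handles only the `s`, `m`, `h` and `d` units of a `step:span` retention string, and `RetentionLevel` is a hypothetical model of one retention level, not Carbon or Whisper code.

```python
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def to_seconds(text):
    """Convert '10s', '1m', '6h' or '7d' to seconds (sketch: four units only)."""
    return int(text[:-1]) * UNIT_SECONDS[text[-1]]

def parse_retention(spec):
    """Turn 'step:span' (e.g. '1m:7d') into (step_seconds, number_of_points)."""
    step_text, span_text = spec.split(":")
    step = to_seconds(step_text)
    return step, to_seconds(span_text) // step

def plaintext_line(path, value, timestamp):
    """Format one datapoint as a Carbon plaintext-protocol line."""
    return f"{path} {value} {int(timestamp)}\n"

class RetentionLevel:
    """Hypothetical model: one stored value per `step` seconds."""

    def __init__(self, step):
        self.step = step
        self.slots = {}

    def write(self, timestamp, value):
        # Align the timestamp to the start of its interval; a later write
        # to the same interval overwrites the earlier one.
        slot = int(timestamp) - int(timestamp) % self.step
        self.slots[slot] = value

step, points = parse_retention("1m:7d")
level = RetentionLevel(step)
level.write(1200, 10.0)
level.write(1230, 42.0)   # same 60 s interval: replaces 10.0
level.write(1265, 7.0)    # next interval
line = plaintext_line("servers.hostname.metric", 42.0, 1230)
print(step, points, level.slots)
```

This is why sending faster than the finest retention step loses data unless something (such as StatsD) aggregates the values first.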
  • 9. Fast and flexible monitoring: StatsD. StatsD is a front-end application for Graphite (by Etsy). It buffers metrics locally, aggregates the data for us, and periodically flushes the data to Graphite. Client libraries are available in many languages. Send any metric you like:

        import statsd

        HOST = 'hostname.server.com'
        PORT = 8181
        PREFIX = 'myprefix'

        def initialize_client(host, port, prefix):
            # Create a StatsD client that prepends `prefix` to every metric name.
            client = statsd.StatsClient(host, port, prefix)
            return client

        def send_data(data_name, value, client):
            # Report the current value of a gauge.
            client.gauge(data_name, value)

        client = initialize_client(HOST, PORT, PREFIX)
        # ...CODE...
        send_data('Energy', 1000, client)

    https://github.com/etsy/statsd/
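To make "buffers, aggregates, flushes" concrete, here is a toy model of that cycle. It is not StatsD's implementation: `ToyAggregator` is a hypothetical class, the 10-second flush interval is an assumption for the example, and a real daemon would flush on a timer and forward the result to Graphite.

```python
from collections import defaultdict

class ToyAggregator:
    """Toy stand-in for the buffer/aggregate/flush cycle (not StatsD itself)."""

    def __init__(self, flush_interval=10.0):
        self.flush_interval = flush_interval  # seconds between flushes (assumed)
        self.counters = defaultdict(float)
        self.gauges = {}

    def incr(self, name, count=1):
        # Counters accumulate between flushes.
        self.counters[name] += count

    def gauge(self, name, value):
        # Gauges keep only the most recent value.
        self.gauges[name] = value

    def flush(self):
        # Counters are reported as per-second rates, then reset;
        # gauges are reported as-is and kept.
        out = {name: total / self.flush_interval
               for name, total in self.counters.items()}
        out.update(self.gauges)
        self.counters.clear()
        return out

agg = ToyAggregator(flush_interval=10.0)
for _ in range(50):
    agg.incr("page_views")
agg.gauge("Energy", 1000)
first_flush = agg.flush()
second_flush = agg.flush()
print(first_flush, second_flush)
```

Fifty increments in a 10-second window become a rate of 5 per second, which is the kind of value Graphite then stores once per interval.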
  • 10. Data Types in StatsD. Graphite usually stores the most recent data at a 1-minute averaged resolution, so when you’re looking at a graph, for each stat you are typically seeing the average value over that minute.
        Type: Counters. Definition: per-second rates. Example: page views.
        Type: Timers. Definition: event duration. Example: page latency.
        Type: Gauges. Definition: values. Example: how many views you have.
        Type: Sets. Definition: unique values passed to a key. Example: number of registered users accessing your website.
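On the wire, each of the four types is a short text datagram of the form `name:value|type`, with type codes `c` (counter), `ms` (timer), `g` (gauge) and `s` (set). The sketch below only formats such lines. The metric names are made up, and `send_udp` is an illustrative helper whose default port 8125 is StatsD's usual UDP port (the slide's own example uses 8181).

```python
import socket

TYPE_CODES = {"counter": "c", "timer": "ms", "gauge": "g", "set": "s"}

def statsd_line(name, value, kind):
    """Format one metric in the StatsD text format: name:value|type."""
    return f"{name}:{value}|{TYPE_CODES[kind]}"

def send_udp(line, host="127.0.0.1", port=8125):
    """Fire-and-forget send; UDP gives no delivery confirmation."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))

# Hypothetical metric names, one per data type.
lines = [
    statsd_line("myprefix.page_views", 1, "counter"),
    statsd_line("myprefix.page_latency", 320, "timer"),   # milliseconds
    statsd_line("myprefix.Energy", 1000, "gauge"),
    statsd_line("myprefix.active_users", "user42", "set"),
]
for line in lines:
    print(line)
```

Because the transport is UDP, instrumented code does not block or fail when the StatsD daemon is down, which is a large part of why this style of monitoring is cheap to add.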
  • 11. Fast and flexible monitoring: CollectD. CollectD is a Unix daemon that gathers system statistics, with a plugin to send metrics to Carbon. It is very useful for system metrics. Application-level statistics: StatsD (e.g., the number of times a function is called). System-level statistics: CollectD (e.g., the memory usage). We can combine them in a dashboard!
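The application-level example on slide 11, counting how many times a function is called, could look roughly like this. The decorator name `count_calls`, the metric name and the `resize` function are hypothetical. An in-memory `Counter` stands in for a StatsD client so that the sketch runs without a server.

```python
import functools
from collections import Counter

call_counts = Counter()  # stand-in for a StatsD client (assumption)

def count_calls(metric_name):
    """Decorator that increments a counter each time the function runs."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # With a real client this would be: client.incr(metric_name)
            call_counts[metric_name] += 1
            return func(*args, **kwargs)
        return wrapper
    return decorator

@count_calls("resize.calls")
def resize(width, height):
    """Hypothetical application function."""
    return (width // 2, height // 2)

for _ in range(3):
    resize(800, 600)
print(call_counts["resize.calls"])
```

System-level figures such as memory usage need no code of this kind: CollectD gathers them outside the application.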
  • 12. Case study: “Company A”. A project that was not testing-friendly... The design phase was almost skipped! We were asked to translate an existing (Matlab!) application into Python. Metrics Driven Development!
  • 13. Case study: “Company A”. Task: exploring a space of solutions to find the best one. Method: simulated annealing. (Plot: probability vs. random number.) Metrics Driven Development! Track the evolution of the process, instead of parsing a (boring) log file, to (1) correlate the consequences of having P(x) > random number and (2) visually inspect the real-time changes of the P(x) values during the simulation.
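A minimal sketch of what instrumenting such a simulation might look like. The slides do not show the actual code, so everything here is an assumption: the acceptance probability uses the standard Metropolis rule `exp(-delta / T)`, the cost function is a toy parabola, and `record` appends to a list where a real setup would call a StatsD gauge.

```python
import math
import random

def anneal(cost, neighbor, x0, t0=1.0, cooling=0.95, steps=200,
           record=None, seed=0):
    """Simulated annealing that reports P(x) and acceptance at every step."""
    rng = random.Random(seed)
    x, fx = x0, cost(x0)
    best, fbest = x, fx
    temperature = t0
    for _ in range(steps):
        candidate = neighbor(x, rng)
        fc = cost(candidate)
        delta = fc - fx
        # Metropolis rule (assumed): always accept improvements, accept
        # worse candidates with probability exp(-delta / T).
        p = 1.0 if delta <= 0 else math.exp(-delta / temperature)
        accepted = p > rng.random()
        if accepted:
            x, fx = candidate, fc
            if fx < fbest:
                best, fbest = x, fx
        if record is not None:
            record("anneal.p", p)
            record("anneal.accepted", int(accepted))
            record("anneal.cost", fx)
        temperature *= cooling
    return best, fbest

metrics = []

def record(name, value):
    # Stand-in for something like client.gauge(name, value).
    metrics.append((name, value))

def cost(x):
    return (x - 3.0) ** 2

def neighbor(x, rng):
    return x + rng.uniform(-1.0, 1.0)

best, best_cost = anneal(cost, neighbor, x0=10.0, record=record)
print(len(metrics), best_cost <= cost(10.0))
```

Plotting `anneal.p` next to `anneal.accepted` gives the two views the slide asks for: how often P(x) beats the random number, and how P(x) evolves as the temperature drops.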
  • 14. Case study: “Company B”. A project where multiple applications have to interact in order to manage the processing of a huge number of pictures every day.
  • 15. Case study: “Company B”. Monitor to... 1) see the asynchronous activation of the applications, 2) identify a regular pattern, 3) CHECK FOR CHANGES IN THAT PATTERN! Monitor your system (CPU, RAM...) and applications together to see whether the hardware suits their requirements.
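One simple way to "check for changes in that pattern" is to compare each new value against a baseline window. The slides do not say how the check was done; the mean plus or minus `k` standard deviations rule below, the function name `deviates` and the sample numbers are all assumptions chosen for illustration.

```python
import statistics

def deviates(baseline, value, k=3.0):
    """Return True if `value` lies more than k standard deviations
    from the mean of the baseline window (illustrative rule only)."""
    mean = statistics.fmean(baseline)
    spread = statistics.pstdev(baseline)
    if spread == 0:
        return value != mean
    return abs(value - mean) > k * spread

# Hypothetical "images processed per minute" during normal operation.
baseline = [52, 48, 50, 51, 49, 50, 47, 53]
normal = deviates(baseline, 51)
alarm = deviates(baseline, 90)
print(normal, alarm)
```

In practice the same comparison can often be done visually on the dashboard; an automated rule like this mainly helps when nobody is watching.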
  • 16. Case study: “Company B”. Monitor your system (CPU, RAM...) and applications together to see whether the hardware suits their requirements. E.g., picture upload time vs. packets received/transmitted vs. memory free/used, and so on...
  • 17. Case study: “Company B” Database queries per second? Async tasks currently in queue? How is the application behaving? Images resized and stored? Error and warning rates?
  • 18. Case study: “Company B”. These applications run on several hosts and their metrics all end up at the same point. You can monitor many different servers by looking at the same dashboard.
  • 19. Testing and Monitoring. "Measure twice, cut once" or "Cut it quickly in several pieces and see which fits best (now!)"? You can do both! Testing: just once, during development. Monitoring: it keeps working once the application is released.
  • 20. Testing and Monitoring. Tests are logical properties of our application. Metrics are not. But metrics let you see what is going on once the application/system is in production. Failure is inevitable (rather than "not accepted") and detectable!
  • 22. Wait... I don't like the Graphite Web interface! No problem! The world of interfaces is in continuous evolution (a search returns about 56,100 results).
  • 23. You can't optimize what you can't measure, so monitor and... Optimize anything, Optimize everything
  • 24. Thank you for your attention! Serena Lorenzini serena@biodec.com Incontro DevOps Italia Bologna, 21 Feb. 2014