SlideShare una empresa de Scribd logo
1 de 78
Descargar para leer sin conexión
Have you been stalking
your servers?
Have you been stalking
your servers?
Marji Cermak
Sysadmin & DevOps Engineer at Morpht
marji@morpht.com
@cermakm
The rule of 3 things
picture: http://www.flickr.com/photos/helenaperezgarcia/5692392667/
The rule of 3 things
1. What is monitoring and why do you want to
monitor
2. Some monitoring tools available for you
3. It is easy to start with monitoring.
Part 1
What is monitoring and why do you want to
monitor
photo: http://www.flickr.com/photos/tiagopadua/7903366470/
Monitoring
Monitoring is an intermittent (regular or
irregular) series of observations in time,
carried out to show the extent of compliance
with a formulated standard or degree of
deviation from an expected norm.
J. M. Hellawell (1991), modified by A. Brown
(2000), http://jncc.defra.gov.uk/page-2268
nature conservation area
Why you need to monitor
● to know about the bad news before your
customers (or your boss)
Why you need to monitor
● to know about the bad news before your
customers (or your boss)
● to scale up your server in advance
Why you need to monitor
● to know about the bad news before your
customers (or your boss)
● to scale up your server in advance
● to tune up your app
Why you need to monitor (cont.)
● to prove your uptime of 99.999 :)
The fun of the nines
Source: http://en.wikipedia.org/wiki/High_availability
Nines: http://en.wikipedia.org/wiki/List_of_unusual_units_of_measurement#Nines
Why you need to monitor (cont.)
● to prove your uptime of 99.999 :)
● to minimise downtime (expensive)
Why you need to monitor (cont.)
● to prove your uptime of 99.999 :)
● to minimise downtime (expensive)
● to capture customer information
Why you need to monitor (cont.)
● to have data / metrics to diagnose
Diagnosing your collected data
watch out for:
● trends
Diagnosing your collected data
watch out for:
● trends
● spikes
Diagnosing your collected data
watch out for:
● trends
● spikes
● irregularities
Diagnosing your collected data
watch out for:
● trends
● spikes
● irregularities
● thresholds
Areas to monitor
● network
photo: http://www.flickr.com/photos/misja_klimov/2120956405/
Areas to monitor
● network
● server
photo: http://www.flickr.com/photos/johnjack/3666997634/
Areas to monitor
● network
● server
● services
photo: http://www.flickr.com/photos/agustingodet/3691794089/
Areas to monitor
● network
● server
● services
photo: http://www.flickr.com/photos/agustingodet/3691792393/
Areas to monitor
● network
● server
● services
● applications
photo: http://www.flickr.com/photos/cheerfulstoic/942211994/
Areas to monitor
● network
● server
● services
● applications
● users
photo: http://www.flickr.com/photos/jimmysmith/99528596/
Drupal Areas to monitor?
● network
● server
● services
● applications
● users
Drupal Areas to monitor
● network
● server
● services
● applications
● users
Drupal Areas to monitor
● network
● server
● services
● applications
● users
Drupal Areas to monitor
● network
● server
● services
○ webserver
○ database
● applications
● users
Drupal Areas to monitor
● network
● server
● services
○ webserver
○ database
● applications - your Drupal site(s)
● users
Drupal Areas to monitor
● network
● server
● services
○ webserver
○ database
● applications - your Drupal site(s)
● users
Part 2
Some monitoring tools available for you
Meet Nagios, Munin and others
● Nagios
● Munin
● APC dashboard
● related Drupal modules
Nagios /ˈnɑːɡiːoʊs/
● system, network and infrastructure
monitoring software application
● monitors and alerts
● many plugins
Nagios /ˈnɑːɡiːoʊs/
Name and Pronunciation:
● NetSaint -> "Nagios Ain't Gonna Insist On
Sainthood"
● Agios' a transliteration of the Greek word
άγιος (saint)
Nagios /ˈnɑːɡiːoʊs/
● alerts by email/pager/IM...
● alerts to different contacts
● notification escalation
● service / host dependencies
● soft / hard states
Nagios /ˈnɑːɡiːoʊs/
Drupal and Nagios
Munin
● network/system monitoring application
● outputs graphs through a web interface
● many plugins
Munin
● master / node architecture
● connects to all nodes at regular intervals
● it uses the RRDtool (round robin database
tool, handles time-series data)
Munin Example
Drupal and Munin
Drupal and Munin
● they complement each other
● nagios normally alerts on one “service”
● munin can be used to correlate different
things
Nagios & Munin
APC - what is it?
The Alternative PHP Cache (APC) is a free
and open opcode cache for PHP.
APC - what is it?
The Alternative PHP Cache (APC) is a free
and open opcode cache for PHP.
Its goal is to provide a free, open, and robust
framework for caching and optimising PHP
intermediate code.
Inside your webserver (not a webcache)
Monitoring APC
Memory Usage, Hit & Misses
Monitoring APC
Fragmentation
Monitoring APC
memory usage
Monitoring APC
files in cache
Other monitoring tools
● Collectd
● Graphite
● Shinken
● Sensu
● NewRelic
● Pingdom
Part 3
It is easy to start with monitoring.
How to install these tools?
Munin
sudo apt-get install munin munin-node
Nagios
sudo apt-get install nagios3
APC dashboard
php.apc script from php-apc package
How to configure these?
● It is a bit fiddly
● There are many guides targeting beginners
● You don’t want to do it again and again
puppet – a quick way to start
system for automating system administration
tasks
puppet – a quick way to start
● a declarative language for expressing
system configuration,
puppet – a quick way to start
● a declarative language for expressing
system configuration,
● a client and server for distributing it
puppet – a quick way to start
● a declarative language for expressing
system configuration,
● a client and server for distributing it
● and a library for realising the configuration.
puppet – a quick way to start
package { 'munin-node': ensure => installed }
service { 'munin-node':
enable => true,
ensure => running,
require => Package['munin-node'],
}
puppet – a quick way to start
1. clone the stalk-your-box repo
2. run puppet apply on the code
3. monitor!
A quick way to start
$ git clone
git://github.com/morpht/stalk-your-box.git
/tmp/stalk-your-box
Cloning into '/tmp/stalk-your-box'...
remote: Counting objects: 23, done.
remote: Compressing objects: 100% (19/19), done.
remote: Total 23 (delta 1), reused 23 (delta 1)
Receiving objects: 100% (23/23), 11.35 KiB, done.
Resolving deltas: 100% (1/1), done.
A quick way to start
$ cd /tmp/stalk-your-box/
$ sudo puppet apply
--modulepath=modules manifest.pp
notice: /Stage[main]/Nagios::Server/Package[nagios3]/ensure: ensure changed 'purged' to 'present'
notice: /Stage[main]/Nagios::Server/File[/etc/nagios3/htpasswd.users]/ensure: created
notice: /Stage[main]/Nagios::Server/Exec[update-nagios-htpasswd]/returns: Adding password for user nagiosadmin
notice: /Stage[main]/Nagios::Server/Exec[update-nagios-htpasswd]/returns: executed successfully
notice: /Stage[main]/Munin::Node/Package[libcache-cache-perl]/ensure: ensure changed 'purged' to 'present'
notice: /Stage[main]/Munin::Node/Package[munin-node]/ensure: ensure changed 'purged' to 'present'
notice: /Stage[main]/Munin::Node/File[munin-node.conf]/content: content changed '{md5}
e486786f866d7d7e025dea401c300e7b' to '{md5}dbf97a87a8da86ef68155815ecae3c1c'
notice: /Stage[main]/Munin::Server/Service[apache2]: Triggered 'refresh' from 1 events
notice: Finished catalog run in 44.26 seconds
What this gives you
What this gives you
What this gives you
Manifest.pp
# Execute apt-get update before any package is installed:
exec { 'apt-update':
command => 'apt-get update',
# but don't execute it more than once a day:
unless => 'test $(find /var/cache/apt/pkgcache.bin -mtime 0 | wc -l ) -eq 1',
}
Exec['apt-update'] -> Package <| |>
# Include minimal apache2 installation. Munin server, nagios
# and APC dashboard depend on it.
include 'apache2'
Manifest.pp
# Install munin node and munin server:
class { 'munin::node': }
class { 'munin::server':
htuser => 'munin', # Username for basic access auth.
htpass => 'Prague2013' # Password for basic access auth.
}
# Install nagios:
class { 'nagios::server':
contact_email => 'root@localhost', # Email to send alerts to.
htpass => 'Prague2013', # Password for the nagiosadmin username.
}
Manifest.pp
# Deploys APC dashboard - install php-apc package and
# deploy the apc.php script from it.
package { 'php-apc': ensure => installed }
exec { 'deploy-apc-dashboard':
path => '/bin:/usr/bin',
command => 'gzip -dc /usr/share/doc/php-apc/apc.php.gz > /var/www/apc.php',
notify => Service['apache2'],
unless => '[ -f /var/www/apc.php ]',
require => [ Package['php-apc'], Package['apache2'] ]
}
Summary
It is easy to start with monitoring.
The fun part - what’s wrong?
What’s wrong here?
The fun part - what’s wrong?
Questions
Here is the get started monitoring repo:
https://github.com/morpht/stalk-your-box
Marji Cermak
Sysadmin & DevOps Engineer at Morpht
marji@morpht.com
@cermakm
Resources
Rule of Three: en.wikipedia.org/wiki/Rule_of_three_(writing)
Nagios: http://www.nagios.org/
Munin: http://munin-monitoring.org/
Nagios module: https://drupal.org/project/nagios
Munin module: https://drupal.org/project/munin
Munin plugins (experimental): https://drupal.org/sandbox/murrayw/2084281
Sensu: http://sensuapp.org
MySQLTuner: http://MySQLTuner.pl
THANK YOU!
WHAT DID YOU THINK?
Locate this session at the
DrupalCon Prague website:
http://prague2013.drupal.org/schedule
Click the “Take the survey” link

Más contenido relacionado

Similar a Have you been stalking your servers?

Have you been stalking your servers?
Have you been stalking your servers?Have you been stalking your servers?
Have you been stalking your servers?morpht
 
Using and Customizing the Android Framework / part 4 of Embedded Android Work...
Using and Customizing the Android Framework / part 4 of Embedded Android Work...Using and Customizing the Android Framework / part 4 of Embedded Android Work...
Using and Customizing the Android Framework / part 4 of Embedded Android Work...Opersys inc.
 
Vietnam qa meetup
Vietnam qa meetupVietnam qa meetup
Vietnam qa meetupSyam Sasi
 
On-Demand Image Resizing Extended - External Meet-up
On-Demand Image Resizing Extended - External Meet-upOn-Demand Image Resizing Extended - External Meet-up
On-Demand Image Resizing Extended - External Meet-upJonathan Lee
 
PyCon AU 2012 - Debugging Live Python Web Applications
PyCon AU 2012 - Debugging Live Python Web ApplicationsPyCon AU 2012 - Debugging Live Python Web Applications
PyCon AU 2012 - Debugging Live Python Web ApplicationsGraham Dumpleton
 
Advanced deployment scenarios (netcoreconf)
Advanced deployment scenarios (netcoreconf)Advanced deployment scenarios (netcoreconf)
Advanced deployment scenarios (netcoreconf)Sergio Navarro Pino
 
On-Demand Image Resizing
On-Demand Image ResizingOn-Demand Image Resizing
On-Demand Image ResizingJonathan Lee
 
EuroPython 2013 - Python3 TurboGears Training
EuroPython 2013 - Python3 TurboGears TrainingEuroPython 2013 - Python3 TurboGears Training
EuroPython 2013 - Python3 TurboGears TrainingAlessandro Molina
 
IRJET- Real Time Monitoring of Servers with Prometheus and Grafana for High A...
IRJET- Real Time Monitoring of Servers with Prometheus and Grafana for High A...IRJET- Real Time Monitoring of Servers with Prometheus and Grafana for High A...
IRJET- Real Time Monitoring of Servers with Prometheus and Grafana for High A...IRJET Journal
 
Git Basics Workshop Summer of Tech 2010
Git Basics Workshop Summer of Tech 2010Git Basics Workshop Summer of Tech 2010
Git Basics Workshop Summer of Tech 2010Y. Thong Kuah
 
Im-A-Hacker-Get-Me-Out-Of-Here-Breaking-Network-Segregation-Using-Esoteric-Co...
Im-A-Hacker-Get-Me-Out-Of-Here-Breaking-Network-Segregation-Using-Esoteric-Co...Im-A-Hacker-Get-Me-Out-Of-Here-Breaking-Network-Segregation-Using-Esoteric-Co...
Im-A-Hacker-Get-Me-Out-Of-Here-Breaking-Network-Segregation-Using-Esoteric-Co...YuChianWu
 
Application Diagnosis with Zend Server Tracing
Application Diagnosis with Zend Server TracingApplication Diagnosis with Zend Server Tracing
Application Diagnosis with Zend Server TracingZendCon
 
On-Demand Image Resizing from Part of the monolith to Containerized Microserv...
On-Demand Image Resizing from Part of the monolith to Containerized Microserv...On-Demand Image Resizing from Part of the monolith to Containerized Microserv...
On-Demand Image Resizing from Part of the monolith to Containerized Microserv...Docker, Inc.
 
2018 - CertiFUNcation - Marcus Schwemer: TYPO3 Performance
2018 - CertiFUNcation - Marcus Schwemer: TYPO3 Performance2018 - CertiFUNcation - Marcus Schwemer: TYPO3 Performance
2018 - CertiFUNcation - Marcus Schwemer: TYPO3 PerformanceTYPO3 CertiFUNcation
 
TYPO3 Performance (T3DD18)
TYPO3 Performance (T3DD18)TYPO3 Performance (T3DD18)
TYPO3 Performance (T3DD18)Marcus Schwemer
 
CIRCUIT 2015 - Monitoring AEM
CIRCUIT 2015 - Monitoring AEMCIRCUIT 2015 - Monitoring AEM
CIRCUIT 2015 - Monitoring AEMICF CIRCUIT
 

Similar a Have you been stalking your servers? (20)

Have you been stalking your servers?
Have you been stalking your servers?Have you been stalking your servers?
Have you been stalking your servers?
 
September SDG - Lightning
September SDG - LightningSeptember SDG - Lightning
September SDG - Lightning
 
Using and Customizing the Android Framework / part 4 of Embedded Android Work...
Using and Customizing the Android Framework / part 4 of Embedded Android Work...Using and Customizing the Android Framework / part 4 of Embedded Android Work...
Using and Customizing the Android Framework / part 4 of Embedded Android Work...
 
Vietnam qa meetup
Vietnam qa meetupVietnam qa meetup
Vietnam qa meetup
 
On-Demand Image Resizing Extended - External Meet-up
On-Demand Image Resizing Extended - External Meet-upOn-Demand Image Resizing Extended - External Meet-up
On-Demand Image Resizing Extended - External Meet-up
 
PyCon AU 2012 - Debugging Live Python Web Applications
PyCon AU 2012 - Debugging Live Python Web ApplicationsPyCon AU 2012 - Debugging Live Python Web Applications
PyCon AU 2012 - Debugging Live Python Web Applications
 
Advanced deployment scenarios (netcoreconf)
Advanced deployment scenarios (netcoreconf)Advanced deployment scenarios (netcoreconf)
Advanced deployment scenarios (netcoreconf)
 
On-Demand Image Resizing
On-Demand Image ResizingOn-Demand Image Resizing
On-Demand Image Resizing
 
EuroPython 2013 - Python3 TurboGears Training
EuroPython 2013 - Python3 TurboGears TrainingEuroPython 2013 - Python3 TurboGears Training
EuroPython 2013 - Python3 TurboGears Training
 
Optimizing Your CI Pipelines
Optimizing Your CI PipelinesOptimizing Your CI Pipelines
Optimizing Your CI Pipelines
 
IRJET- Real Time Monitoring of Servers with Prometheus and Grafana for High A...
IRJET- Real Time Monitoring of Servers with Prometheus and Grafana for High A...IRJET- Real Time Monitoring of Servers with Prometheus and Grafana for High A...
IRJET- Real Time Monitoring of Servers with Prometheus and Grafana for High A...
 
Sprint 71
Sprint 71Sprint 71
Sprint 71
 
Git Basics Workshop Summer of Tech 2010
Git Basics Workshop Summer of Tech 2010Git Basics Workshop Summer of Tech 2010
Git Basics Workshop Summer of Tech 2010
 
Im-A-Hacker-Get-Me-Out-Of-Here-Breaking-Network-Segregation-Using-Esoteric-Co...
Im-A-Hacker-Get-Me-Out-Of-Here-Breaking-Network-Segregation-Using-Esoteric-Co...Im-A-Hacker-Get-Me-Out-Of-Here-Breaking-Network-Segregation-Using-Esoteric-Co...
Im-A-Hacker-Get-Me-Out-Of-Here-Breaking-Network-Segregation-Using-Esoteric-Co...
 
Application Diagnosis with Zend Server Tracing
Application Diagnosis with Zend Server TracingApplication Diagnosis with Zend Server Tracing
Application Diagnosis with Zend Server Tracing
 
On-Demand Image Resizing from Part of the monolith to Containerized Microserv...
On-Demand Image Resizing from Part of the monolith to Containerized Microserv...On-Demand Image Resizing from Part of the monolith to Containerized Microserv...
On-Demand Image Resizing from Part of the monolith to Containerized Microserv...
 
2018 - CertiFUNcation - Marcus Schwemer: TYPO3 Performance
2018 - CertiFUNcation - Marcus Schwemer: TYPO3 Performance2018 - CertiFUNcation - Marcus Schwemer: TYPO3 Performance
2018 - CertiFUNcation - Marcus Schwemer: TYPO3 Performance
 
TYPO3 Performance (T3DD18)
TYPO3 Performance (T3DD18)TYPO3 Performance (T3DD18)
TYPO3 Performance (T3DD18)
 
CIRCUIT 2015 - Monitoring AEM
CIRCUIT 2015 - Monitoring AEMCIRCUIT 2015 - Monitoring AEM
CIRCUIT 2015 - Monitoring AEM
 
Advanced deployment scenarios
Advanced deployment scenariosAdvanced deployment scenarios
Advanced deployment scenarios
 

Último

How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 

Último (20)

How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 

Have you been stalking your servers?

  • 1. Have you been stalking your servers?
  • 2. Have you been stalking your servers? Marji Cermak Sysadmin & DevOps Engineer at Morpht marji@morpht.com @cermakm
  • 3. The rule of 3 things picture: http://www.flickr.com/photos/helenaperezgarcia/5692392667/
  • 4. The rule of 3 things 1. What is monitoring and why do you want to monitor 2. Some monitoring tools available for you 3. It is easy to start with monitoring.
  • 5. Part 1 What is monitoring and why do you want to monitor
  • 7. Monitoring Monitoring is an intermittent (regular or irregular) series of observations in time, carried out to show the extent of compliance with a formulated standard or degree of deviation from an expected norm. J. M. Hellawell (1991), modified by A. Brown (2000), http://jncc.defra.gov.uk/page-2268 nature conservation area
  • 8. Why you need to monitor ● to know about the bad news before your customers (or your boss)
  • 9. Why you need to monitor ● to know about the bad news before your customers (or your boss) ● to scale up your server in advance
  • 10. Why you need to monitor ● to know about the bad news before your customers (or your boss) ● to scale up your server in advance ● to tune up your app
  • 11. Why you need to monitor (cont.) ● to prove your uptime of 99.999 :)
  • 12. The fun of the nines Source: http://en.wikipedia.org/wiki/High_availability Nines: http://en.wikipedia.org/wiki/List_of_unusual_units_of_measurement#Nines
  • 13. Why you need to monitor (cont.) ● to prove your uptime of 99.999 :) ● to minimise downtime (expensive)
  • 14. Why you need to monitor (cont.) ● to prove your uptime of 99.999 :) ● to minimise downtime (expensive) ● to capture customer information
  • 15. Why you need to monitor (cont.) ● to have data / metrics to diagnose
  • 16. Diagnosing your collected data watch out for: ● trends
  • 17. Diagnosing your collected data watch out for: ● trends ● spikes
  • 18. Diagnosing your collected data watch out for: ● trends ● spikes ● irregularities
  • 19. Diagnosing your collected data watch out for: ● trends ● spikes ● irregularities ● thresholds
  • 20. Areas to monitor ● network photo: http://www.flickr.com/photos/misja_klimov/2120956405/
  • 21. Areas to monitor ● network ● server photo: http://www.flickr.com/photos/johnjack/3666997634/
  • 22. Areas to monitor ● network ● server ● services photo: http://www.flickr.com/photos/agustingodet/3691794089/
  • 23. Areas to monitor ● network ● server ● services photo: http://www.flickr.com/photos/agustingodet/3691792393/
  • 24. Areas to monitor ● network ● server ● services ● applications photo: http://www.flickr.com/photos/cheerfulstoic/942211994/
  • 25. Areas to monitor ● network ● server ● services ● applications ● users photo: http://www.flickr.com/photos/jimmysmith/99528596/
  • 26. Drupal Areas to monitor? ● network ● server ● services ● applications ● users
  • 27. Drupal Areas to monitor ● network ● server ● services ● applications ● users
  • 28. Drupal Areas to monitor ● network ● server ● services ● applications ● users
  • 29. Drupal Areas to monitor ● network ● server ● services ○ webserver ○ database ● applications ● users
  • 30. Drupal Areas to monitor ● network ● server ● services ○ webserver ○ database ● applications - your Drupal site(s) ● users
  • 31. Drupal Areas to monitor ● network ● server ● services ○ webserver ○ database ● applications - your Drupal site(s) ● users
  • 32. Part 2 Some monitoring tools available for you
  • 33. Meet Nagios, Munin and others ● Nagios ● Munin ● APC dashboard ● related Drupal modules
  • 34.
  • 35. Nagios /ˈnɑːɡiːoʊs/ ● system, network and infrastructure monitoring software application ● monitors and alerts ● many plugins
  • 36. Nagios /ˈnɑːɡiːoʊs/ Name and Pronunciation: ● NetSaint -> "Nagios Ain't Gonna Insist On Sainthood" ● Agios' a transliteration of the Greek word άγιος (saint)
  • 37. Nagios /ˈnɑːɡiːoʊs/ ● alerts by email/pager/IM... ● alerts to different contacts ● notification escalation ● service / host dependencies ● soft / hard states
  • 40.
  • 41. Munin ● network/system monitoring application ● outputs graphs through a web interface ● many plugins
  • 42. Munin ● master / node architecture ● connects to all nodes at regular intervals ● it uses the RRDtool (round robin database tool, handles time-series data)
  • 43.
  • 44.
  • 48. ● they complement each other ● nagios normally alerts on one “service” ● munin can be used to correlate different things Nagios & Munin
  • 49. APC - what is it? The Alternative PHP Cache (APC) is a free and open opcode cache for PHP.
  • 50. APC - what is it? The Alternative PHP Cache (APC) is a free and open opcode cache for PHP. Its goal is to provide a free, open, and robust framework for caching and optimising PHP intermediate code. Inside your webserver (not a webcache)
  • 55. Other monitoring tools ● Collectd ● Graphite ● Shinken ● Sensu ● NewRelic ● Pingdom
  • 56. Part 3 It is easy to start with monitoring.
  • 57. How to install these tools? Munin sudo apt-get install munin munin-node Nagios sudo apt-get install nagios3 APC dashboard php.apc script from php-apc package
  • 58. How to configure these? ● It is a bit fiddly ● There are many guides targeting beginners ● You don’t want to do it again and again
  • 59. puppet – a quick way to start system for automating system administration tasks
  • 60. puppet – a quick way to start ● a declarative language for expressing system configuration,
  • 61. puppet – a quick way to start ● a declarative language for expressing system configuration, ● a client and server for distributing it
  • 62. puppet – a quick way to start ● a declarative language for expressing system configuration, ● a client and server for distributing it ● and a library for realising the configuration.
  • 63. puppet – a quick way to start package { 'munin-node': ensure => installed } service { 'munin-node': enable => true, ensure => running, require => Package['munin-node'], }
  • 64. puppet – a quick way to start 1. clone the stalk-your-box repo 2. run puppet apply on the code 3. monitor!
  • 65. A quick way to start $ git clone git://github.com/morpht/stalk-your-box.git /tmp/stalk-your-box Cloning into '/tmp/stalk-your-box'... remote: Counting objects: 23, done. remote: Compressing objects: 100% (19/19), done. remote: Total 23 (delta 1), reused 23 (delta 1) Receiving objects: 100% (23/23), 11.35 KiB, done. Resolving deltas: 100% (1/1), done.
  • 66. A quick way to start $ cd /tmp/stalk-your-box/ $ sudo puppet apply --modulepath=modules manifest.pp notice: /Stage[main]/Nagios::Server/Package[nagios3]/ensure: ensure changed 'purged' to 'present' notice: /Stage[main]/Nagios::Server/File[/etc/nagios3/htpasswd.users]/ensure: created notice: /Stage[main]/Nagios::Server/Exec[update-nagios-htpasswd]/returns: Adding password for user nagiosadmin notice: /Stage[main]/Nagios::Server/Exec[update-nagios-htpasswd]/returns: executed successfully notice: /Stage[main]/Munin::Node/Package[libcache-cache-perl]/ensure: ensure changed 'purged' to 'present' notice: /Stage[main]/Munin::Node/Package[munin-node]/ensure: ensure changed 'purged' to 'present' notice: /Stage[main]/Munin::Node/File[munin-node.conf]/content: content changed '{md5} e486786f866d7d7e025dea401c300e7b' to '{md5}dbf97a87a8da86ef68155815ecae3c1c' notice: /Stage[main]/Munin::Server/Service[apache2]: Triggered 'refresh' from 1 events notice: Finished catalog run in 44.26 seconds
  • 70. Manifest.pp # Execute apt-get update before any package is installed: exec { 'apt-update': command => 'apt-get update', # but don't execute it more than once a day: unless => 'test $(find /var/cache/apt/pkgcache.bin -mtime 0 | wc -l ) -eq 1', } Exec['apt-update'] -> Package <| |> # Include minimal apache2 installation. Munin server, nagios # and APC dashboard depend on it. include 'apache2'
  • 71. Manifest.pp # Install munin node and munin server: class { 'munin::node': } class { 'munin::server': htuser => 'munin', # Username for basic access auth. htpass => 'Prague2013' # Password for basic access auth. } # Install nagios: class { 'nagios::server': contact_email => 'root@localhost', # Email to send alerts to. htpass => 'Prague2013', # Password for the nagiosadmin username. }
  • 72. Manifest.pp # Deploys APC dashboard - install php-apc package and # deploy the apc.php script from it. package { 'php-apc': ensure => installed } exec { 'deploy-apc-dashboard': path => '/bin:/usr/bin', command => 'gzip -dc /usr/share/doc/php-apc/apc.php.gz > /var/www/apc.php', notify => Service['apache2'], unless => '[ -f /var/www/apc.php ]', require => [ Package['php-apc'], Package['apache2'] ] }
  • 73. Summary It is easy to start with monitoring.
  • 74. The fun part - what’s wrong? What’s wrong here?
  • 75. The fun part - what’s wrong?
  • 76. Questions Here is the get started monitoring repo: https://github.com/morpht/stalk-your-box Marji Cermak Sysadmin & DevOps Engineer at Morpht marji@morpht.com @cermakm
  • 77. Resources Rule of Three: en.wikipedia.org/wiki/Rule_of_three_(writing) Nagios: http://www.nagios.org/ Munin: http://munin-monitoring.org/ Nagios module: https://drupal.org/project/nagios Munin module: https://drupal.org/project/munin Munin plugins (experimental): https://drupal.org/sandbox/murrayw/2084281 Sensu: http://sensuapp.org MySQLTuner: http://MySQLTuner.pl
  • 78. THANK YOU! WHAT DID YOU THINK? Locate this session at the DrupalCon Prague website: http://prague2013.drupal.org/schedule Click the “Take the survey” link