Redundant devops

about reinventing the wheel

JSConf
Budapest
2017
Curator, Organizer

Metrics
Error Logs
Logging
Secret Store
Service Discovery
Process Supervision
Running Programs
Connecting Services

Metrics are
• time bound, historical
• numeric data
• software, network or hardware property

Metrics are great!
• see trends
• mark releases
• notice anomalies:  
spikes & gaps
• create alerts

Metric delivery
• collect (scrape) or push data?
• collect periodically
• put metric data where it can be collected

Tools for metrics
• prometheus
• graphite

Node best practices
• put your metrics on an accessible endpoint
/metrics
/status
• there are node libs to automate this 
instrument http
• let the metrics tool do scraping, delivery
• watch those nice graphs ☺ 
check out grafana
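
To make the /metrics idea concrete, here is a minimal sketch of what such an endpoint could return, using the Prometheus text exposition format. The `Counter` class and the metric name `http_requests_total` are illustrative assumptions, not part of the talk; in a real app a library such as prom-client would normally do this work, including escaping label values, which this sketch skips.

```typescript
// Hypothetical in-memory counter; a metrics library would normally provide this.
type Labels = Record<string, string>;

class Counter {
  private values = new Map<string, number>();

  constructor(readonly name: string, readonly help: string) {}

  // Increase the counter for one combination of labels.
  inc(labels: Labels = {}, by = 1): void {
    const key = Object.keys(labels)
      .sort()
      .map((k) => `${k}="${labels[k]}"`)
      .join(",");
    this.values.set(key, (this.values.get(key) ?? 0) + by);
  }

  // Render the counter in the Prometheus text exposition format.
  render(): string {
    const lines = [
      `# HELP ${this.name} ${this.help}`,
      `# TYPE ${this.name} counter`,
    ];
    this.values.forEach((value, key) => {
      lines.push(key ? `${this.name}{${key}} ${value}` : `${this.name} ${value}`);
    });
    return lines.join("\n") + "\n";
  }
}

const httpRequests = new Counter("http_requests_total", "Total HTTP requests handled.");
httpRequests.inc({ method: "GET", status: "200" });
httpRequests.inc({ method: "GET", status: "200" });
httpRequests.inc({ method: "POST", status: "500" });

// Text a GET /metrics handler could send back to the scraper.
const metricsBody = httpRequests.render();
```

The app only keeps counters up to date and exposes them; when and how often they are collected stays with the metrics tool.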

Key metrics
• latency
check for slow queries,
create performance tests for them
iterate on the code, then re-test
do not average, use a histogram
• resource usage
slow memory leaks
disk is getting full
predict resource shortage via trends
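
A small worked example of "do not average, use a histogram". The latency numbers and bucket bounds below are made up for illustration; the cumulative bucket layout is only one possible choice.

```typescript
// Upper bounds of the latency buckets in milliseconds (illustrative values).
const bucketBounds = [50, 100, 250, 500, 1000];

// Count how many samples are at or below each bound (cumulative buckets).
function cumulativeHistogram(samples: number[], bounds: number[]): number[] {
  return bounds.map((bound) => samples.filter((s) => s <= bound).length);
}

function average(samples: number[]): number {
  return samples.reduce((sum, s) => sum + s, 0) / samples.length;
}

// Nine fast requests and one very slow one.
const latenciesMs = [20, 22, 25, 21, 24, 23, 20, 22, 23, 900];

const avgLatency = average(latenciesMs); // 110
const latencyHistogram = cumulativeHistogram(latenciesMs, bucketBounds); // [9, 9, 9, 9, 10]
```

The average of 110 ms matches none of the requests. The histogram shows that nine out of ten finished within 50 ms and a single one took between 500 and 1000 ms, which is the slow query worth chasing.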

Sending metrics
is not the job of your app

Catch errors as fast as possible!
• instant alert of production errors
• use it during feature testing
• keep an eye on it during releases
• aggregate errors in a single service to see them all
• catch errors before the user does

Ideal error reports have
• environment of error  
build / release / branch / server
• stack trace 
exact code location
• custom data 
anything that helps identify the problem
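
The three parts of an ideal report could be modelled roughly like this. The `ErrorReport` shape, the `buildErrorReport` helper and all field values are assumptions made for this sketch; error reporting services each define their own payload.

```typescript
// Hypothetical report shape: environment, stack trace, custom data.
interface ErrorReport {
  message: string;
  stack: string;
  environment: { build: string; release: string; branch: string; server: string };
  custom: Record<string, unknown>;
  timestamp: string;
}

function buildErrorReport(
  error: Error,
  environment: ErrorReport["environment"],
  custom: Record<string, unknown> = {},
  now: Date = new Date(),
): ErrorReport {
  return {
    message: error.message,
    stack: error.stack ?? "", // exact code location, when the runtime provides it
    environment,
    custom, // never put secrets in here
    timestamp: now.toISOString(),
  };
}

// Example values are invented for illustration.
const report = buildErrorReport(
  new Error("payment failed"),
  { build: "1234", release: "v2.3.1", branch: "main", server: "app-01" },
  { orderId: "A-42" },
  new Date(0),
);
```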

Error log delivery
• can happen any time,  
hopefully rare
• push data
• expect the unexpected, 
handle the unhandled
• never log secrets
• sampling, throttling, timeout 
do not let error logging itself 
kill your app
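
One way to keep error logging from taking the app down is to throttle it. Below is a minimal fixed-window sketch; the `ReportThrottle` class and its limits are assumptions for illustration, and sampling or a send timeout would be added along the same lines.

```typescript
// Allows at most `maxPerWindow` reports per time window and counts the rest as dropped.
class ReportThrottle {
  private windowStart = 0;
  private sentInWindow = 0;
  dropped = 0;

  constructor(
    private readonly maxPerWindow: number,
    private readonly windowMs: number,
  ) {}

  // Returns true when the report may be sent, false when it should be dropped.
  allow(nowMs: number): boolean {
    if (nowMs - this.windowStart >= this.windowMs) {
      // A new window starts: reset the counter.
      this.windowStart = nowMs;
      this.sentInWindow = 0;
    }
    if (this.sentInWindow < this.maxPerWindow) {
      this.sentInWindow += 1;
      return true;
    }
    this.dropped += 1;
    return false;
  }
}

// At most 2 reports per second; timestamps are passed in to keep the sketch deterministic.
const throttle = new ReportThrottle(2, 1000);
const decisions = [0, 10, 20, 30, 1500].map((t) => throttle.allow(t));
// decisions: [true, true, false, false, true]
```

In production code the timestamp would typically come from `Date.now()`, and the dropped count is itself worth exposing as a metric.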

Tools & services for error reporting
• airbrake
• errbit (airbrake api, open source)
• sentry
• raygun
• rollbar
• …

Integrate, get notified!
• pagerduty
• slack / hipchat 
chatops - resolve, react within your chat

Logging vs Error logging
• logging is anticipated
• error logs are occasional

Log levels, recap
• fatal - needs instant intervention, see error logs
• error - inform the user, see error logs
• warn - escalate if it happens again
• info - just a step in a regular flow
• debug - lots of lines and traces

Benefits of logging, custom logs
• debug
• custom events
• tracking the usage and behaviour of the app
• profiling, A/B testing, product development

Logging in node
• console.log
• bragi
• debug
• npmlog
• winston

Logging in node - general
• has timestamps
• has loglevels
• can be routed to stdout/stderr
• can be formatted
• create or use a Correlation ID
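
The properties listed above can be combined into one structured log line. This is a sketch under assumptions: the JSON field names, the `formatLogLine` and `ensureCorrelationId` helpers and the `req-0001` ID are all invented here, and any of the logging libraries mentioned earlier could produce a similar line.

```typescript
type Level = "fatal" | "error" | "warn" | "info" | "debug";

interface LogContext {
  correlationId: string;
  service: string;
}

// One JSON object per line: timestamp, level, service, correlation ID, message.
function formatLogLine(level: Level, message: string, ctx: LogContext, now: Date = new Date()): string {
  return JSON.stringify({
    time: now.toISOString(),
    level,
    service: ctx.service,
    correlationId: ctx.correlationId,
    message,
  });
}

// Reuse the ID set by an upstream service, otherwise create a new one.
function ensureCorrelationId(incoming: string | undefined, generate: () => string): string {
  return incoming && incoming.length > 0 ? incoming : generate();
}

// The generator is a stand-in; a UUID would be a typical choice.
const cid = ensureCorrelationId(undefined, () => "req-0001");
const line = formatLogLine("info", "order created", { correlationId: cid, service: "orders" }, new Date(0));
// Writing `line` to stdout is all the app needs to do with it.
```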

Correlation ID quick guide
[Diagram: a cID attached at every step, across services and logs]

Best practices
• just put it to stdout
(Docker & Kubernetes clearly encourage this)
• let the log collector handle it
• pipe stdout to a file, or whatever you like
• be able to switch to debug mode at runtime
use signals
• never log secrets
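
Switching to debug mode at runtime could look like the sketch below. The `LeveledLogger` class is hypothetical and collects lines in an array instead of writing to stdout, so the behaviour is easy to check; the signal wiring is only described in a comment.

```typescript
const levelOrder = ["fatal", "error", "warn", "info", "debug"] as const;
type LogLevel = (typeof levelOrder)[number];

class LeveledLogger {
  // Stands in for stdout in this sketch.
  readonly lines: string[] = [];

  constructor(private level: LogLevel = "info") {}

  // Flip between "info" and "debug"; meant to be called from a signal handler.
  toggleDebug(): LogLevel {
    this.level = this.level === "debug" ? "info" : "debug";
    return this.level;
  }

  // Keep only messages at or above the current severity threshold.
  log(level: LogLevel, message: string): void {
    if (levelOrder.indexOf(level) <= levelOrder.indexOf(this.level)) {
      this.lines.push(`${level} ${message}`);
    }
  }
}

const logger = new LeveledLogger();
logger.log("debug", "cache miss"); // dropped, level is "info"
logger.toggleDebug();
logger.log("debug", "cache miss"); // kept, level is now "debug"

// In Node the toggle could be wired to a signal, for example:
//   process.on("SIGUSR2", () => logger.toggleDebug());
// SIGUSR2 is suggested because Node reserves SIGUSR1 for starting the debugger.
```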

Log collectors
• fluentd
• logstash
• syslog-ng
• rsyslog

A good log collector should
• read from stdout / file tail
• use your correlation ID
• remove the burden of transferring your logs

Remote logging
• Stackdriver (fluentd based)
• Elasticsearch (fluentd based)

Sending logs

Secrets
• passwords / usernames
• db names
• API keys
• private keys

NOT Secret Storage
× source code
× private VCS repositories
× config files
× simple database fields
× ENV variables

Benefits
• ACL, policies
access a set of secrets by revocable tokens
• centralized key rotation
edit and update all secrets in one place
• single use access,
n-use access
• time bound keys
• audit log
• runtime access
no secrets stored on disk
• build-time access
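
Several of these benefits are access rules, which can be shown with a toy model. The `SketchSecretStore` below is purely hypothetical: it keeps everything in plain memory with no encryption and does not mirror the API of any real secret store. It only illustrates revocable, time-bound, n-use tokens with an audit log.

```typescript
interface AccessToken {
  expiresAtMs: number;
  usesLeft: number;
  revoked: boolean;
  allowed: string[]; // names of the secrets this token may read
}

class SketchSecretStore {
  private secrets = new Map<string, string>();
  private tokens = new Map<string, AccessToken>();
  readonly auditLog: string[] = [];

  putSecret(name: string, value: string): void {
    this.secrets.set(name, value);
  }

  issueToken(id: string, allowed: string[], ttlMs: number, maxUses: number, nowMs: number): void {
    this.tokens.set(id, { expiresAtMs: nowMs + ttlMs, usesLeft: maxUses, revoked: false, allowed });
  }

  revoke(id: string): void {
    const token = this.tokens.get(id);
    if (token) token.revoked = true;
  }

  // Every attempt is audited; the secret value itself is never written to the log.
  read(id: string, name: string, nowMs: number): string | undefined {
    const token = this.tokens.get(id);
    const ok =
      token !== undefined &&
      !token.revoked &&
      token.usesLeft > 0 &&
      nowMs < token.expiresAtMs &&
      token.allowed.indexOf(name) !== -1;
    this.auditLog.push(`${id} ${name} ${ok ? "granted" : "denied"}`);
    if (!ok || token === undefined) return undefined;
    token.usesLeft -= 1;
    return this.secrets.get(name);
  }
}

const store = new SketchSecretStore();
store.putSecret("db/password", "s3cr3t");
store.issueToken("build-42", ["db/password"], 60000, 1, 0); // single use, valid for 60 s

const firstRead = store.read("build-42", "db/password", 1000); // "s3cr3t"
const secondRead = store.read("build-42", "db/password", 2000); // undefined, the single use is spent
```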

Build time vs. run time
[Diagram: Version Control, build server, app server, Secret Store]
Build time (build server)
• secret/name
• exchanged with the Secret Store: token, secret/name, actual secrets
• secrets built into the deployed code
Run time (app server)
• secret/name
• exchanged with the Secret Store: token, secret/name, actual secrets
• secrets requested on app startup, stored only in memory

Secret store server
• strong encryption
• has to be unlocked on start
• secrets are totally inaccessible without unlocking

Secret store services
• HashiCorp Vault
• Amazon KMS
• Docker Swarm
• Keywhiz

Never store your secrets in your
source code

Service discovery can help
• Service Registration
register a service and notify the other services about it
• Service Discovery
searching for services?
• Monitoring
is a service active and responding?
• Load Balancing
direct traffic to the new service
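
The four roles fit into a very small in-memory model. `SketchRegistry`, the service name and the addresses are assumptions for illustration only; the real tools keep this state in a distributed store and run the health checks through agents.

```typescript
interface Instance {
  address: string;
  healthy: boolean;
}

class SketchRegistry {
  private services = new Map<string, Instance[]>();
  private cursor = new Map<string, number>();

  // Registration: a new instance announces itself (or an agent does it on its behalf).
  register(service: string, address: string): void {
    const list = this.services.get(service) ?? [];
    list.push({ address, healthy: true });
    this.services.set(service, list);
  }

  // Monitoring: record the result of a health check.
  reportHealth(service: string, address: string, healthy: boolean): void {
    for (const instance of this.services.get(service) ?? []) {
      if (instance.address === address) instance.healthy = healthy;
    }
  }

  // Discovery and load balancing: round-robin over the healthy instances.
  next(service: string): string | undefined {
    const healthy = (this.services.get(service) ?? []).filter((i) => i.healthy);
    if (healthy.length === 0) return undefined;
    const index = (this.cursor.get(service) ?? 0) % healthy.length;
    this.cursor.set(service, index + 1);
    return healthy[index].address;
  }
}

const registry = new SketchRegistry();
registry.register("api", "10.0.0.1:3000");
registry.register("api", "10.0.0.2:3000");

const picks = [registry.next("api"), registry.next("api"), registry.next("api")];
// picks: ["10.0.0.1:3000", "10.0.0.2:3000", "10.0.0.1:3000"]

registry.reportHealth("api", "10.0.0.1:3000", false);
const afterFailure = registry.next("api"); // "10.0.0.2:3000"
```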

How it works
• can act like a DNS
simple use case
internal network
• can write / create configs
more complex
more control

How it works
[Diagram labels: APP, SD AGENT (check PORT, check PID, scraping metrics), Start, Service registry, LB / Loadbalancer (directs traffic)]

Service discovery agent
• separate task, job, process
• what it checks can be configured
• independent of your app

Service discovery services
• Apache ZooKeeper
• Netflix Eureka
• HashiCorp Consul
• Doozer
• Etcd (can be used to build service discovery)

Registering services  

Process supervision
• keeping your app working
• based on some property you define
not just process id, but
port
ping
http response
• can give up after a number of failed restart attempts
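
The supervision idea above, sketched as a function: check a property you define, restart on failure, give up after a limit. The `supervise` helper and its parameters are assumptions for this illustration; a real supervisor would also wait between checks and run continuously.

```typescript
type HealthCheck = () => boolean; // e.g. port open, ping answered, HTTP 200
type Restart = () => void;

// Check the app; restart it when unhealthy; give up after `maxRestarts` restarts.
function supervise(check: HealthCheck, restart: Restart, maxRestarts: number): "healthy" | "failed" {
  for (let attempt = 0; attempt <= maxRestarts; attempt++) {
    if (check()) return "healthy";
    if (attempt < maxRestarts) restart();
  }
  return "failed";
}

// Simulated app that becomes healthy after two restarts.
let restarts = 0;
const recovered = supervise(() => restarts >= 2, () => { restarts += 1; }, 3);
// recovered: "healthy", restarts: 2

// Simulated app that never recovers.
let futileRestarts = 0;
const gaveUp = supervise(() => false, () => { futileRestarts += 1; }, 3);
// gaveUp: "failed", futileRestarts: 3
```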

Process supervision  
in Node-land
• PM2
• forever

Process supervision in general
• monit 
manage any process 
small footprint 
simple

Using Monit
• Pro: monit can instantly restart your failing service
• Con: you might not know why it was failing
Not using Monit
• Pro: you can debug what actually happened
• Con: MTTR* can be relatively high
*Mean Time To Repair

Simple role
• start & stop your app 
watch the process itself 
handle process state
• send signals to the app 
signals can be interpreted as tasks

Running Programs in general
• runit
• upstart
• systemd
• Supervisord
• God
• Circus

A good program runner
• distribution independent 
you can migrate your scripts any time
• easy to configure

monit + runit (or similar)
• avoid using auto restart in both
it can create weird race conditions,
they do not know about each other
• use runit to configure app start/stop
• let monit decide when to restart, and have it use runit to do so

Goals & benefits
• decoupling
separate services
loosen up the connection between them
• scaling
scale up easily when needed
scale down afterwards

HTTP based APIs vs Message Queues

HTTP based APIs
LOADBALANCER
Service “1” Service “2”

Message Queues
Service “1” Service “2”
MESSAGE QUEUE

HTTP based APIs or Message Queues?
It depends

HTTP APIs
• async / sync
• remote
• open API
Msg Queues
• async (usually)
• grouped, close
• low latency
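
The "async (usually)" point is what gives queues their decoupling: the sender does not need the receiver to be up at the same moment. The in-memory `SketchQueue` below is a hypothetical stand-in for a real message broker, used only to show that behaviour.

```typescript
class SketchQueue<T> {
  private messages: T[] = [];

  // Service "1" publishes without waiting for anyone to be listening.
  publish(message: T): void {
    this.messages.push(message);
  }

  // Service "2" picks up pending messages, oldest first, whenever it is ready.
  drain(handler: (message: T) => void, max: number = Infinity): number {
    let handled = 0;
    while (this.messages.length > 0 && handled < max) {
      handler(this.messages.shift() as T);
      handled += 1;
    }
    return handled;
  }

  depth(): number {
    return this.messages.length;
  }
}

const queue = new SketchQueue<string>();

// Service "2" is down; Service "1" keeps working.
queue.publish("order-1");
queue.publish("order-2");
queue.publish("order-3");
const backlog = queue.depth(); // 3

// Service "2" comes back and catches up.
const processed: string[] = [];
const handledCount = queue.drain((message) => processed.push(message));
// handledCount: 3, processed: ["order-1", "order-2", "order-3"]
```

With a direct HTTP call the same outage would surface as failed requests in Service "1", which is one reason the answer to the comparison is "it depends".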

Prototype & learn
• use whatever modules and services you like
• get ready for live & production environments
• get ready to scale easily

Focus your app
• your app should do its job!
• not sending logs or metrics, notifying service registries or keeping itself running
• keep it simple

Talk to your ops
• they are here to run your app
• can help you a lot
• find common ground
• ask the right questions

With many thanks to
Peter Wilcsinszky / @pepov
Ferenc Kovacs / @Tyr43l

Let’s talk! :) 
 
Find me around here,  
or come visit us in 2 weeks!
JSConf
Budapest
2017
