npm has modules for devops tasks such as logging, metrics, and service discovery. But when you get to production, you may find that these are already handled by established tools. Avoid the mistakes I made when my first Node app was on its way out into the world.
12. Node best practices
• put your metrics on an accessible endpoint
/metrics
/status
• there are node libs to automate this
instrument http
• let the metrics tool do scraping, delivery
• watch those nice graphs ☺
check out grafana
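A minimal sketch of such an endpoint, using only Node's built-in http module. The counter name, the port, and the "name value" line format are assumptions for illustration; a real setup would normally use a metrics library that matches your scraper.

```javascript
const http = require('node:http');

// In-process counters; the metric name is illustrative.
const counters = { http_requests_total: 0 };

// Render counters as "name value" lines, one metric per line.
function renderMetrics(values) {
  const lines = Object.entries(values).map(([name, value]) => `${name} ${value}`);
  return lines.join('\n') + '\n';
}

// /metrics is for the scraper, /status is for health checks.
function handler(req, res) {
  counters.http_requests_total += 1;
  if (req.url === '/metrics') {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end(renderMetrics(counters));
  } else if (req.url === '/status') {
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ status: 'ok', uptime: process.uptime() }));
  } else {
    res.writeHead(404);
    res.end();
  }
}

// Create the server without starting it; the caller picks the port.
function createMetricsServer() {
  return http.createServer(handler);
}
```

To try it, call `createMetricsServer().listen(3000)` and point the metrics tool at `/metrics`.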
13. Key metrics
• latency
check for slow queries,
create performance tests on them
iterate code, re-test again
do not average, use a histogram
• resource usage
slow memory leaks
disk is getting full
predict resource shortage via trends
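Why a histogram instead of an average: one slow request disappears in the mean but shows up in the upper buckets. A rough sketch, with bucket bounds chosen purely for illustration:

```javascript
// Upper bounds of the buckets in milliseconds; the values are illustrative.
const BOUNDS_MS = [10, 50, 100, 500, 1000];

function createHistogram(bounds = BOUNDS_MS) {
  // One counter per bound, plus a final overflow bucket.
  return { bounds, counts: new Array(bounds.length + 1).fill(0), total: 0 };
}

// Count one observation in the first bucket whose bound covers it.
function observe(hist, ms) {
  let i = hist.bounds.findIndex((bound) => ms <= bound);
  if (i === -1) i = hist.bounds.length; // slower than every bound
  hist.counts[i] += 1;
  hist.total += 1;
}

// Smallest bucket bound that covers at least fraction q of the observations.
function quantileBound(hist, q) {
  if (hist.total === 0) return NaN;
  let seen = 0;
  for (let i = 0; i < hist.counts.length; i++) {
    seen += hist.counts[i];
    if (seen >= q * hist.total) {
      return i < hist.bounds.length ? hist.bounds[i] : Infinity;
    }
  }
  return Infinity;
}
```

With nine requests at 5 ms and one at 900 ms, the average is 94.5 ms, which describes none of them. The histogram reports a median bound of 10 ms and a 99th percentile bound of 1000 ms.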
16. Catch errors as fast as possible!
• instant alert of production errors
• use it during feature testing
• keep an eye on it during releases
• aggregate errors in a single service, so you see them all in one place
• catch errors before the user does
17. Ideal error reports have
• environment of error
build / release / branch / server
• stack trace
exact code location
• custom data
anything that helps identify the problem
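One way to assemble such a report, as a sketch. The environment variable names (`RELEASE`, `BRANCH`) are hypothetical; use whatever your build pipeline actually exposes.

```javascript
const os = require('node:os');

// Build a plain, serialisable report from an Error.
// RELEASE and BRANCH are hypothetical variable names set by the build.
function buildErrorReport(err, custom = {}, env = process.env) {
  return {
    name: err.name,
    message: err.message,
    stack: err.stack, // exact code location
    environment: {
      release: env.RELEASE || 'unknown',
      branch: env.BRANCH || 'unknown',
      server: os.hostname(),
      node: process.version,
    },
    custom, // anything that helps identify the problem
    time: new Date().toISOString(),
  };
}
```

For example, `buildErrorReport(err, { userId: 7, route: '/checkout' })` attaches request context to the stack trace.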
18. Error log delivery
• can happen any time,
hopefully rare
• push data
• expect the unexpected,
handle the unhandled
• never log secrets
• sampling, throttling, timeout
do not let error logging itself
kill your app
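The points above (no secrets, throttling, timeout) can be sketched as follows. The `send` transport is hypothetical, and the list of secret-looking keys is only an illustration.

```javascript
// Keys treated as secret; the list is illustrative, not complete.
const SECRET_KEYS = /password|token|secret|authorization/i;

// Return a copy of a flat object with secret-looking fields masked.
function redact(data) {
  const out = {};
  for (const [key, value] of Object.entries(data)) {
    out[key] = SECRET_KEYS.test(key) ? '[redacted]' : value;
  }
  return out;
}

// Allow at most `limit` reports per `windowMs`; the rest are dropped.
function createThrottle(limit, windowMs, now = Date.now) {
  let windowStart = now();
  let sent = 0;
  return function allow() {
    if (now() - windowStart >= windowMs) {
      windowStart = now();
      sent = 0;
    }
    if (sent >= limit) return false;
    sent += 1;
    return true;
  };
}

// Push one report. `send` is a hypothetical async transport that returns
// a promise; delivery never throws and never waits longer than timeoutMs.
async function deliver(report, send, allow, timeoutMs = 2000) {
  if (!allow()) return 'dropped';
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve('timeout'), timeoutMs);
  });
  try {
    return await Promise.race([send(redact(report)).then(() => 'sent'), timeout]);
  } catch {
    return 'failed'; // a broken reporter must not take the app down
  } finally {
    clearTimeout(timer);
  }
}
```

To handle the unhandled, `deliver` can be called from `process.on('uncaughtException', ...)` and `process.on('unhandledRejection', ...)` handlers.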
25. Log levels, recap
• fatal - needs instant intervention, see error logs
• error - inform the user, see error logs
• warn - escalate if happens again
• info - just a step in a regular flow
• debug - full of lines, and traces
26. Benefits of logging, custom logs
• debug
• custom events
• tracking the usage and behaviour of app
• profile, AB test, product development
30. Best practices
• just put it to stdout
(Docker and Kubernetes clearly encourage this)
• let the log collector handle it
• pipe stdout to a file, or whatever you like
• be able to switch to debug mode at runtime
use signals
• never log secrets
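A small sketch of these practices: one JSON line per event on stdout, with a signal that toggles debug mode at runtime. The choice of `SIGUSR2` is an assumption (Node reserves `SIGUSR1` for its inspector), and the logger shape is illustrative rather than any particular library's API.

```javascript
const LEVELS = { fatal: 0, error: 1, warn: 2, info: 3, debug: 4 };

// Create a logger that writes one JSON line per event.
// `write` defaults to stdout so the log collector can pick the lines up.
function createLogger(write = (line) => process.stdout.write(line)) {
  let threshold = LEVELS.info;
  const logger = {
    setLevel(name) { threshold = LEVELS[name]; },
    getLevel() { return Object.keys(LEVELS).find((n) => LEVELS[n] === threshold); },
  };
  for (const name of Object.keys(LEVELS)) {
    logger[name] = (msg, fields = {}) => {
      if (LEVELS[name] > threshold) return; // below the current verbosity
      const entry = { time: new Date().toISOString(), level: name, msg, ...fields };
      write(JSON.stringify(entry) + '\n');
    };
  }
  return logger;
}

// Flip between info and debug when the signal arrives, e.g. kill -USR2 <pid>.
function toggleDebugOnSignal(logger, signal = 'SIGUSR2') {
  process.on(signal, () => {
    logger.setLevel(logger.getLevel() === 'debug' ? 'info' : 'debug');
  });
}
```

Usage: `const log = createLogger(); toggleDebugOnSignal(log); log.info('started', { port: 3000 });`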
38. Benefits
• ACL, policies
access a set of secrets with
revocable tokens
• centralized key rotation
edit, update all secrets
at one place
• single use access,
n-use access
• time bound keys
• audit log
• runtime access
no secrets stored on
disk
• build-time access
40. Build time vs. run time
[Diagram: version control, build server, app server, and secret store. In both flows the requester sends a token and secret/name to the secret store and receives the actual secrets.]
• build time
secrets are built into the deployed code
• run time
secrets are requested on app startup, stored only in memory
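The run-time flow could look roughly like this. `fetchSecret(token, name)` is a hypothetical client for whatever secret store you use; only the "request at startup, keep in memory" pattern is the point here.

```javascript
// Secrets live only in this in-memory map, never on disk.
const secrets = new Map();

// Load the named secrets once, at app startup.
// `fetchSecret(token, name)` is a hypothetical secret store client
// that resolves to the secret value.
async function loadSecrets(token, names, fetchSecret) {
  for (const name of names) {
    secrets.set(name, await fetchSecret(token, name));
  }
}

// Read a secret that was loaded at startup; fail loudly if it was not.
function getSecret(name) {
  if (!secrets.has(name)) throw new Error(`secret not loaded: ${name}`);
  return secrets.get(name);
}
```

The app would call `loadSecrets` before it starts accepting traffic, then use `getSecret` wherever a credential is needed.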
41. Secret store server
• powerful encryption
• has to be unlocked on start
• secrets are totally inaccessible without
unlocking
45. Service discovery can help
• Service Registration
and notify other services of the registered one
• Service Discovery
searching for services?
• Monitoring
is a service active and responding?
• Load Balancing
direct traffic to the new service
46. How it works
• can act like a DNS
simple use case
internal network
• can write / create configs
more complex
more control
47. How it works
[Diagram: the SD agent checks the app's port and PID and reports to the service registry; metrics scraping starts and the load balancer directs traffic to the app]
48. Service discovery agent
• separate task, job, process
• you can configure what it checks
• independent of your app
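The two checks an agent typically runs (port and PID) can be sketched in Node like this. It is only an illustration of what such a check does, not how any of the real agents are implemented.

```javascript
const net = require('node:net');

// Resolve true if something accepts TCP connections on host:port.
function checkPort(port, host = '127.0.0.1', timeoutMs = 1000) {
  return new Promise((resolve) => {
    const socket = net.connect({ port, host });
    const done = (ok) => {
      socket.destroy();
      resolve(ok);
    };
    socket.setTimeout(timeoutMs, () => done(false));
    socket.once('connect', () => done(true));
    socket.once('error', () => done(false));
  });
}

// Check whether a process with this PID exists.
// Signal 0 sends nothing; it only tests for existence.
function checkPid(pid) {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    return err.code === 'EPERM'; // exists, but owned by another user
  }
}
```

A PID check alone only proves the process exists; the port check shows it is actually listening.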
49. Service discovery services
• Apache Zookeeper
• Netflix Eureka
• HashiCorp Consul
• Doozer
• Etcd (can be used to build service discovery)
52. Process supervision
• keeping your app working
• based on some property you define
not just process id, but
port
ping
http response
• can give up and report failure after several restart attempts
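The decision logic of a supervisor can be sketched as a small state machine. The thresholds and the action names (`ok`, `wait`, `restart`, `give-up`) are made up for illustration.

```javascript
// Decide what a supervisor should do after each health check
// (the check itself can be a port, ping, or HTTP probe).
function createSupervisorPolicy({ failuresBeforeRestart = 3, maxRestarts = 5 } = {}) {
  let failures = 0;
  let restarts = 0;
  return function next(healthy) {
    if (healthy) {
      failures = 0;
      return 'ok';
    }
    failures += 1;
    if (failures < failuresBeforeRestart) return 'wait'; // tolerate short blips
    failures = 0;
    if (restarts >= maxRestarts) return 'give-up'; // stop trying, alert a human
    restarts += 1;
    return 'restart';
  };
}
```

Feeding the result of each check into `next` yields the action to take; after `maxRestarts` restarts the policy stops restarting and reports failure instead.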
55. Pro and con
• using Monit
pro: monit can instantly restart your failing service
con: you might not know why it was failing
• not using Monit
pro: you can debug what actually happened
con: MTTR* can be relatively high
*Mean Time To Repair
57. Simple role
• start & stop your app
watch the process itself
handle process state
• send signals to the app
signals can be interpreted as tasks
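On the app side, "signals as tasks" can be as simple as a map from signal name to function. Which signal maps to which task is an assumption here; the program runner only needs to know which signal to send.

```javascript
// Register one task per signal. `proc` defaults to the real process
// and can be replaced with any event emitter for testing.
function installSignalTasks(tasks, proc = process) {
  for (const [signal, task] of Object.entries(tasks)) {
    proc.on(signal, () => task(signal));
  }
}
```

For example, `installSignalTasks({ SIGTERM: () => server.close(() => process.exit(0)) })` turns the runner's stop signal into a graceful shutdown, assuming `server` is your HTTP server.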
58. Running Programs in general
• runit
• upstart
• systemd
• Supervisord
• God
• Circus
59. A good program runner
• distribution independent
you can migrate your scripts any time
• easy to config
60. monit + runit (or similar)
• avoid using auto restart in both
can create weird race conditions,
they do not know about each other
• use runit to configure app start/stop
• let monit decide when to restart, and have it call runit to do it