Centralized Logging System Using MongoDB
Vivek Parihar
AVP Engineering, Webonise Lab
@vparihar

Who Am I?
● A Weboniser and Rubyist
● Blogger (vparihar01.github.com)
● MongoDB user
● Geek
● DevOps
● Mainly write Ruby, but have great passion for JavaScript and Cloud
Platforms...

Agenda
● What is Logging?
● Why do we need Logging?
● Logging Do’s and Don’ts
● Logs are Streams, Not Files
● Problems managing Logs for huge Infra
● What can a Central Logging System do for us?
● Central Logging System Architecture
● What is Fluentd, and why use it?
● Why MongoDB is a good fit

What is Logging?
Mmmm... Logging: it is the most important part of any
application.
In general, logging refers to keeping track of
something.

Logging: Helps me find and fix bugs

Logging: Extensively used for Debugging

Logging: Helps us diagnose & understand the
behaviour of the application.

Logging: Tells us exactly what happened:
when, where and why?
Who did it?
At what time?
What did he steal?

Logging: Do’s and Don’ts
#1 It should be FAST

#2 Should not affect user
Prevent DISK BLOAT

#3 Do Log only useful INFO
It should not be like:
{
● "#########its working#########"
● "!!!!!coming here in to get secondary users!!!!!"
● "#########I am Here#########"
● "#########Task completed#######"
}

#4 Differentiate Log Levels
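
As a rough illustration of rules #3 and #4, here is a minimal Python sketch using the standard logging module. The logger name "orders", the helper build_logger and the messages are made up for this example; the point is only that each message carries a level and useful fields, and that anything below the configured level is dropped.

```python
import io
import logging


def build_logger(name, stream, level=logging.INFO):
    """Return a logger that writes 'LEVEL message' lines to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.handlers.clear()      # avoid duplicate handlers on re-run
    logger.propagate = False     # keep output confined to our handler
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()           # stands in for the real log stream
log = build_logger("orders", buffer, level=logging.INFO)

log.debug("cart contents: %s", ["sku-1", "sku-2"])  # dropped: below INFO
log.info("order placed id=%s", 42)
log.warning("payment retry attempt=%d", 2)
log.error("payment failed id=%s", 42)

lines = buffer.getvalue().splitlines()
```

Raising the level to WARNING in production would keep only the last two lines, without touching the call sites.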

Logs Are Streams, Not Files
Logs are a stream, and it behooves everyone to treat them as such. Your
programs should log to stdout and/or stderr and omit any attempt to handle
log paths, log rotation, or sending logs over the syslog protocol.
Directing where the program’s log stream goes can be left up to the runtime
container: a local terminal or IDE (in development environments), an Upstart
/ Systemd launch script (in traditional hosting environments), or a system
like Logplex/Heroku (in a platform environment).
By: Adam Wiggins, Heroku co-founder.

Problems managing Logs for huge Infra

What about infra like this?

How can we solve the huge Infra problem?

Solution: Centralized Logging System

What can a Centralized Logging System do
for us?

#1 Log Collection
All of the logs are in one place, which makes searching through logs
and analysis across multiple servers easier than bouncing around
between boxes. This greatly simplifies log analysis and correlation
tasks.

#2 Aggregation
Scaled-out servers behind load balancers each produce their
own log files, making it impossible to debug a single action flow
that is distributed between servers, unless the logs converge into
a single place.

#3 High Availability
Suppose your system is down or overloaded and unable to tell
you what happened. Logs kept on a central server are still
available.

#4 Security
Local logs from the server may be lost in the event of an
intrusion or system failure. But by having the logs elsewhere
you at least have a chance of finding something useful about
what happened.

#5 Prevent Disk BLOAT
It reduces disk space usage and disk I/O on core servers
that should be busy doing something else.

#6 Visual Indicators
Abnormal behaviors can be detected faster when we see
them in a visual instrument such as a graph, where peak
points are easily noticed.

Centralized Logging System Architecture

What’s Fluentd?
It’s like syslogd, but uses JSON for log messages

A Fluentd event has three parts:
● time
● tag
● record
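
To make the three parts concrete, here is a small Python sketch that bundles a log entry as time, tag and record. It only mirrors the shape of an event; it is not Fluentd's wire format or client API, and the helper make_event, the tag "app.access" and the record fields are invented for illustration.

```python
import json
import time


def make_event(tag, record, timestamp=None):
    """Bundle a log entry as time + tag + record."""
    if timestamp is None:
        timestamp = int(time.time())   # seconds since the epoch
    return {"time": timestamp, "tag": tag, "record": record}


event = make_event(
    "app.access",                                      # tag: used for routing
    {"method": "GET", "path": "/login", "status": 200},  # record: JSON payload
    timestamp=1700000000,                              # fixed so the example is repeatable
)
encoded = json.dumps(event, sort_keys=True)
```

Because the record is plain JSON, every downstream consumer can read it without a per-application parser.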

[Figure: Fluentd and its plug-ins]

So Fluentd is a:
● Buffer
● Router
● Collector
● Converter
● Aggregator
● ...

Why Fluentd?
● It’s written in Ruby :)
● Extensibility - Plugin Architecture
● Unified log format - JSON format
● Reliable - HA configuration
● Easy to install - RPM/deb packages
> sudo fluentd --setup && fluentd
● Very small footprint
> small engine (3,000 lines) + plugins

Why MongoDB?

1. It’s Schemaless
Document-oriented / JSON is a great format for log
information. It is very flexible and “schemaless” in the
sense that we can throw in an extra field any time we want.

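2. Fire and Forget
MongoDB inserts can be done asynchronously (unacknowledged
writes, write concern w: 0), so the application does not wait
for the log write to be confirmed.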

3. Scalable and easy to replicate
Built-in replica sets and sharding provide high availability and
scalability.

4. Centralized and easy remote access

5. Capped Collection
● They "remember" the insertion order of their documents
● They store inserted documents in the insertion order on disk
● They remove the oldest documents in the collection automatically as new
documents are inserted
However, you give up some things with capped collections:
● They have a fixed maximum size
● You cannot shard a capped collection
● Any updates to documents in a capped collection must not cause a document to
grow (i.e. not all $set operations will work, and no $push or $pushAll will)
● You may not explicitly .remove() documents from a capped collection
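
A capped collection for logs could be set up roughly as follows. This is a sketch, assuming `db` behaves like a PyMongo Database object; the helper name ensure_capped_log_collection, the collection name "app_logs" and the 64 MB size are arbitrary choices for the example, not values from the talk.

```python
def ensure_capped_log_collection(db, name="app_logs", size_bytes=64 * 1024 * 1024):
    """Create a capped collection for logs unless it already exists.

    `db` is assumed to behave like a pymongo Database: it offers
    list_collection_names(), create_collection() and item access.
    """
    if name in db.list_collection_names():
        return db[name]              # reuse; the size of an existing one is kept
    # capped=True with a fixed size makes MongoDB discard the oldest
    # documents automatically once the collection is full.
    return db.create_collection(name, capped=True, size=size_bytes)


# Usage against a real server (requires pymongo and a running mongod):
#   from pymongo import MongoClient
#   db = MongoClient("mongodb://localhost:27017")["logging"]
#   logs = ensure_capped_log_collection(db)
#   logs.insert_one({"level": "INFO", "message": "order placed"})
```

Sizing the collection is the main design decision: it bounds disk usage, but it also bounds how far back the logs reach.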

6. Tailing Logs
● You’ll really miss the ability to tail logfiles
● Or... will you?
● MongoDB offers tailable cursors

Tailable Cursors
What can we do with Tailable Cursors?
We can implement pub/sub using
Node.js and MongoDB
https://github.com/scttnlsn/mubsub
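
The talk points to a Node.js library for this. As a language swap, the same "tail -f" idea can be sketched in Python, assuming `collection` behaves like a PyMongo collection backed by a capped collection. The function name tail and the parameters poll_seconds and max_idle_polls are inventions of this example; against a real server the caller would pass pymongo.CursorType.TAILABLE_AWAIT as cursor_type.

```python
import time


def tail(collection, cursor_type, handle, poll_seconds=1.0, max_idle_polls=None):
    """Pass each newly inserted document of a capped collection to `handle`.

    `cursor_type` is a parameter so that this sketch needs only the
    standard library; with PyMongo it would be CursorType.TAILABLE_AWAIT.
    """
    cursor = collection.find({}, cursor_type=cursor_type)
    idle_polls = 0
    while cursor.alive:
        received = False
        for document in cursor:      # drains whatever is currently available
            handle(document)
            received = True
        if received:
            idle_polls = 0
            continue
        idle_polls += 1
        if max_idle_polls is not None and idle_polls >= max_idle_polls:
            break                    # optional stop condition for scripts and tests
        time.sleep(poll_seconds)     # nothing new yet; wait before polling again


# Usage against a real server (requires pymongo and a capped collection):
#   from pymongo import CursorType
#   tail(db["app_logs"], CursorType.TAILABLE_AWAIT, print)
```

Each subscriber running such a loop sees documents in insertion order, which is what makes a capped collection usable as a simple pub/sub channel.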

Thanks
Would love to answer your queries...
Vivek Parihar
@vparihar
