SlideShare a Scribd company logo
1 of 41
Download to read offline
Distributed Logging Architecture
in Container Era
LinuxCon Japan 2016 at Jun 13 2016
Satoshi "Moris" Tagomori (@tagomoris)
Satoshi "Moris" Tagomori
(@tagomoris)
Fluentd, MessagePack-Ruby, Norikra, ...
Treasure Data, Inc.
http://www.linuxfoundation.org/news-media/announcements/2016/06/chaosuan-crunchy-data-qbox-storageos-and-treasure-data-join-cloud
Topics
• Microservices and logging in various industries
• Difficulties of logging with containers
• Distributed logging architecture
• Patterns of distributed logging architecture
• Case Study: Docker and Fluentd
Logging
Logging in Various Industries
• Web access logs
• Views/visitors on media
• Views/clicks on Ads
• Commercial transactions (EC, Game, ...)
• Data from devices
• Operation logs on Apps of phones
• Various sensor data
Microservices and Logging
• Monolithic service
• a service produces all data
about an user's behavior
• Microservices
• many services produce data
about an user's access
• it's needed to collect logs
from many services to know
what is happening
Users
Service (Application)
Logs
Users
Logs
Logging and Containers
Containers:
"a must" for microservices
• Dividing a service into services
• a service requires less computing resources

(VM -> containers)
• Making services independent from each other
• but it is very difficult :(
• some dependency must be solved even in
development environment

(containers on desktop)
Redesign Logging: Why?
• No permanent storages
• No fixed physical/network address
• No fixed mapping between servers and roles
• We should parse/label logs at the source, ship
these logs by pushing to destination ASAP
Containers:
immutable & disposable
• No permanent storages
• Where to write logs?
• files in the container

→ gone w/ container instance 😞
• directories shared from hosts

→ hosts are shared by many containers/services
☹
• TODO: ship logs from container to anywhere ASAP
Containers:
unfixed addresses
• No fixed physical / network address
• Where should we go to fetch logs?
• Service discovery (e.g., consul)

→ one more component 😞
• rsync? ssh+tail? or ..? Is it installed in containers?

→ one more tool to depend on ☹
• TODO: push logs to anywhere from containers
Containers:
instances per roles
• No fixed mapping between servers and roles
• How can we parse / store these logs?
• Central repository about log syntax

→ very hard to maintain 😞
• Label logs by source address

→ many containers/roles in a host ☹
• TODO: label & parse logs at source of logs
Distributed Logging
Architecture
Core Architecture
• Collector nodes
• Aggregator nodes
• Destinations
Collector nodes
(Docker containers + agent)
Destinations

(Storage, Database, ...)
Aggregator nodes
• Parse/Label (collector)
• Raw logs are not good for processing
• Convert logs to structured data (key-value pairs)
• Split/Sort (aggregator)
• Mixed logs are not good for searching
• Split whole data stream into streams per services
• Store (destination)
• Format logs(records) as destination expects
Collecting and Storing Data
Scaling Logging
• Network traffic
• CPU load to parse / format
• Parse logs on each collector (distributed)
• Format logs on aggregator (to be distributed)
• Capability
• Make aggregators redundant
• Controlling delay
• to make sure when we can know what's happening in our
systems
Patterns
source aggregation
NO
source aggregation
YES
destination
aggregation
NO
destination
aggregation
YES
Aggregation Patterns
Source Side Aggregation Patterns
w/o source aggregation w/ source aggregation
collector
aggregator
/
destination
aggregate
container
Without Source Aggregation
• Pros:
• Simple configuration
• Cons:
• fixed aggregator (endpoint) address
• many network connections
• high load in aggregator
collector
aggregator
With Source Aggregation
• Pros:
• less connections
• lower load in aggregator
• less configuration in containers

(by specifying localhost)
• highly flexible configuration

(by deployment only of aggregate containers)
• Cons:
• a bit much resource (+1 container per host)
aggregate
container
aggregator
Destination Side Aggregation Patterns
w/o destination aggregation w/ destination aggregation
aggregator
collector
destination
Without Destination Aggregation
• Pros:
• Less nodes
• Simpler configuration
• Cons:
• Storage side change affects collector side
• Worse performance: many small write requests
on storage
With Destination Aggregation
• Pros:
• Collector side configuration is

free from storage side changes
• Better performance with fine tune

on destination side aggregator
• Cons:
• More nodes
• A bit complex configuration
aggregator
Scaling Patterns
Scaling Up Endpoints
HTTP/TCP load balancer
Huge queue + workers
Scaling Out Endpoints
Round-robin clients
Load balancer
Backend nodes
Collector nodes
Aggregator nodes
Scaling Up Endpoints
• Pros:
• Simple configuration

in collector nodes
• Cons:
• Limits about scaling up
Load balancer
Backend nodes
Scaling Out Endpoints
• Pros:
• Unlimited scaling

by adding aggregator nodes
• Cons:
• Complex configuration
• Client features for round-robin
Without

Destination Aggregation
With

Destination Aggregation
Scaling Up
Endpoints
Systems in early stages
Collecting logs over
Internet
or
Using queues
Scaling Out
Endpoints
Impossible :(
Collector nodes must know
all endpoints
↓
Uncontrollable
Collecting logs
in datacenter
Case Studies
Case Study: Docker+Fluentd
• Destination aggregation + scaling up
• Fluent logger + Fluentd
• Source aggregation + scaling up
• Docker json logger + Fluentd + Elasticsearch
• Docker fluentd logger + Fluentd + Kafka
• Source/Destination aggregation + scaling out
• Docker fluentd logger + Fluentd
Why Fluentd?
• Docker Fluentd logging driver
• Docker containers can send logs to Fluentd
directly - less overhead
• Pluggable architecture
• Various destination systems
• Small memory footprint
• Source aggregation requires +1 container per host
• Less additional resource usage ( < 100MB )
Destination aggregation + scaling up
• Sending logs directly over TCP by Fluentd logger
library in application code
• Same with patterns of New Relic
• Easy to implement

- good for startups Application code
Source aggregation + scaling up
• Kubernetes: Json logger + Fluentd + Elasticsearch
• Applications write logs to STDOUT
• Docker writes logs as JSON in files
• Fluentd

reads logs from file

parse JSON objects

writes logs to Elasticsearch
• EFK stack (like ELK stack)
http://kubernetes.io/docs/getting-started-guides/logging-elasticsearch/
Elasticsearch
Application code
Files (JSON)
Source aggregation + scaling up/out
• Docker fluentd logging driver + Fluentd + Kafka
• Applications write logs to STDOUT
• Docker sends logs

to localhost Fluentd
• Fluentd

gets logs over TCP

pushes logs into Kafka
• Highly scalable & less overhead

- very good for huge deployment
Kafka
Application code
Application code
Source/Destination aggregation +
scaling out
• Docker fluentd logging driver + Fluentd
• Applications write logs to STDOUT
• Docker sends logs

to localhost Fluentd
• Fluentd

gets logs over TCP

sends logs into Aggregator Fluentd

w/ round-robin load balance
• Highly flexible

- good for complex data processing

requirements
Any other storages
What's the Best?
• Writing logs from containers: Some way to do it
• Docker logging driver
• Write logs on files + read/parse it
• Send logs from apps directly
• Make the platform scalable!
• Source aggregation: Fluentd on localhost
• Scalable storage: (Kafka, external services, ...)
• No destination aggregation + Scaling up
• Non-scalable storage: (Filesystems, RDBMSs, ...)
• Destination aggregation + Scaling out
Why OSS Are Important
For Logging?
Why OSS?
• Logging layer is interface
• transparency
• interoperability
• Keep the platform scalable
• number of nodes
• number of types of source/destination
Use OSS,
Make Logging Scalable
Thank you!

More Related Content

What's hot

Presto At Treasure Data
Presto At Treasure DataPresto At Treasure Data
Presto At Treasure DataTaro L. Saito
 
Overview of data analytics service: Treasure Data Service
Overview of data analytics service: Treasure Data ServiceOverview of data analytics service: Treasure Data Service
Overview of data analytics service: Treasure Data ServiceSATOSHI TAGOMORI
 
How to ensure Presto scalability 
in multi use case
How to ensure Presto scalability 
in multi use case How to ensure Presto scalability 
in multi use case
How to ensure Presto scalability 
in multi use case Kai Sasaki
 
Introduction to Presto at Treasure Data
Introduction to Presto at Treasure DataIntroduction to Presto at Treasure Data
Introduction to Presto at Treasure DataTaro L. Saito
 
Hoodie: How (And Why) We built an analytical datastore on Spark
Hoodie: How (And Why) We built an analytical datastore on SparkHoodie: How (And Why) We built an analytical datastore on Spark
Hoodie: How (And Why) We built an analytical datastore on SparkVinoth Chandar
 
Ruby and Distributed Storage Systems
Ruby and Distributed Storage SystemsRuby and Distributed Storage Systems
Ruby and Distributed Storage SystemsSATOSHI TAGOMORI
 
Fluentd - Flexible, Stable, Scalable
Fluentd - Flexible, Stable, ScalableFluentd - Flexible, Stable, Scalable
Fluentd - Flexible, Stable, ScalableShu Ting Tseng
 
Treasure Data and AWS - Developers.io 2015
Treasure Data and AWS - Developers.io 2015Treasure Data and AWS - Developers.io 2015
Treasure Data and AWS - Developers.io 2015N Masahiro
 
Presto updates to 0.178
Presto updates to 0.178Presto updates to 0.178
Presto updates to 0.178Kai Sasaki
 
Planet-scale Data Ingestion Pipeline: Bigdam
Planet-scale Data Ingestion Pipeline: BigdamPlanet-scale Data Ingestion Pipeline: Bigdam
Planet-scale Data Ingestion Pipeline: BigdamSATOSHI TAGOMORI
 
Prestogres, ODBC & JDBC connectivity for Presto
Prestogres, ODBC & JDBC connectivity for PrestoPrestogres, ODBC & JDBC connectivity for Presto
Prestogres, ODBC & JDBC connectivity for PrestoSadayuki Furuhashi
 
Presto anatomy
Presto anatomyPresto anatomy
Presto anatomyDongmin Yu
 
Lambda Architecture Using SQL
Lambda Architecture Using SQLLambda Architecture Using SQL
Lambda Architecture Using SQLSATOSHI TAGOMORI
 
Presto at Twitter
Presto at TwitterPresto at Twitter
Presto at TwitterBill Graham
 
Plazma - Treasure Data’s distributed analytical database -
Plazma - Treasure Data’s distributed analytical database -Plazma - Treasure Data’s distributed analytical database -
Plazma - Treasure Data’s distributed analytical database -Treasure Data, Inc.
 
Data Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby UsageData Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby UsageSATOSHI TAGOMORI
 
User Defined Partitioning on PlazmaDB
User Defined Partitioning on PlazmaDBUser Defined Partitioning on PlazmaDB
User Defined Partitioning on PlazmaDBKai Sasaki
 
Presto at Facebook - Presto Meetup @ Boston (10/6/2015)
Presto at Facebook - Presto Meetup @ Boston (10/6/2015)Presto at Facebook - Presto Meetup @ Boston (10/6/2015)
Presto at Facebook - Presto Meetup @ Boston (10/6/2015)Martin Traverso
 

What's hot (20)

Presto At Treasure Data
Presto At Treasure DataPresto At Treasure Data
Presto At Treasure Data
 
Overview of data analytics service: Treasure Data Service
Overview of data analytics service: Treasure Data ServiceOverview of data analytics service: Treasure Data Service
Overview of data analytics service: Treasure Data Service
 
How to ensure Presto scalability 
in multi use case
How to ensure Presto scalability 
in multi use case How to ensure Presto scalability 
in multi use case
How to ensure Presto scalability 
in multi use case
 
Introduction to Presto at Treasure Data
Introduction to Presto at Treasure DataIntroduction to Presto at Treasure Data
Introduction to Presto at Treasure Data
 
Hoodie: How (And Why) We built an analytical datastore on Spark
Hoodie: How (And Why) We built an analytical datastore on SparkHoodie: How (And Why) We built an analytical datastore on Spark
Hoodie: How (And Why) We built an analytical datastore on Spark
 
Ruby and Distributed Storage Systems
Ruby and Distributed Storage SystemsRuby and Distributed Storage Systems
Ruby and Distributed Storage Systems
 
Fluentd - Flexible, Stable, Scalable
Fluentd - Flexible, Stable, ScalableFluentd - Flexible, Stable, Scalable
Fluentd - Flexible, Stable, Scalable
 
Treasure Data and AWS - Developers.io 2015
Treasure Data and AWS - Developers.io 2015Treasure Data and AWS - Developers.io 2015
Treasure Data and AWS - Developers.io 2015
 
Presto updates to 0.178
Presto updates to 0.178Presto updates to 0.178
Presto updates to 0.178
 
Planet-scale Data Ingestion Pipeline: Bigdam
Planet-scale Data Ingestion Pipeline: BigdamPlanet-scale Data Ingestion Pipeline: Bigdam
Planet-scale Data Ingestion Pipeline: Bigdam
 
Prestogres, ODBC & JDBC connectivity for Presto
Prestogres, ODBC & JDBC connectivity for PrestoPrestogres, ODBC & JDBC connectivity for Presto
Prestogres, ODBC & JDBC connectivity for Presto
 
Presto anatomy
Presto anatomyPresto anatomy
Presto anatomy
 
Internals of Presto Service
Internals of Presto ServiceInternals of Presto Service
Internals of Presto Service
 
Lambda Architecture Using SQL
Lambda Architecture Using SQLLambda Architecture Using SQL
Lambda Architecture Using SQL
 
Presto at Twitter
Presto at TwitterPresto at Twitter
Presto at Twitter
 
Presto+MySQLで分散SQL
Presto+MySQLで分散SQLPresto+MySQLで分散SQL
Presto+MySQLで分散SQL
 
Plazma - Treasure Data’s distributed analytical database -
Plazma - Treasure Data’s distributed analytical database -Plazma - Treasure Data’s distributed analytical database -
Plazma - Treasure Data’s distributed analytical database -
 
Data Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby UsageData Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby Usage
 
User Defined Partitioning on PlazmaDB
User Defined Partitioning on PlazmaDBUser Defined Partitioning on PlazmaDB
User Defined Partitioning on PlazmaDB
 
Presto at Facebook - Presto Meetup @ Boston (10/6/2015)
Presto at Facebook - Presto Meetup @ Boston (10/6/2015)Presto at Facebook - Presto Meetup @ Boston (10/6/2015)
Presto at Facebook - Presto Meetup @ Boston (10/6/2015)
 

Viewers also liked

Modern Black Mages Fighting in the Real World
Modern Black Mages Fighting in the Real WorldModern Black Mages Fighting in the Real World
Modern Black Mages Fighting in the Real WorldSATOSHI TAGOMORI
 
How To Write Middleware In Ruby
How To Write Middleware In RubyHow To Write Middleware In Ruby
How To Write Middleware In RubySATOSHI TAGOMORI
 
Fighting API Compatibility On Fluentd Using "Black Magic"
Fighting API Compatibility On Fluentd Using "Black Magic"Fighting API Compatibility On Fluentd Using "Black Magic"
Fighting API Compatibility On Fluentd Using "Black Magic"SATOSHI TAGOMORI
 
20160730 fluentd meetup in matsue slide
20160730 fluentd meetup in matsue slide20160730 fluentd meetup in matsue slide
20160730 fluentd meetup in matsue slidecosmo0920
 
Fluentd Overview, Now and Then
Fluentd Overview, Now and ThenFluentd Overview, Now and Then
Fluentd Overview, Now and ThenSATOSHI TAGOMORI
 
The Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersThe Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersSATOSHI TAGOMORI
 
AWSにおけるバッチ処理の ベストプラクティス - Developers.IO Meetup 05
AWSにおけるバッチ処理の ベストプラクティス - Developers.IO Meetup 05AWSにおけるバッチ処理の ベストプラクティス - Developers.IO Meetup 05
AWSにおけるバッチ処理の ベストプラクティス - Developers.IO Meetup 05都元ダイスケ Miyamoto
 
Fluentd v0.14 Plugin API Details
Fluentd v0.14 Plugin API DetailsFluentd v0.14 Plugin API Details
Fluentd v0.14 Plugin API DetailsSATOSHI TAGOMORI
 
Game Over - HTML5 Games
Game Over - HTML5 GamesGame Over - HTML5 Games
Game Over - HTML5 GamesGuido Garcia
 
Say no to var_dump
Say no to var_dumpSay no to var_dump
Say no to var_dumpbenwaine
 
Powerupcloud - Customer Case Studies
Powerupcloud - Customer Case StudiesPowerupcloud - Customer Case Studies
Powerupcloud - Customer Case StudiesJonathyan Maas ☁
 
Yellowstone National Park and Grand Teton, USA
Yellowstone  National Park and Grand Teton, USAYellowstone  National Park and Grand Teton, USA
Yellowstone National Park and Grand Teton, USACherie Ng
 
Building Awesome APIs with Lumen
Building Awesome APIs with LumenBuilding Awesome APIs with Lumen
Building Awesome APIs with LumenKit Brennan
 
Finding HMAS Sydney Chapter 5 - Kormoran Database & the Mathematics of Reliab...
Finding HMAS Sydney Chapter 5 - Kormoran Database & the Mathematics of Reliab...Finding HMAS Sydney Chapter 5 - Kormoran Database & the Mathematics of Reliab...
Finding HMAS Sydney Chapter 5 - Kormoran Database & the Mathematics of Reliab...Elk Software Group
 
Creating a personal narrative
Creating a personal narrativeCreating a personal narrative
Creating a personal narrativeEmily Kissner
 
Service Orchestrierung mit Apache Mesos
Service Orchestrierung mit Apache MesosService Orchestrierung mit Apache Mesos
Service Orchestrierung mit Apache MesosRalf Ernst
 
Chicago AWS user group meetup - May 2014 at Cohesive
Chicago AWS user group meetup - May 2014 at CohesiveChicago AWS user group meetup - May 2014 at Cohesive
Chicago AWS user group meetup - May 2014 at CohesiveCloudCamp Chicago
 

Viewers also liked (20)

Modern Black Mages Fighting in the Real World
Modern Black Mages Fighting in the Real WorldModern Black Mages Fighting in the Real World
Modern Black Mages Fighting in the Real World
 
How To Write Middleware In Ruby
How To Write Middleware In RubyHow To Write Middleware In Ruby
How To Write Middleware In Ruby
 
Fighting API Compatibility On Fluentd Using "Black Magic"
Fighting API Compatibility On Fluentd Using "Black Magic"Fighting API Compatibility On Fluentd Using "Black Magic"
Fighting API Compatibility On Fluentd Using "Black Magic"
 
20160730 fluentd meetup in matsue slide
20160730 fluentd meetup in matsue slide20160730 fluentd meetup in matsue slide
20160730 fluentd meetup in matsue slide
 
Fluentd Overview, Now and Then
Fluentd Overview, Now and ThenFluentd Overview, Now and Then
Fluentd Overview, Now and Then
 
The Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersThe Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and Containers
 
AWSにおけるバッチ処理の ベストプラクティス - Developers.IO Meetup 05
AWSにおけるバッチ処理の ベストプラクティス - Developers.IO Meetup 05AWSにおけるバッチ処理の ベストプラクティス - Developers.IO Meetup 05
AWSにおけるバッチ処理の ベストプラクティス - Developers.IO Meetup 05
 
Fluentd v0.14 Plugin API Details
Fluentd v0.14 Plugin API DetailsFluentd v0.14 Plugin API Details
Fluentd v0.14 Plugin API Details
 
Game Over - HTML5 Games
Game Over - HTML5 GamesGame Over - HTML5 Games
Game Over - HTML5 Games
 
Say no to var_dump
Say no to var_dumpSay no to var_dump
Say no to var_dump
 
Powerupcloud - Customer Case Studies
Powerupcloud - Customer Case StudiesPowerupcloud - Customer Case Studies
Powerupcloud - Customer Case Studies
 
Yellowstone National Park and Grand Teton, USA
Yellowstone  National Park and Grand Teton, USAYellowstone  National Park and Grand Teton, USA
Yellowstone National Park and Grand Teton, USA
 
Building Awesome APIs with Lumen
Building Awesome APIs with LumenBuilding Awesome APIs with Lumen
Building Awesome APIs with Lumen
 
Finding HMAS Sydney Chapter 5 - Kormoran Database & the Mathematics of Reliab...
Finding HMAS Sydney Chapter 5 - Kormoran Database & the Mathematics of Reliab...Finding HMAS Sydney Chapter 5 - Kormoran Database & the Mathematics of Reliab...
Finding HMAS Sydney Chapter 5 - Kormoran Database & the Mathematics of Reliab...
 
Creating a personal narrative
Creating a personal narrativeCreating a personal narrative
Creating a personal narrative
 
Progressive tenses
Progressive tensesProgressive tenses
Progressive tenses
 
Service Orchestrierung mit Apache Mesos
Service Orchestrierung mit Apache MesosService Orchestrierung mit Apache Mesos
Service Orchestrierung mit Apache Mesos
 
Introduction to ICS/SCADA security
Introduction to ICS/SCADA securityIntroduction to ICS/SCADA security
Introduction to ICS/SCADA security
 
Chicago AWS user group meetup - May 2014 at Cohesive
Chicago AWS user group meetup - May 2014 at CohesiveChicago AWS user group meetup - May 2014 at Cohesive
Chicago AWS user group meetup - May 2014 at Cohesive
 
DevOps at Crevise Technologies
DevOps at Crevise TechnologiesDevOps at Crevise Technologies
DevOps at Crevise Technologies
 

Similar to Distributed Logging Architecture in Container Era

Docker and Fluentd
Docker and FluentdDocker and Fluentd
Docker and FluentdN Masahiro
 
Open Source SQL Databases
Open Source SQL DatabasesOpen Source SQL Databases
Open Source SQL DatabasesEmanuel Calvo
 
Fluentd and Distributed Logging at Kubecon
Fluentd and Distributed Logging at KubeconFluentd and Distributed Logging at Kubecon
Fluentd and Distributed Logging at KubeconN Masahiro
 
Apache Geode Meetup, London
Apache Geode Meetup, LondonApache Geode Meetup, London
Apache Geode Meetup, LondonApache Geode
 
ApacheCon Core: Service Discovery in OSGi: Beyond the JVM using Docker and Co...
ApacheCon Core: Service Discovery in OSGi: Beyond the JVM using Docker and Co...ApacheCon Core: Service Discovery in OSGi: Beyond the JVM using Docker and Co...
ApacheCon Core: Service Discovery in OSGi: Beyond the JVM using Docker and Co...Frank Lyaruu
 
Reactive Development: Commands, Actors and Events. Oh My!!
Reactive Development: Commands, Actors and Events.  Oh My!!Reactive Development: Commands, Actors and Events.  Oh My!!
Reactive Development: Commands, Actors and Events. Oh My!!David Hoerster
 
Gib 2021 - Intro to BizTalk Migrator
Gib 2021 - Intro to BizTalk MigratorGib 2021 - Intro to BizTalk Migrator
Gib 2021 - Intro to BizTalk MigratorDaniel Toomey
 
Conceptos básicos. Seminario web 6: Despliegue de producción
Conceptos básicos. Seminario web 6: Despliegue de producciónConceptos básicos. Seminario web 6: Despliegue de producción
Conceptos básicos. Seminario web 6: Despliegue de producciónMongoDB
 
Big Data (NJ SQL Server User Group)
Big Data (NJ SQL Server User Group)Big Data (NJ SQL Server User Group)
Big Data (NJ SQL Server User Group)Don Demcsak
 
MongoDB and Machine Learning with Flowable
MongoDB and Machine Learning with FlowableMongoDB and Machine Learning with Flowable
MongoDB and Machine Learning with FlowableFlowable
 
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)Emprovise
 
When to Use MongoDB
When to Use MongoDBWhen to Use MongoDB
When to Use MongoDBMongoDB
 
The Wix Microservice Stack
The Wix Microservice StackThe Wix Microservice Stack
The Wix Microservice StackTomer Gabel
 
Latest (storage IO) patterns for cloud-native applications
Latest (storage IO) patterns for cloud-native applications Latest (storage IO) patterns for cloud-native applications
Latest (storage IO) patterns for cloud-native applications OpenEBS
 
Centralizing Kubernetes and Container Operations
Centralizing Kubernetes and Container OperationsCentralizing Kubernetes and Container Operations
Centralizing Kubernetes and Container OperationsKublr
 
Building real time data-driven products
Building real time data-driven productsBuilding real time data-driven products
Building real time data-driven productsLars Albertsson
 
FiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
FiloDB: Reactive, Real-Time, In-Memory Time Series at ScaleFiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
FiloDB: Reactive, Real-Time, In-Memory Time Series at ScaleEvan Chan
 

Similar to Distributed Logging Architecture in Container Era (20)

Docker and Fluentd
Docker and FluentdDocker and Fluentd
Docker and Fluentd
 
Open Source SQL Databases
Open Source SQL DatabasesOpen Source SQL Databases
Open Source SQL Databases
 
Fluentd and Distributed Logging at Kubecon
Fluentd and Distributed Logging at KubeconFluentd and Distributed Logging at Kubecon
Fluentd and Distributed Logging at Kubecon
 
Containers and Logging
Containers and LoggingContainers and Logging
Containers and Logging
 
Logging for Containers
Logging for ContainersLogging for Containers
Logging for Containers
 
Apache Geode Meetup, London
Apache Geode Meetup, LondonApache Geode Meetup, London
Apache Geode Meetup, London
 
ApacheCon Core: Service Discovery in OSGi: Beyond the JVM using Docker and Co...
ApacheCon Core: Service Discovery in OSGi: Beyond the JVM using Docker and Co...ApacheCon Core: Service Discovery in OSGi: Beyond the JVM using Docker and Co...
ApacheCon Core: Service Discovery in OSGi: Beyond the JVM using Docker and Co...
 
Timesten Architecture
Timesten ArchitectureTimesten Architecture
Timesten Architecture
 
Reactive Development: Commands, Actors and Events. Oh My!!
Reactive Development: Commands, Actors and Events.  Oh My!!Reactive Development: Commands, Actors and Events.  Oh My!!
Reactive Development: Commands, Actors and Events. Oh My!!
 
Gib 2021 - Intro to BizTalk Migrator
Gib 2021 - Intro to BizTalk MigratorGib 2021 - Intro to BizTalk Migrator
Gib 2021 - Intro to BizTalk Migrator
 
Conceptos básicos. Seminario web 6: Despliegue de producción
Conceptos básicos. Seminario web 6: Despliegue de producciónConceptos básicos. Seminario web 6: Despliegue de producción
Conceptos básicos. Seminario web 6: Despliegue de producción
 
Big Data (NJ SQL Server User Group)
Big Data (NJ SQL Server User Group)Big Data (NJ SQL Server User Group)
Big Data (NJ SQL Server User Group)
 
MongoDB and Machine Learning with Flowable
MongoDB and Machine Learning with FlowableMongoDB and Machine Learning with Flowable
MongoDB and Machine Learning with Flowable
 
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
 
When to Use MongoDB
When to Use MongoDBWhen to Use MongoDB
When to Use MongoDB
 
The Wix Microservice Stack
The Wix Microservice StackThe Wix Microservice Stack
The Wix Microservice Stack
 
Latest (storage IO) patterns for cloud-native applications
Latest (storage IO) patterns for cloud-native applications Latest (storage IO) patterns for cloud-native applications
Latest (storage IO) patterns for cloud-native applications
 
Centralizing Kubernetes and Container Operations
Centralizing Kubernetes and Container OperationsCentralizing Kubernetes and Container Operations
Centralizing Kubernetes and Container Operations
 
Building real time data-driven products
Building real time data-driven productsBuilding real time data-driven products
Building real time data-driven products
 
FiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
FiloDB: Reactive, Real-Time, In-Memory Time Series at ScaleFiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
FiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
 

More from SATOSHI TAGOMORI

Ractor's speed is not light-speed
Ractor's speed is not light-speedRactor's speed is not light-speed
Ractor's speed is not light-speedSATOSHI TAGOMORI
 
Good Things and Hard Things of SaaS Development/Operations
Good Things and Hard Things of SaaS Development/OperationsGood Things and Hard Things of SaaS Development/Operations
Good Things and Hard Things of SaaS Development/OperationsSATOSHI TAGOMORI
 
Invitation to the dark side of Ruby
Invitation to the dark side of RubyInvitation to the dark side of Ruby
Invitation to the dark side of RubySATOSHI TAGOMORI
 
Hijacking Ruby Syntax in Ruby (RubyConf 2018)
Hijacking Ruby Syntax in Ruby (RubyConf 2018)Hijacking Ruby Syntax in Ruby (RubyConf 2018)
Hijacking Ruby Syntax in Ruby (RubyConf 2018)SATOSHI TAGOMORI
 
Make Your Ruby Script Confusing
Make Your Ruby Script ConfusingMake Your Ruby Script Confusing
Make Your Ruby Script ConfusingSATOSHI TAGOMORI
 
Hijacking Ruby Syntax in Ruby
Hijacking Ruby Syntax in RubyHijacking Ruby Syntax in Ruby
Hijacking Ruby Syntax in RubySATOSHI TAGOMORI
 
Lock, Concurrency and Throughput of Exclusive Operations
Lock, Concurrency and Throughput of Exclusive OperationsLock, Concurrency and Throughput of Exclusive Operations
Lock, Concurrency and Throughput of Exclusive OperationsSATOSHI TAGOMORI
 
Data Processing and Ruby in the World
Data Processing and Ruby in the WorldData Processing and Ruby in the World
Data Processing and Ruby in the WorldSATOSHI TAGOMORI
 
Hive dirty/beautiful hacks in TD
Hive dirty/beautiful hacks in TDHive dirty/beautiful hacks in TD
Hive dirty/beautiful hacks in TDSATOSHI TAGOMORI
 
Data-Driven Development Era and Its Technologies
Data-Driven Development Era and Its TechnologiesData-Driven Development Era and Its Technologies
Data-Driven Development Era and Its TechnologiesSATOSHI TAGOMORI
 
Engineer as a Leading Role
Engineer as a Leading RoleEngineer as a Leading Role
Engineer as a Leading RoleSATOSHI TAGOMORI
 

More from SATOSHI TAGOMORI (13)

Ractor's speed is not light-speed
Ractor's speed is not light-speedRactor's speed is not light-speed
Ractor's speed is not light-speed
 
Good Things and Hard Things of SaaS Development/Operations
Good Things and Hard Things of SaaS Development/OperationsGood Things and Hard Things of SaaS Development/Operations
Good Things and Hard Things of SaaS Development/Operations
 
Maccro Strikes Back
Maccro Strikes BackMaccro Strikes Back
Maccro Strikes Back
 
Invitation to the dark side of Ruby
Invitation to the dark side of RubyInvitation to the dark side of Ruby
Invitation to the dark side of Ruby
 
Hijacking Ruby Syntax in Ruby (RubyConf 2018)
Hijacking Ruby Syntax in Ruby (RubyConf 2018)Hijacking Ruby Syntax in Ruby (RubyConf 2018)
Hijacking Ruby Syntax in Ruby (RubyConf 2018)
 
Make Your Ruby Script Confusing
Make Your Ruby Script ConfusingMake Your Ruby Script Confusing
Make Your Ruby Script Confusing
 
Hijacking Ruby Syntax in Ruby
Hijacking Ruby Syntax in RubyHijacking Ruby Syntax in Ruby
Hijacking Ruby Syntax in Ruby
 
Lock, Concurrency and Throughput of Exclusive Operations
Lock, Concurrency and Throughput of Exclusive OperationsLock, Concurrency and Throughput of Exclusive Operations
Lock, Concurrency and Throughput of Exclusive Operations
 
Data Processing and Ruby in the World
Data Processing and Ruby in the WorldData Processing and Ruby in the World
Data Processing and Ruby in the World
 
Fluentd 101
Fluentd 101Fluentd 101
Fluentd 101
 
Hive dirty/beautiful hacks in TD
Hive dirty/beautiful hacks in TDHive dirty/beautiful hacks in TD
Hive dirty/beautiful hacks in TD
 
Data-Driven Development Era and Its Technologies
Data-Driven Development Era and Its TechnologiesData-Driven Development Era and Its Technologies
Data-Driven Development Era and Its Technologies
 
Engineer as a Leading Role
Engineer as a Leading RoleEngineer as a Leading Role
Engineer as a Leading Role
 

Recently uploaded

AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 

Recently uploaded (20)

AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 

Distributed Logging Architecture in Container Era

  • 1. Distributed Logging Architecture in Container Era LinuxCon Japan 2016 at Jun 13 2016 Satoshi "Moris" Tagomori (@tagomoris)
  • 2. Satoshi "Moris" Tagomori (@tagomoris) Fluentd, MessagePack-Ruby, Norikra, ... Treasure Data, Inc.
  • 3.
  • 5. Topics • Microservices and logging in various industries • Difficulties of logging with containers • Distributed logging architecture • Patterns of distributed logging architecture • Case Study: Docker and Fluentd
  • 7. Logging in Various Industries • Web access logs • Views/visitors on media • Views/clicks on Ads • Commercial transactions (EC, Game, ...) • Data from devices • Operation logs on Apps of phones • Various sensor data
  • 8. Microservices and Logging • Monolithic service • a service produces all data about an user's behavior • Microservices • many services produce data about an user's access • it's needed to collect logs from many services to know what is happening Users Service (Application) Logs Users Logs
  • 10. Containers: "a must" for microservices • Dividing a service into services • a service requires less computing resources
 (VM -> containers) • Making services independent from each other • but it is very difficult :( • some dependency must be solved even in development environment
 (containers on desktop)
  • 11. Redesign Logging: Why? • No permanent storages • No fixed physical/network address • No fixed mapping between servers and roles • We should parse/label logs at the source, ship these logs by pushing to destination ASAP
  • 12. Containers: immutable & disposable • No permanent storages • Where to write logs? • files in the container
 → gone w/ container instance 😞 • directories shared from hosts
 → hosts are shared by many containers/services ☹ • TODO: ship logs from container to anywhere ASAP
  • 13. Containers: unfixed addresses • No fixed physical / network address • Where should we go to fetch logs? • Service discovery (e.g., consul)
 → one more component 😞 • rsync? ssh+tail? or ..? Is it installed in containers?
 → one more tool to depend on ☹ • TODO: push logs to anywhere from containers
  • 14. Containers: instances per roles • No fixed mapping between servers and roles • How can we parse / store these logs? • Central repository about log syntax
 → very hard to maintain 😞 • Label logs by source address
 → many containers/roles in a host ☹ • TODO: label & parse logs at source of logs
  • 16. Core Architecture • Collector nodes • Aggregator nodes • Destinations Collector nodes (Docker containers + agent) Destinations
 (Storage, Database, ...) Aggregator nodes
  • 17. • Parse/Label (collector) • Raw logs are not good for processing • Convert logs to structured data (key-value pairs) • Split/Sort (aggregator) • Mixed logs are not good for searching • Split whole data stream into streams per services • Store (destination) • Format logs(records) as destination expects Collecting and Storing Data
  • 18. Scaling Logging • Network traffic • CPU load to parse / format • Parse logs on each collector (distributed) • Format logs on aggregator (to be distributed) • Capability • Make aggregators redundant • Controlling delay • to make sure when we can know what's happening in our systems
  • 21. Source Side Aggregation Patterns w/o source aggregation w/ source aggregation collector aggregator / destination aggregate container
  • 22. Without Source Aggregation • Pros: • Simple configuration • Cons: • fixed aggregator (endpoint) address • many network connections • high load in aggregator collector aggregator
  • 23. With Source Aggregation • Pros: • less connections • lower load in aggregator • less configuration in containers
 (by specifying localhost) • highly flexible configuration
 (by deployment only of aggregate containers) • Cons: • a bit much resource (+1 container per host) aggregate container aggregator
  • 24. Destination Side Aggregation Patterns w/o destination aggregation w/ destination aggregation aggregator collector destination
  • 25. Without Destination Aggregation • Pros: • Less nodes • Simpler configuration • Cons: • Storage side change affects collector side • Worse performance: many small write requests on storage
  • 26. With Destination Aggregation • Pros: • Collector side configuration is
 free from storage side changes • Better performance with fine tune
 on destination side aggregator • Cons: • More nodes • A bit complex configuration aggregator
  • 27. Scaling Patterns Scaling Up Endpoints HTTP/TCP load balancer Huge queue + workers Scaling Out Endpoints Round-robin clients Load balancer Backend nodes Collector nodes Aggregator nodes
  • 28. Scaling Up Endpoints • Pros: • Simple configuration
 in collector nodes • Cons: • Limits about scaling up Load balancer Backend nodes
  • 29. Scaling Out Endpoints • Pros: • Unlimited scaling
 by adding aggregator nodes • Cons: • Complex configuration • Client features for round-robin
  • 30. Without
 Destination Aggregation With
 Destination Aggregation Scaling Up Endpoints Systems in early stages Collecting logs over Internet or Using queues Scaling Out Endpoints Impossible :( Collector nodes must know all endpoints ↓ Uncontrollable Collecting logs in datacenter
  • 32. Case Study: Docker+Fluentd • Destination aggregation + scaling up • Fluent logger + Fluentd • Source aggregation + scaling up • Docker json logger + Fluentd + Elasticsearch • Docker fluentd logger + Fluentd + Kafka • Source/Destination aggregation + scaling out • Docker fluentd logger + Fluentd
  • 33. Why Fluentd? • Docker Fluentd logging driver • Docker containers can send logs to Fluentd directly - less overhead • Pluggable architecture • Various destination systems • Small memory footprint • Source aggregation requires +1 container per host • Less additional resource usage ( < 100MB )
  • 34. Destination aggregation + scaling up • Sending logs directly over TCP by Fluentd logger library in application code • Same with patterns of New Relic • Easy to implement
 - good for startups Application code
  • 35. Source aggregation + scaling up • Kubernetes: Json logger + Fluentd + Elasticsearch • Applications write logs to STDOUT • Docker writes logs as JSON in files • Fluentd
 reads logs from file
 parse JSON objects
 writes logs to Elasticsearch • EFK stack (like ELK stack) http://kubernetes.io/docs/getting-started-guides/logging-elasticsearch/ Elasticsearch Application code Files (JSON)
  • 36. Source aggregation + scaling up/out • Docker fluentd logging driver + Fluentd + Kafka • Applications write logs to STDOUT • Docker sends logs
 to localhost Fluentd • Fluentd
 gets logs over TCP
 pushes logs into Kafka • Highly scalable & less overhead
 - very good for huge deployment Kafka Application code
  • 37. Application code Source/Destination aggregation + scaling out • Docker fluentd logging driver + Fluentd • Applications write logs to STDOUT • Docker sends logs
 to localhost Fluentd • Fluentd
 gets logs over TCP
 sends logs into Aggregator Fluentd
 w/ round-robin load balance • Highly flexible
 - good for complex data processing
 requirements Any other storages
  • 38. What's the Best? • Writing logs from containers: Some way to do it • Docker logging driver • Write logs on files + read/parse it • Send logs from apps directly • Make the platform scalable! • Source aggregation: Fluentd on localhost • Scalable storage: (Kafka, external services, ...) • No destination aggregation + Scaling up • Non-scalable storage: (Filesystems, RDBMSs, ...) • Destination aggregation + Scaling out
  • 39. Why OSS Are Important For Logging?
  • 40. Why OSS? • Logging layer is interface • transparency • interoperability • Keep the platform scalable • number of nodes • number of types of source/destination
  • 41. Use OSS, Make Logging Scalable Thank you!