SlideShare una empresa de Scribd logo
1 de 35
Descargar para leer sin conexión
“A needle in a logstack” – analyzing data from
mobile users around the globe to verify a
successful deployment and swift problem
recognition
Tomasz Gągor - Senior System Administrator
Paweł Torbus - System Architect
ELK Stack
03 / 35
Logstash - Log pre-processor
Elasticsearch - Storage, indexing and searching
Kibana - Elasticsearch front-end
Log sources Clients
ELK Stack family
04 / 35
Filebeat - lightweight log
shipper
(logstash-frowarder successor)
Shield - protect access
to elasticserch data with authorization
Watcher - alerting for elasticsearch
Marvel - Elasticsearch monitor
Reporting - generate,
schedule, email reports
Logstash
05 / 35
Written in Ruby, executed in JRuby
Receive logs from agents in numerous formats (49
currently)
Parse them to JSON format
Can render additional data (like geographical
coordinates from IP address, useragent)
Store in Elasticsearch on daily indexes
Can use other “data stores” like Mail, Graphite and
more
All events reside in the singe index
Logstash - deployment
06 / 35
Accepts syslog, JSON and XML data
Parses those to JSON
Store data in singe Elasticsearch cluster
We have up to 6 logstash nodes in single cluster (gathering data from 100
hosts)
Log sources
Logstash - problems we faced
07 / 35
Avoid multilines
Always use date filter to have
proper event timestamps
Be careful naming fields - you
don’t want to overwrite your data
filter{
if [message] == '<!DOCTYPE log SYSTEM
"logger.dtd">' { drop {} }
if [message] == '<log>' { drop {} }
if [message] == '</log>' { drop {} }
if [message] =~ '^ *<' {
multiline {
pattern => "^</record>$"
what => "next"
negate => true
}
xml {
source => "message"
target => "java_xml"
}
date {
match => ["[java_xml][millis]", "UNIX_MS"]
}
Logstash - problems we faced
08 / 35
Syslog is not reliable, favor
logstash-forwarder/filebeat
Always configure dead time in
logstash-forwarder
{"network": {
"servers": [ "logstash1.xxx:5959",
"logstash2.xxx:5959" ],
"timeout": 15,
"ssl ca": "/etc/pki/tls/certs/ca.crt"
},
"files": [{
"paths": [
"/var/log/tomcat6/localhost.*-*-*.log",
"/var/log/tomcat/localhost.*-*-*.log"],
"fields": {
"type": "java_xml",
"forwarder_tags":"tomcat"
},
"dead time": "10m"
},
...
Logstash - problems we faced
09 / 35
Carrefully calibrate memory usage for better performance and stability
LS_HEAP_SIZE="3043m" or -Xmx3043m
Set proper field type in grok to be able to use statistics
grok {
match => [
"message",
# Common Log Format
# "10.0.2.2 - - [22/Apr/2015:11:58:06 +0000] "GET /some/redirect HTTP/1.1" 302 -"
"^%{IP:[access_log][clientip]} - (-|%{USER:[access_log][user]}) [%{HTTPDATE:
[access_log][timestamp]}] "%{DATA:[access_log][request]}" %{POSINT:[access_log]
[status]:int} (%{INT:size:int}|-)$"
]
add_tag => [ "access_log" ]
}
Elasticsearch
10 / 35
Elasticsearch is schemaless Lucene based search engine.
●
Distributed (availability, scalability, durability)
●
JSON-based objects
●
HTTP-based client API
●
Data grouped in Indexes
●
Indexes are stored daily
Elasticsearch - deployment
11 / 3
We have three clusters for two different clients, one internal
They’re in separate subnet and gather logs from all environments (prod/itg/stg/qa/…)
First cluster consists of 3 elasticsearch nodes, 250GB of storage each
Second cluster consists of 5 nodes, 250GB each
Third cluster consists of 6 nodes, 200GB each
11 / 35
Elasticsearch - problems we faced
12 / 35
Configure shards and replicas number according to amount of your machines
Keep track of free disk space and have enough of it
Bigger nodes will synchronise longer
Cluster autodiscovery mostly works, mostly
Disable shard allocation during node restarts/upgrades
Elasticsearch - problems we faced
13 / 35
Allocate max 50% of RAM for elasticsearch
Don’t use more than 32GB RAM for elasticsearch
Set bootstrap.mlockall to true to keep elasticsearch in RAM
Set indices.fielddata.cache.size to any value, ex. 30~40%
Kibana
14 / 35
Kibana is open source analytics and
visualization platform designed to work with
Elasticsearch.
Querying Elasticsearch
Visualizing data
Filtering and inspecting events
Kibana - deployment
15 / 35
Single Kibana instance
Available in version 3.x and 4.x
Clients
Simple HIT/MISS statistics
17 / 35
Varnish traffic per file type
01 / 3
Analyzing log sources
Log types Log sources
Typical index size
Deffinitely too big
indexes
01 / 3
1. Few clicks and it was
clear who’s guilty
2. Adding another table
panel with “terms” filter
we were able to list all
lacking files
3. I can’t imagine simpler
way to nail it
Response time analysis
30 / 35
New search mechanism, full reindex required - did it and how affect performance
of service?
Response time analysis
31 / 35
Kibana allowed us to actually see impact of our changes on application latency.
32 / 35
What’s that hour long hole?
Do we have it more than once?
We did…But what’s this?
Thank you for attention

Más contenido relacionado

La actualidad más candente

Teradata online training
Teradata online trainingTeradata online training
Teradata online trainingMonster Courses
 
LiteDB - A .NET NoSQL Document Store in a single data file
LiteDB - A .NET NoSQL Document Store in a single data fileLiteDB - A .NET NoSQL Document Store in a single data file
LiteDB - A .NET NoSQL Document Store in a single data fileLarry Nung
 
Elasticsearch presentation 1
Elasticsearch presentation 1Elasticsearch presentation 1
Elasticsearch presentation 1Maruf Hassan
 
Taking Elasticsearch From 0 to 88mph
Taking Elasticsearch From 0 to 88mph Taking Elasticsearch From 0 to 88mph
Taking Elasticsearch From 0 to 88mph Molly Struve
 
Updating materialized views and caches using kafka
Updating materialized views and caches using kafkaUpdating materialized views and caches using kafka
Updating materialized views and caches using kafkaZach Cox
 
From Overnight to Always On @ Jfokus 2019
From Overnight to Always On @ Jfokus 2019From Overnight to Always On @ Jfokus 2019
From Overnight to Always On @ Jfokus 2019Enno Runne
 
Apache Tajo on Swift: Bringing SQL to the OpenStack World
Apache Tajo on Swift: Bringing SQL to the OpenStack WorldApache Tajo on Swift: Bringing SQL to the OpenStack World
Apache Tajo on Swift: Bringing SQL to the OpenStack WorldJihoon Son
 
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016DataStax
 
Hybrid collaborative tiered storage with alluxio
Hybrid collaborative tiered storage with alluxioHybrid collaborative tiered storage with alluxio
Hybrid collaborative tiered storage with alluxioThai Bui
 
Using Ceph for Large Hadron Collider Data
Using Ceph for Large Hadron Collider DataUsing Ceph for Large Hadron Collider Data
Using Ceph for Large Hadron Collider DataRob Gardner
 
Bucket Your Partitions Wisely (Markus Höfer, codecentric AG) | Cassandra Summ...
Bucket Your Partitions Wisely (Markus Höfer, codecentric AG) | Cassandra Summ...Bucket Your Partitions Wisely (Markus Höfer, codecentric AG) | Cassandra Summ...
Bucket Your Partitions Wisely (Markus Höfer, codecentric AG) | Cassandra Summ...DataStax
 
Introduction to Apache Tajo: Data Warehouse for Big Data
Introduction to Apache Tajo: Data Warehouse for Big DataIntroduction to Apache Tajo: Data Warehouse for Big Data
Introduction to Apache Tajo: Data Warehouse for Big DataJihoon Son
 
One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui,...
One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui,...One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui,...
One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui,...DataStax
 
Write on memory TSDB database (gocon tokyo autumn 2018)
Write on memory TSDB database (gocon tokyo autumn 2018)Write on memory TSDB database (gocon tokyo autumn 2018)
Write on memory TSDB database (gocon tokyo autumn 2018)Huy Do
 
Elasticsearch for Data Analytics
Elasticsearch for Data AnalyticsElasticsearch for Data Analytics
Elasticsearch for Data AnalyticsFelipe
 
Infinum Android Talks #18 - How to cache like a boss by Željko Plesac
Infinum Android Talks #18 - How to cache like a boss by Željko PlesacInfinum Android Talks #18 - How to cache like a boss by Željko Plesac
Infinum Android Talks #18 - How to cache like a boss by Željko PlesacInfinum
 
Performance evaluation of apache tajo
Performance evaluation of apache tajoPerformance evaluation of apache tajo
Performance evaluation of apache tajoJihoon Son
 
Webtech Conference: NoSQL and Web scalability
Webtech Conference: NoSQL and Web scalabilityWebtech Conference: NoSQL and Web scalability
Webtech Conference: NoSQL and Web scalabilityLuca Bonmassar
 
Cassandra Tuning - above and beyond
Cassandra Tuning - above and beyondCassandra Tuning - above and beyond
Cassandra Tuning - above and beyondMatija Gobec
 

La actualidad más candente (20)

Teradata online training
Teradata online trainingTeradata online training
Teradata online training
 
LiteDB - A .NET NoSQL Document Store in a single data file
LiteDB - A .NET NoSQL Document Store in a single data fileLiteDB - A .NET NoSQL Document Store in a single data file
LiteDB - A .NET NoSQL Document Store in a single data file
 
Elasticsearch presentation 1
Elasticsearch presentation 1Elasticsearch presentation 1
Elasticsearch presentation 1
 
Taking Elasticsearch From 0 to 88mph
Taking Elasticsearch From 0 to 88mph Taking Elasticsearch From 0 to 88mph
Taking Elasticsearch From 0 to 88mph
 
Updating materialized views and caches using kafka
Updating materialized views and caches using kafkaUpdating materialized views and caches using kafka
Updating materialized views and caches using kafka
 
Java volatile
Java volatileJava volatile
Java volatile
 
From Overnight to Always On @ Jfokus 2019
From Overnight to Always On @ Jfokus 2019From Overnight to Always On @ Jfokus 2019
From Overnight to Always On @ Jfokus 2019
 
Apache Tajo on Swift: Bringing SQL to the OpenStack World
Apache Tajo on Swift: Bringing SQL to the OpenStack WorldApache Tajo on Swift: Bringing SQL to the OpenStack World
Apache Tajo on Swift: Bringing SQL to the OpenStack World
 
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
 
Hybrid collaborative tiered storage with alluxio
Hybrid collaborative tiered storage with alluxioHybrid collaborative tiered storage with alluxio
Hybrid collaborative tiered storage with alluxio
 
Using Ceph for Large Hadron Collider Data
Using Ceph for Large Hadron Collider DataUsing Ceph for Large Hadron Collider Data
Using Ceph for Large Hadron Collider Data
 
Bucket Your Partitions Wisely (Markus Höfer, codecentric AG) | Cassandra Summ...
Bucket Your Partitions Wisely (Markus Höfer, codecentric AG) | Cassandra Summ...Bucket Your Partitions Wisely (Markus Höfer, codecentric AG) | Cassandra Summ...
Bucket Your Partitions Wisely (Markus Höfer, codecentric AG) | Cassandra Summ...
 
Introduction to Apache Tajo: Data Warehouse for Big Data
Introduction to Apache Tajo: Data Warehouse for Big DataIntroduction to Apache Tajo: Data Warehouse for Big Data
Introduction to Apache Tajo: Data Warehouse for Big Data
 
One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui,...
One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui,...One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui,...
One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui,...
 
Write on memory TSDB database (gocon tokyo autumn 2018)
Write on memory TSDB database (gocon tokyo autumn 2018)Write on memory TSDB database (gocon tokyo autumn 2018)
Write on memory TSDB database (gocon tokyo autumn 2018)
 
Elasticsearch for Data Analytics
Elasticsearch for Data AnalyticsElasticsearch for Data Analytics
Elasticsearch for Data Analytics
 
Infinum Android Talks #18 - How to cache like a boss by Željko Plesac
Infinum Android Talks #18 - How to cache like a boss by Željko PlesacInfinum Android Talks #18 - How to cache like a boss by Željko Plesac
Infinum Android Talks #18 - How to cache like a boss by Željko Plesac
 
Performance evaluation of apache tajo
Performance evaluation of apache tajoPerformance evaluation of apache tajo
Performance evaluation of apache tajo
 
Webtech Conference: NoSQL and Web scalability
Webtech Conference: NoSQL and Web scalabilityWebtech Conference: NoSQL and Web scalability
Webtech Conference: NoSQL and Web scalability
 
Cassandra Tuning - above and beyond
Cassandra Tuning - above and beyondCassandra Tuning - above and beyond
Cassandra Tuning - above and beyond
 

Destacado

JDD 2016 - Michal Matloka - Small Intro To Big Data
JDD 2016 - Michal Matloka - Small Intro To Big DataJDD 2016 - Michal Matloka - Small Intro To Big Data
JDD 2016 - Michal Matloka - Small Intro To Big DataPROIDEA
 
Do you think you're doing microservice architecture? What about infrastructur...
Do you think you're doing microservice architecture? What about infrastructur...Do you think you're doing microservice architecture? What about infrastructur...
Do you think you're doing microservice architecture? What about infrastructur...Marcin Grzejszczak
 
PLNOG 18 - Michał Sajdak - IoT hacking w praktyce
PLNOG 18 - Michał Sajdak - IoT hacking w praktycePLNOG 18 - Michał Sajdak - IoT hacking w praktyce
PLNOG 18 - Michał Sajdak - IoT hacking w praktycePROIDEA
 
4Developers: Mateusz Stasch- Domain Events - czyli jak radzić sobie z rzeczyw...
4Developers: Mateusz Stasch- Domain Events - czyli jak radzić sobie z rzeczyw...4Developers: Mateusz Stasch- Domain Events - czyli jak radzić sobie z rzeczyw...
4Developers: Mateusz Stasch- Domain Events - czyli jak radzić sobie z rzeczyw...PROIDEA
 
JDD 2016 - Wojciech Oczkowski - Testowanie Wydajnosci Za Pomoca Narzedzia JMH
JDD 2016 - Wojciech Oczkowski - Testowanie Wydajnosci Za Pomoca Narzedzia JMHJDD 2016 - Wojciech Oczkowski - Testowanie Wydajnosci Za Pomoca Narzedzia JMH
JDD 2016 - Wojciech Oczkowski - Testowanie Wydajnosci Za Pomoca Narzedzia JMHPROIDEA
 
JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course
JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course
JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course PROIDEA
 
PLNOG 18 - Alan Kuczewski - "Mały może więcej" - Rozwiązanie wysokiej dostępn...
PLNOG 18 - Alan Kuczewski - "Mały może więcej" - Rozwiązanie wysokiej dostępn...PLNOG 18 - Alan Kuczewski - "Mały może więcej" - Rozwiązanie wysokiej dostępn...
PLNOG 18 - Alan Kuczewski - "Mały może więcej" - Rozwiązanie wysokiej dostępn...PROIDEA
 
PLNOG 18 - Marcin Kuczera- ONT idealny
PLNOG 18 - Marcin Kuczera- ONT idealny PLNOG 18 - Marcin Kuczera- ONT idealny
PLNOG 18 - Marcin Kuczera- ONT idealny PROIDEA
 
4Developers: Aplikacja od SaaSa do IdaaSa
4Developers: Aplikacja od SaaSa do IdaaSa4Developers: Aplikacja od SaaSa do IdaaSa
4Developers: Aplikacja od SaaSa do IdaaSaTomek Onyszko
 

Destacado (9)

JDD 2016 - Michal Matloka - Small Intro To Big Data
JDD 2016 - Michal Matloka - Small Intro To Big DataJDD 2016 - Michal Matloka - Small Intro To Big Data
JDD 2016 - Michal Matloka - Small Intro To Big Data
 
Do you think you're doing microservice architecture? What about infrastructur...
Do you think you're doing microservice architecture? What about infrastructur...Do you think you're doing microservice architecture? What about infrastructur...
Do you think you're doing microservice architecture? What about infrastructur...
 
PLNOG 18 - Michał Sajdak - IoT hacking w praktyce
PLNOG 18 - Michał Sajdak - IoT hacking w praktycePLNOG 18 - Michał Sajdak - IoT hacking w praktyce
PLNOG 18 - Michał Sajdak - IoT hacking w praktyce
 
4Developers: Mateusz Stasch- Domain Events - czyli jak radzić sobie z rzeczyw...
4Developers: Mateusz Stasch- Domain Events - czyli jak radzić sobie z rzeczyw...4Developers: Mateusz Stasch- Domain Events - czyli jak radzić sobie z rzeczyw...
4Developers: Mateusz Stasch- Domain Events - czyli jak radzić sobie z rzeczyw...
 
JDD 2016 - Wojciech Oczkowski - Testowanie Wydajnosci Za Pomoca Narzedzia JMH
JDD 2016 - Wojciech Oczkowski - Testowanie Wydajnosci Za Pomoca Narzedzia JMHJDD 2016 - Wojciech Oczkowski - Testowanie Wydajnosci Za Pomoca Narzedzia JMH
JDD 2016 - Wojciech Oczkowski - Testowanie Wydajnosci Za Pomoca Narzedzia JMH
 
JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course
JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course
JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course
 
PLNOG 18 - Alan Kuczewski - "Mały może więcej" - Rozwiązanie wysokiej dostępn...
PLNOG 18 - Alan Kuczewski - "Mały może więcej" - Rozwiązanie wysokiej dostępn...PLNOG 18 - Alan Kuczewski - "Mały może więcej" - Rozwiązanie wysokiej dostępn...
PLNOG 18 - Alan Kuczewski - "Mały może więcej" - Rozwiązanie wysokiej dostępn...
 
PLNOG 18 - Marcin Kuczera- ONT idealny
PLNOG 18 - Marcin Kuczera- ONT idealny PLNOG 18 - Marcin Kuczera- ONT idealny
PLNOG 18 - Marcin Kuczera- ONT idealny
 
4Developers: Aplikacja od SaaSa do IdaaSa
4Developers: Aplikacja od SaaSa do IdaaSa4Developers: Aplikacja od SaaSa do IdaaSa
4Developers: Aplikacja od SaaSa do IdaaSa
 

Similar a JDD 2016 - Tomasz Gagor, Pawel Torbus - A Needle In A Logstack

Elk presentation 2#3
Elk presentation 2#3Elk presentation 2#3
Elk presentation 2#3uzzal basak
 
Monitor all the things - Confoo
Monitor all the things - ConfooMonitor all the things - Confoo
Monitor all the things - Confoofelixtrepanier
 
Developing Realtime Data Pipelines With Apache Kafka
Developing Realtime Data Pipelines With Apache KafkaDeveloping Realtime Data Pipelines With Apache Kafka
Developing Realtime Data Pipelines With Apache KafkaJoe Stein
 
Elk with Openstack
Elk with OpenstackElk with Openstack
Elk with OpenstackArun prasath
 
Central LogFile Storage. ELK stack Elasticsearch, Logstash and Kibana.
Central LogFile Storage. ELK stack Elasticsearch, Logstash and Kibana.Central LogFile Storage. ELK stack Elasticsearch, Logstash and Kibana.
Central LogFile Storage. ELK stack Elasticsearch, Logstash and Kibana.Airat Khisamov
 
Scaling your logging infrastructure using syslog-ng
Scaling your logging infrastructure using syslog-ngScaling your logging infrastructure using syslog-ng
Scaling your logging infrastructure using syslog-ngPeter Czanik
 
Scaling Your Logging Infrastructure With Syslog-NG
Scaling Your Logging Infrastructure With Syslog-NGScaling Your Logging Infrastructure With Syslog-NG
Scaling Your Logging Infrastructure With Syslog-NGAll Things Open
 
Log aggregation and analysis
Log aggregation and analysisLog aggregation and analysis
Log aggregation and analysisDhaval Mehta
 
Tim eberhard bajug3_talk
Tim eberhard bajug3_talkTim eberhard bajug3_talk
Tim eberhard bajug3_talkTim Eberhard
 
Migrating the elastic stack to the cloud, or application logging @ travix
 Migrating the elastic stack to the cloud, or application logging @ travix Migrating the elastic stack to the cloud, or application logging @ travix
Migrating the elastic stack to the cloud, or application logging @ travixRuslan Lutsenko
 
Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...
Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...
Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...ForgeRock
 
Privilege Escalation with Metasploit
Privilege Escalation with MetasploitPrivilege Escalation with Metasploit
Privilege Escalation with Metasploitegypt
 
Centralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stackCentralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stackRich Lee
 
Alfresco monitoring with Nagios and ELK stack
Alfresco monitoring with Nagios and ELK stackAlfresco monitoring with Nagios and ELK stack
Alfresco monitoring with Nagios and ELK stackCesar Capillas
 
Sql introduction
Sql introductionSql introduction
Sql introductionvimal_guru
 
Elks for analysing performance test results - Helsinki QA meetup
Elks for analysing performance test results - Helsinki QA meetupElks for analysing performance test results - Helsinki QA meetup
Elks for analysing performance test results - Helsinki QA meetupAnoop Vijayan
 
Boost Your Environment With XMLDB - UKOUG 2008 - Marco Gralike
Boost Your Environment With XMLDB - UKOUG 2008 - Marco GralikeBoost Your Environment With XMLDB - UKOUG 2008 - Marco Gralike
Boost Your Environment With XMLDB - UKOUG 2008 - Marco GralikeMarco Gralike
 

Similar a JDD 2016 - Tomasz Gagor, Pawel Torbus - A Needle In A Logstack (20)

Elk presentation 2#3
Elk presentation 2#3Elk presentation 2#3
Elk presentation 2#3
 
Monitor all the things - Confoo
Monitor all the things - ConfooMonitor all the things - Confoo
Monitor all the things - Confoo
 
Developing Realtime Data Pipelines With Apache Kafka
Developing Realtime Data Pipelines With Apache KafkaDeveloping Realtime Data Pipelines With Apache Kafka
Developing Realtime Data Pipelines With Apache Kafka
 
Elk with Openstack
Elk with OpenstackElk with Openstack
Elk with Openstack
 
Central LogFile Storage. ELK stack Elasticsearch, Logstash and Kibana.
Central LogFile Storage. ELK stack Elasticsearch, Logstash and Kibana.Central LogFile Storage. ELK stack Elasticsearch, Logstash and Kibana.
Central LogFile Storage. ELK stack Elasticsearch, Logstash and Kibana.
 
Scaling your logging infrastructure using syslog-ng
Scaling your logging infrastructure using syslog-ngScaling your logging infrastructure using syslog-ng
Scaling your logging infrastructure using syslog-ng
 
Scaling Your Logging Infrastructure With Syslog-NG
Scaling Your Logging Infrastructure With Syslog-NGScaling Your Logging Infrastructure With Syslog-NG
Scaling Your Logging Infrastructure With Syslog-NG
 
Log aggregation and analysis
Log aggregation and analysisLog aggregation and analysis
Log aggregation and analysis
 
Tim eberhard bajug3_talk
Tim eberhard bajug3_talkTim eberhard bajug3_talk
Tim eberhard bajug3_talk
 
Migrating the elastic stack to the cloud, or application logging @ travix
 Migrating the elastic stack to the cloud, or application logging @ travix Migrating the elastic stack to the cloud, or application logging @ travix
Migrating the elastic stack to the cloud, or application logging @ travix
 
Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...
Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...
Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...
 
Privilege Escalation with Metasploit
Privilege Escalation with MetasploitPrivilege Escalation with Metasploit
Privilege Escalation with Metasploit
 
Centralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stackCentralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stack
 
Elk scilifelab
Elk scilifelabElk scilifelab
Elk scilifelab
 
Alfresco monitoring with Nagios and ELK stack
Alfresco monitoring with Nagios and ELK stackAlfresco monitoring with Nagios and ELK stack
Alfresco monitoring with Nagios and ELK stack
 
Sql introduction
Sql introductionSql introduction
Sql introduction
 
Elks for analysing performance test results - Helsinki QA meetup
Elks for analysing performance test results - Helsinki QA meetupElks for analysing performance test results - Helsinki QA meetup
Elks for analysing performance test results - Helsinki QA meetup
 
Stress test data pipeline
Stress test data pipelineStress test data pipeline
Stress test data pipeline
 
Oracle Tracing
Oracle TracingOracle Tracing
Oracle Tracing
 
Boost Your Environment With XMLDB - UKOUG 2008 - Marco Gralike
Boost Your Environment With XMLDB - UKOUG 2008 - Marco GralikeBoost Your Environment With XMLDB - UKOUG 2008 - Marco Gralike
Boost Your Environment With XMLDB - UKOUG 2008 - Marco Gralike
 

Último

Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 

Último (20)

Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 

JDD 2016 - Tomasz Gagor, Pawel Torbus - A Needle In A Logstack

  • 1.
  • 2. “A needle in a logstack” – analyzing data from mobile users around the globe to verify a successful deployment and swift problem recognition Tomasz Gągor - Senior System Administrator Paweł Torbus - System Architect
  • 3. ELK Stack 03 / 35 Logstash - Log pre-processor Elasticsearch - Storage, indexing and searching Kibana - Elasticsearch front-end Log sources Clients
  • 4. ELK Stack family 04 / 35 Filebeat - lightweight log shipper (logstash-frowarder successor) Shield - protect access to elasticserch data with authorization Watcher - alerting for elasticsearch Marvel - Elasticsearch monitor Reporting - generate, schedule, email reports
  • 5. Logstash 05 / 35 Written in Ruby, executed in JRuby Receive logs from agents in numerous formats (49 currently) Parse them to JSON format Can render additional data (like geographical coordinates from IP address, useragent) Store in Elasticsearch on daily indexes Can use other “data stores” like Mail, Graphite and more All events reside in the singe index
  • 6. Logstash - deployment 06 / 35 Accepts syslog, JSON and XML data Parses those to JSON Store data in singe Elasticsearch cluster We have up to 6 logstash nodes in single cluster (gathering data from 100 hosts) Log sources
  • 7. Logstash - problems we faced 07 / 35 Avoid multilines Always use date filter to have proper event timestamps Be careful naming fields - you don’t want to overwrite your data filter{ if [message] == '<!DOCTYPE log SYSTEM "logger.dtd">' { drop {} } if [message] == '<log>' { drop {} } if [message] == '</log>' { drop {} } if [message] =~ '^ *<' { multiline { pattern => "^</record>$" what => "next" negate => true } xml { source => "message" target => "java_xml" } date { match => ["[java_xml][millis]", "UNIX_MS"] }
  • 8. Logstash - problems we faced 08 / 35 Syslog is not reliable, favor logstash-forwarder/filebeat Always configure dead time in logstash-forwarder {"network": { "servers": [ "logstash1.xxx:5959", "logstash2.xxx:5959" ], "timeout": 15, "ssl ca": "/etc/pki/tls/certs/ca.crt" }, "files": [{ "paths": [ "/var/log/tomcat6/localhost.*-*-*.log", "/var/log/tomcat/localhost.*-*-*.log"], "fields": { "type": "java_xml", "forwarder_tags":"tomcat" }, "dead time": "10m" }, ...
  • 9. Logstash - problems we faced 09 / 35 Carrefully calibrate memory usage for better performance and stability LS_HEAP_SIZE="3043m" or -Xmx3043m Set proper field type in grok to be able to use statistics grok { match => [ "message", # Common Log Format # "10.0.2.2 - - [22/Apr/2015:11:58:06 +0000] "GET /some/redirect HTTP/1.1" 302 -" "^%{IP:[access_log][clientip]} - (-|%{USER:[access_log][user]}) [%{HTTPDATE: [access_log][timestamp]}] "%{DATA:[access_log][request]}" %{POSINT:[access_log] [status]:int} (%{INT:size:int}|-)$" ] add_tag => [ "access_log" ] }
  • 10. Elasticsearch 10 / 35 Elasticsearch is schemaless Lucene based search engine. ● Distributed (availability, scalability, durability) ● JSON-based objects ● HTTP-based client API ● Data grouped in Indexes ● Indexes are stored daily
  • 11. Elasticsearch - deployment 11 / 3 We have three clusters for two different clients, one internal They’re in separate subnet and gather logs from all environments (prod/itg/stg/qa/…) First cluster consists of 3 elasticsearch nodes, 250GB of storage each Second cluster consists of 5 nodes, 250GB each Third cluster consists of 6 nodes, 200GB each 11 / 35
  • 12. Elasticsearch - problems we faced 12 / 35 Configure shards and replicas number according to amount of your machines Keep track of free disk space and have enough of it Bigger nodes will synchronise longer Cluster autodiscovery mostly works, mostly Disable shard allocation during node restarts/upgrades
  • 13. Elasticsearch - problems we faced 13 / 35 Allocate max 50% of RAM for elasticsearch Don’t use more than 32GB RAM for elasticsearch Set bootstrap.mlockall to true to keep elasticsearch in RAM Set indices.fielddata.cache.size to any value, ex. 30~40%
  • 14. Kibana 14 / 35 Kibana is open source analytics and visualization platform designed to work with Elasticsearch. Querying Elasticsearch Visualizing data Filtering and inspecting events
  • 15. Kibana - deployment 15 / 35 Single Kibana instance Available in version 3.x and 4.x Clients
  • 16.
  • 18. Varnish traffic per file type 01 / 3
  • 19.
  • 20.
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 26. Analyzing log sources Log types Log sources
  • 28.
  • 29. 01 / 3 1. Few clicks and it was clear who’s guilty 2. Adding another table panel with “terms” filter we were able to list all lacking files 3. I can’t imagine simpler way to nail it
  • 30. Response time analysis 30 / 35 New search mechanism, full reindex required - did it and how affect performance of service?
  • 31. Response time analysis 31 / 35 Kibana allowed us to actually see impact of our changes on application latency.
  • 32. 32 / 35 What’s that hour long hole? Do we have it more than once?
  • 34.
  • 35. Thank you for attention