SCALING YOUR LOGGING INFRASTRUCTURE WITH SYSLOG-NG
All Things Open 2016
Peter Czanik / Balabit
2
ABOUT ME
● Peter Czanik from Hungary
● Community Manager at Balabit: syslog-ng upstream
● syslog-ng packaging, support, advocacy
Balabit is an IT security company with development HQ in Budapest, Hungary
Over 200 employees: the majority are engineers
3
syslog-ng
Logging
Recording events, such as:
Jan 14 11:38:48 linux-0jbu sshd[7716]: Accepted publickey for root from 127.0.0.1 port 48806 ssh2
syslog-ng
Enhanced logging daemon with a focus on high-performance central log collection.
4
WHY CENTRAL LOGGING?
EASE OF USE: one place to check instead of many
AVAILABILITY: even if the sender machine is down
SECURITY: logs are available even if the sender machine is compromised
5
MAIN SYSLOG-NG ROLES
collector, processor, filter, storage (or forwarder)
6
ROLE: DATA COLLECTOR
Collect system and application logs together: contextual data for either side
A wide variety of platform-specific sources:
● /dev/log & co
● Journal, Sun streams
Receive syslog messages over the network:
● Legacy or RFC5424, UDP/TCP/TLS
Logs or any kind of data from applications:
● Through files, sockets, pipes, etc.
● Application output
7
ROLE: PROCESSING
Classify, normalize and structure logs with built-in parsers:
● CSV-parser, DB-parser (PatternDB), JSON parser, key=value parser and more to come
Rewrite messages:
● For example, anonymization
Reformat messages using templates:
● Destination might need a specific format (ISO date, JSON, etc.)
Enrich data:
● GeoIP
● Additional fields based on message content
8
ROLE: DATA FILTERING
Main uses:
● Discarding surplus logs (not storing debug level messages)
● Message routing (login events to SIEM)
Many possibilities:
● Based on message content, parameters or macros
● Using comparisons, wildcards, regular expressions and functions
● Combining all of these with Boolean operators
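The idea of combining filters with Boolean operators can be sketched outside syslog-ng as well. The following Python snippet is only an illustration, not syslog-ng code: messages are modeled as dictionaries, and the helper names (`level`, `facility`, `both`, `either`, `negate`) and the keys `level`, `facility` and `text` are assumptions made for this sketch.

```python
# Syslog severity names ordered from least to most severe.
LEVELS = ["debug", "info", "notice", "warning", "err", "crit", "alert", "emerg"]

def level(lowest, highest):
    """Match messages whose level lies between lowest and highest (inclusive)."""
    low, high = LEVELS.index(lowest), LEVELS.index(highest)
    return lambda msg: low <= LEVELS.index(msg["level"]) <= high

def facility(name):
    """Match messages coming from the given facility."""
    return lambda msg: msg["facility"] == name

def both(a, b):
    """Boolean AND of two filters."""
    return lambda msg: a(msg) and b(msg)

def either(*filters):
    """Boolean OR of any number of filters."""
    return lambda msg: any(f(msg) for f in filters)

def negate(f):
    """Boolean NOT of a filter."""
    return lambda msg: not f(msg)

# Keep info and above, but drop everything from the mail and cron facilities.
keep = both(level("info", "emerg"),
            negate(either(facility("mail"), facility("cron"))))

# Hypothetical sample messages.
messages = [
    {"level": "debug", "facility": "daemon", "text": "verbose detail"},
    {"level": "err", "facility": "mail", "text": "delivery failed"},
    {"level": "info", "facility": "auth", "text": "login accepted"},
]
kept = [m["text"] for m in messages if keep(m)]
print(kept)  # ['login accepted']
```

Each filter is a small predicate, so discarding surplus logs and routing messages both reduce to evaluating a combined predicate per message.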
9
ROLE: DESTINATIONS
“TRADITIONAL”
● File, network, TLS, SQL, etc.
“BIG DATA”
● Distributed file systems:
  ● Hadoop
● NoSQL databases:
  ● MongoDB
  ● Elasticsearch
● Messaging systems:
  ● Kafka
10
WHICH SYSLOG-NG VERSION IS THE FASTEST?
● Project started in 1998
● Version 3.3 added multi-threading in 2011
● Latest stable version is 3.8, released two months ago
11
Kindle e-book reader (on a plane :) ): version 1.6
BMW i3 all-electric car: version 3.4
12
FREE-FORM LOG MESSAGES
Most log messages are: date + hostname + text
Mar 11 13:37:56 linux-6965 sshd[4547]: Accepted keyboard-interactive/pam for root from 127.0.0.1 port 46048 ssh2
● Text = English sentence with some variable parts
● Easy for a human to read
● Difficult to process with scripts
13
SOLUTION: STRUCTURED LOGGING
● Events represented as name-value pairs
● Example: an ssh login:
app=sshd user=root source_ip=192.168.123.45
● syslog-ng uses name-value pairs internally:
  ● Date, facility, priority, program name, pid, etc.
● Parsers in syslog-ng can turn unstructured and some structured data (CSV, JSON) into name-value pairs
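To make the step from free-form text to name-value pairs concrete, here is a minimal Python sketch (not syslog-ng code) that parses the sshd message from the previous slide. The regular expression and the field names (`auth_method`, `user`, `source_ip`, `port`, `service`) are assumptions chosen for illustration; in syslog-ng this job belongs to parsers such as PatternDB.

```python
import re

# Hypothetical pattern for one kind of sshd message; field names are illustrative.
SSHD_ACCEPTED = re.compile(
    r"^(?P<date>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<app>[^\[\s]+)\[(?P<pid>\d+)\]: "
    r"Accepted (?P<auth_method>\S+) for (?P<user>\S+) "
    r"from (?P<source_ip>\S+) port (?P<port>\d+) (?P<service>\S+)$"
)

def parse_sshd_login(line):
    """Return a dict of name-value pairs, or None if the line does not match."""
    match = SSHD_ACCEPTED.match(line)
    return match.groupdict() if match else None

message = ("Mar 11 13:37:56 linux-6965 sshd[4547]: Accepted "
           "keyboard-interactive/pam for root from 127.0.0.1 port 46048 ssh2")
pairs = parse_sshd_login(message)
print(pairs["app"], pairs["user"], pairs["source_ip"])  # sshd root 127.0.0.1
```

The sketch also shows why scripts struggle with free-form logs: every message type needs its own pattern, and any change in wording breaks the match.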
14
JSON PARSER
Turns JSON-based log messages into name-value pairs
{"PROGRAM":"prg00000","PRIORITY":"info","PID":"1234","MESSAGE":"seq: 0000000000, thread: 0000, runid: 1374490607, stamp: 2013-07-22T12:56:47 MESSAGE... ","HOST":"localhost","FACILITY":"auth","DATE":"Jul 22 12:56:47"}
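As a rough illustration of what turning JSON into name-value pairs means, the Python sketch below flattens a JSON object into a flat dictionary. It is not how syslog-ng implements its parser; the dotted naming of nested keys and the `prefix` argument are assumptions of this sketch, and the input is a shortened version of the message above.

```python
import json

def json_to_pairs(raw, prefix=""):
    """Flatten a JSON object into a flat dict of name-value pairs.

    Nested objects get dotted names, which is an assumption of this sketch.
    """
    pairs = {}

    def walk(value, name):
        if isinstance(value, dict):
            for key, child in value.items():
                walk(child, f"{name}.{key}" if name else key)
        else:
            pairs[prefix + name] = value

    walk(json.loads(raw), "")
    return pairs

raw = ('{"PROGRAM":"prg00000","PRIORITY":"info","PID":"1234",'
       '"HOST":"localhost","FACILITY":"auth","DATE":"Jul 22 12:56:47"}')
pairs = json_to_pairs(raw, prefix="json.")
print(pairs["json.HOST"], pairs["json.FACILITY"])  # localhost auth
```

A prefix keeps parsed fields from colliding with names that already exist on the message, such as the host or program name.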
15
CSV PARSER
Parses columnar data into fields
parser p_apache {
    csv-parser(
        columns("APACHE.CLIENT_IP", "APACHE.IDENT_NAME", "APACHE.USER_NAME",
                "APACHE.TIMESTAMP", "APACHE.REQUEST_URL", "APACHE.REQUEST_STATUS",
                "APACHE.CONTENT_LENGTH", "APACHE.REFERER", "APACHE.USER_AGENT",
                "APACHE.PROCESS_TIME", "APACHE.SERVER_NAME")
        flags(escape-double-char,strip-whitespace)
        delimiters(" ")
        quote-pairs('""[]')
    );
};
destination d_file { file("/var/log/messages-${APACHE.USER_NAME:-nouser}"); };
log { source(s_local); parser(p_apache); destination(d_file); };
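The splitting that the configuration above describes (space-delimited columns, with double quotes and square brackets acting as quote pairs) can be approximated in a few lines of Python. This is a sketch of the idea only, not the csv-parser implementation: the regular expression ignores escaped quotes, and the sample access-log line is made up for illustration.

```python
import re

COLUMNS = [
    "APACHE.CLIENT_IP", "APACHE.IDENT_NAME", "APACHE.USER_NAME",
    "APACHE.TIMESTAMP", "APACHE.REQUEST_URL", "APACHE.REQUEST_STATUS",
    "APACHE.CONTENT_LENGTH", "APACHE.REFERER", "APACHE.USER_AGENT",
    "APACHE.PROCESS_TIME", "APACHE.SERVER_NAME",
]

# A column is either "quoted", [bracketed], or a run of non-space characters.
TOKEN = re.compile(r'"([^"]*)"|\[([^\]]*)\]|(\S+)')

def split_columns(line):
    """Split a line on spaces, honoring "" and [] as quote pairs."""
    return [next(g for g in m.groups() if g is not None)
            for m in TOKEN.finditer(line)]

def parse_apache(line):
    """Map the split columns onto the column names from the slide."""
    return dict(zip(COLUMNS, split_columns(line)))

# Hypothetical access-log line matching the eleven columns above.
line = ('192.0.2.10 - frank [10/Oct/2000:13:55:36 -0700] '
        '"GET /index.html HTTP/1.0" 200 2326 "-" "Mozilla/5.0" 0 www.example.com')
record = parse_apache(line)
print(record["APACHE.USER_NAME"], record["APACHE.REQUEST_STATUS"])  # frank 200
```

Once the user name is available as a separate field, using it in a file name, as the `d_file` destination does, becomes straightforward.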
16
KEY=VALUE PARSER
Finds key=value pairs in messages
Introduced in version 3.7.
Typical in firewalls, like:
2016-03-04T07:10:19-05:00 127.0.0.1 zorp/http[3486]: core.summary(4):
(svc/http#0/http/intraPLUGinter:267346): Connection summary; rule_id='51',
session_start='1451980783', session_end='1451980784', client_proto='TCP',
client_address='172.168.65.4', client_port='56084', client_zone='office',
server_proto='TCP', server_address='173.252.120.68', server_port='443',
server_zone='internet', client_local='173.252.120.68', client_local_port='443',
server_local='91.120.23.97', server_local_port='46472', verdict='ACCEPTED', info=''
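A minimal Python sketch of the same idea, extracting key='value' pairs from a message like the one above. It is an illustration rather than the syslog-ng parser: it assumes single-quoted values without embedded quotes, and the sample message is a shortened version of the firewall log.

```python
import re

# Matches key='value'; assumes values never contain a single quote.
KV_PAIR = re.compile(r"(\w+)='([^']*)'")

def parse_kv(message):
    """Return all key='value' pairs found in the message as a dict."""
    return dict(KV_PAIR.findall(message))

message = ("Connection summary; rule_id='51', client_proto='TCP', "
           "client_address='172.168.65.4', server_port='443', "
           "verdict='ACCEPTED', info=''")
pairs = parse_kv(message)
print(pairs["client_address"], pairs["verdict"])  # 172.168.65.4 ACCEPTED
```

Because the keys are already in the message, no per-message pattern is needed, unlike the free-form case.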
17
PATTERNDB PARSER
PatternDB message parser
Can extract useful information from unstructured messages into name-value pairs
● Add status fields based on message text
● Message classification (like LogCheck)
Needs XML describing log messages
Example: an ssh login failure:
● Parsed: app=sshd, user=root, source_ip=192.168.123.45
● Added: action=login, status=failure
● Classified as “violation”
18
PARSERS WRITTEN IN RUST
Rust parsers
● Rust: https://www.rust-lang.org/
● Supported from 3.8
Example modules in separate repository: https://github.com/balabit/syslog-ng-rust-modules/
● Regexp parser
● Actiondb parser (similar to, but easier to use than patterndb)
● Correlation parser
19
ANONYMIZING MESSAGES
Many regulations about what can be logged
● PCI-DSS: credit card numbers
● Europe: IP addresses, user names
Locating sensitive information:
● Regular expressions: slow, but also work on unknown logs
● Patterndb: fast, but works only on known log messages
Anonymizing:
● Overwrite it with a constant
● Overwrite it with a hash of the original
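The two anonymization strategies can be sketched in Python as follows. This is an illustration under stated assumptions, not syslog-ng code: the IPv4 pattern is deliberately simple and does not validate octet ranges, and the function names, the `salt` parameter and the 16-character truncation of the hash are choices made for this sketch.

```python
import hashlib
import re

# Simple IPv4 matcher for illustration; it does not validate octet ranges.
IPV4 = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")

def mask_with_constant(message, replacement="0.0.0.0"):
    """Overwrite every IPv4 address with the same constant."""
    return IPV4.sub(replacement, message)

def mask_with_hash(message, salt=""):
    """Overwrite every IPv4 address with a truncated SHA-256 of the original."""
    def digest(match):
        data = (salt + match.group(0)).encode("utf-8")
        return hashlib.sha256(data).hexdigest()[:16]
    return IPV4.sub(digest, message)

line = "Accepted publickey for root from 127.0.0.1 port 48806 ssh2"
print(mask_with_constant(line))
print(mask_with_hash(line, salt="example-salt"))
```

A constant removes the information entirely, while a hash hides the value but maps the same address to the same token, so events can still be correlated.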
20
CONFIGURATION
● “Don't Panic”
● Simple and logical, even if it looks difficult at first
● Pipeline model:
  ● Many different building blocks (sources, destinations, filters, parsers, etc.)
  ● Connected into a pipeline using “log” statements
21
syslog-ng.conf: global options
@version:3.7
@include "scl.conf"
# this is a comment :)
options {
    flush_lines (0);
    # [...]
    keep_hostname (yes);
};
22
syslog-ng.conf: sources
source s_sys {
    system();
    internal();
};
source s_net {
    udp(ip(0.0.0.0) port(514));
};
23
syslog-ng.conf: destinations
destination d_mesg { file("/var/log/messages"); };
destination d_es {
    elasticsearch(
        index("syslog-ng_${YEAR}.${MONTH}.${DAY}")
        type("test")
        cluster("syslog-ng")
        template("$(format-json --scope rfc3164 --scope nv-pairs --exclude R_DATE --key ISODATE)\n");
    );
};
24
syslog-ng.conf: filters, parsers
filter f_nodebug { level(info..emerg); };
filter f_messages {
    level(info..emerg) and
    not (facility(mail)
         or facility(authpriv)
         or facility(cron));
};
parser pattern_db {
    db-parser(file("/opt/syslog-ng/etc/patterndb.xml"));
};
25
syslog-ng.conf: logpath
log { source(s_sys); filter(f_messages); destination(d_mesg); };
log {
    source(s_net);
    source(s_sys);
    filter(f_nodebug);
    parser(pattern_db);
    destination(d_es);
    flags(flow-control);
};
26
Patterndb & ElasticSearch & Kibana
27
SCALING SYSLOG-NG
● Client – Relay – Server instead of Client – Server
● Distribute some of the processing to Client/Relay
28
LOG ROUTING
● Based on filtering
● Send the right logs to the right places
● Message parsing can increase accuracy
● E-mail on root logins
● Can optimize SIEM / log analyzer tools
● Only relevant messages: cheaper licensing
● Throttling: evening out peaks
29
WHAT IS NEW IN SYSLOG-NG 3.8
● Disk-based buffering
● grouping-by(): correlation independent of patterndb
● Parsers written in Rust
● Elasticsearch 2.x support
● Curl (HTTP) destination
● Performance improvements
● Many more :-)
30
SYSLOG-NG BENEFITS FOR LARGE ENVIRONMENTS
● High-performance reliable log collection
● Simplified architecture: single application for both syslog and application data
● Easier-to-use data: parsed and presented in a ready-to-use format
● Lower load on destinations: efficient message filtering and routing
31
JOINING THE COMMUNITY
● syslog-ng: http://syslog-ng.org/
● Source on GitHub: https://github.com/balabit/syslog-ng
● Mailing list: https://lists.balabit.hu/pipermail/syslog-ng/
● IRC: #syslog-ng on freenode
32
QUESTIONS?
My blog: http://czanik.blogs.balabit.com/
My e-mail: peter.czanik@balabit.com
Twitter: https://twitter.com/PCzanik
33
SAMPLE XML
<?xml version='1.0' encoding='UTF-8'?>
<patterndb version='3' pub_date='2010-07-13'>
  <ruleset name='opensshd' id='2448293e-6d1c-412c-a418-a80025639511'>
    <pattern>sshd</pattern>
    <rules>
      <rule provider="patterndb" id="4dd5a329-da83-4876-a431-ddcb59c2858c" class="system">
        <patterns>
          <pattern>Accepted @ESTRING:usracct.authmethod: @for @ESTRING:usracct.username: @from @ESTRING:usracct.device: @port @ESTRING:: @@ANYSTRING:usracct.service@</pattern>
        </patterns>
        <examples>
          <example>
            <test_message program="sshd">Accepted password for bazsi from 127.0.0.1 port 48650 ssh2</test_message>
            <test_values>
              <test_value name="usracct.username">bazsi</test_value>
              <test_value name="usracct.authmethod">password</test_value>
              <test_value name="usracct.device">127.0.0.1</test_value>
              <test_value name="usracct.service">ssh2</test_value>
            </test_values>
          </example>
        </examples>
        <values>
          <value name="usracct.type">login</value>
          <value name="usracct.sessionid">$PID</value>
          <value name="usracct.application">$PROGRAM</value>
          <value name="secevt.verdict">ACCEPT</value>
        </values>
      </rule>
    </rules>
  </ruleset>
</patterndb>

More Related Content

What's hot

Trac Project And Process Management For Developers And Sys Admins Presentation
Trac  Project And Process Management For Developers And Sys Admins PresentationTrac  Project And Process Management For Developers And Sys Admins Presentation
Trac Project And Process Management For Developers And Sys Admins Presentation
guest3fc4fa
 
HTML5 Programming
HTML5 ProgrammingHTML5 Programming
HTML5 Programming
hotrannam
 

What's hot (20)

Elasticsearch for Logs & Metrics - a deep dive
Elasticsearch for Logs & Metrics - a deep diveElasticsearch for Logs & Metrics - a deep dive
Elasticsearch for Logs & Metrics - a deep dive
 
Move Over, Rsync
Move Over, RsyncMove Over, Rsync
Move Over, Rsync
 
Collect distributed application logging using fluentd (EFK stack)
Collect distributed application logging using fluentd (EFK stack)Collect distributed application logging using fluentd (EFK stack)
Collect distributed application logging using fluentd (EFK stack)
 
MySQL Slow Query log Monitoring using Beats & ELK
MySQL Slow Query log Monitoring using Beats & ELKMySQL Slow Query log Monitoring using Beats & ELK
MySQL Slow Query log Monitoring using Beats & ELK
 
In the Wake of Kerberoast
In the Wake of KerberoastIn the Wake of Kerberoast
In the Wake of Kerberoast
 
LogStash in action
LogStash in actionLogStash in action
LogStash in action
 
Monitoring Docker with ELK
Monitoring Docker with ELKMonitoring Docker with ELK
Monitoring Docker with ELK
 
Attack monitoring using ElasticSearch Logstash and Kibana
Attack monitoring using ElasticSearch Logstash and KibanaAttack monitoring using ElasticSearch Logstash and Kibana
Attack monitoring using ElasticSearch Logstash and Kibana
 
Testing Wi-Fi with OSS Tools
Testing Wi-Fi with OSS ToolsTesting Wi-Fi with OSS Tools
Testing Wi-Fi with OSS Tools
 
Logstash
LogstashLogstash
Logstash
 
OWASP ZAP Workshop for QA Testers
OWASP ZAP Workshop for QA TestersOWASP ZAP Workshop for QA Testers
OWASP ZAP Workshop for QA Testers
 
Advanced troubleshooting linux performance
Advanced troubleshooting linux performanceAdvanced troubleshooting linux performance
Advanced troubleshooting linux performance
 
Logging for OpenStack - Elasticsearch, Fluentd, Logstash, Kibana
Logging for OpenStack - Elasticsearch, Fluentd, Logstash, KibanaLogging for OpenStack - Elasticsearch, Fluentd, Logstash, Kibana
Logging for OpenStack - Elasticsearch, Fluentd, Logstash, Kibana
 
From pets to cattle - powered by CoreOS, docker, Mesos & nginx
From pets to cattle - powered by CoreOS, docker, Mesos & nginxFrom pets to cattle - powered by CoreOS, docker, Mesos & nginx
From pets to cattle - powered by CoreOS, docker, Mesos & nginx
 
Large Scale Log collection using LogStash & mongoDB
Large Scale Log collection using LogStash & mongoDB Large Scale Log collection using LogStash & mongoDB
Large Scale Log collection using LogStash & mongoDB
 
ELK Ruminating on Logs (Zendcon 2016)
ELK Ruminating on Logs (Zendcon 2016)ELK Ruminating on Logs (Zendcon 2016)
ELK Ruminating on Logs (Zendcon 2016)
 
Open Source Logging and Monitoring Tools
Open Source Logging and Monitoring ToolsOpen Source Logging and Monitoring Tools
Open Source Logging and Monitoring Tools
 
Passive DNS Collection -- the 'dnstap' approach, by Paul Vixie [APNIC 38 / AP...
Passive DNS Collection -- the 'dnstap' approach, by Paul Vixie [APNIC 38 / AP...Passive DNS Collection -- the 'dnstap' approach, by Paul Vixie [APNIC 38 / AP...
Passive DNS Collection -- the 'dnstap' approach, by Paul Vixie [APNIC 38 / AP...
 
Trac Project And Process Management For Developers And Sys Admins Presentation
Trac  Project And Process Management For Developers And Sys Admins PresentationTrac  Project And Process Management For Developers And Sys Admins Presentation
Trac Project And Process Management For Developers And Sys Admins Presentation
 
HTML5 Programming
HTML5 ProgrammingHTML5 Programming
HTML5 Programming
 

Similar to Scaling Your Logging Infrastructure With Syslog-NG

SCaLE 2016 - syslog-ng: From Raw Data to Big Data
SCaLE 2016 - syslog-ng: From Raw Data to Big DataSCaLE 2016 - syslog-ng: From Raw Data to Big Data
SCaLE 2016 - syslog-ng: From Raw Data to Big Data
BalaBit
 
syslog-ng: from log collection to processing and information extraction
syslog-ng: from log collection to processing and information extractionsyslog-ng: from log collection to processing and information extraction
syslog-ng: from log collection to processing and information extraction
BalaBit
 

Similar to Scaling Your Logging Infrastructure With Syslog-NG (20)

SCaLE 2016 - syslog-ng: From Raw Data to Big Data
SCaLE 2016 - syslog-ng: From Raw Data to Big DataSCaLE 2016 - syslog-ng: From Raw Data to Big Data
SCaLE 2016 - syslog-ng: From Raw Data to Big Data
 
2015. Libre Software Meeting - syslog-ng: from log collection to processing a...
2015. Libre Software Meeting - syslog-ng: from log collection to processing a...2015. Libre Software Meeting - syslog-ng: from log collection to processing a...
2015. Libre Software Meeting - syslog-ng: from log collection to processing a...
 
LOADays 2015 - syslog-ng - from log collection to processing and infomation e...
LOADays 2015 - syslog-ng - from log collection to processing and infomation e...LOADays 2015 - syslog-ng - from log collection to processing and infomation e...
LOADays 2015 - syslog-ng - from log collection to processing and infomation e...
 
syslog-ng: from log collection to processing and information extraction
syslog-ng: from log collection to processing and information extractionsyslog-ng: from log collection to processing and information extraction
syslog-ng: from log collection to processing and information extraction
 
Central LogFile Storage. ELK stack Elasticsearch, Logstash and Kibana.
Central LogFile Storage. ELK stack Elasticsearch, Logstash and Kibana.Central LogFile Storage. ELK stack Elasticsearch, Logstash and Kibana.
Central LogFile Storage. ELK stack Elasticsearch, Logstash and Kibana.
 
GrayLog for Java developers FOSDEM 2018
GrayLog for Java developers FOSDEM 2018GrayLog for Java developers FOSDEM 2018
GrayLog for Java developers FOSDEM 2018
 
PostgresOpen 2013 A Comparison of PostgreSQL Encryption Options
PostgresOpen 2013 A Comparison of PostgreSQL Encryption OptionsPostgresOpen 2013 A Comparison of PostgreSQL Encryption Options
PostgresOpen 2013 A Comparison of PostgreSQL Encryption Options
 
Syslog Centralization Logging with Windows ~ A techXpress Guide
Syslog Centralization Logging with Windows ~ A techXpress GuideSyslog Centralization Logging with Windows ~ A techXpress Guide
Syslog Centralization Logging with Windows ~ A techXpress Guide
 
Developing Realtime Data Pipelines With Apache Kafka
Developing Realtime Data Pipelines With Apache KafkaDeveloping Realtime Data Pipelines With Apache Kafka
Developing Realtime Data Pipelines With Apache Kafka
 
Kamailio - Secure Communication
Kamailio - Secure CommunicationKamailio - Secure Communication
Kamailio - Secure Communication
 
Hunting for APT in network logs workshop presentation
Hunting for APT in network logs workshop presentationHunting for APT in network logs workshop presentation
Hunting for APT in network logs workshop presentation
 
Ravi kumar
Ravi kumarRavi kumar
Ravi kumar
 
Fedora Developer's Conference 2014 Talk
Fedora Developer's Conference 2014 TalkFedora Developer's Conference 2014 Talk
Fedora Developer's Conference 2014 Talk
 
D4 Project Presentation
D4 Project PresentationD4 Project Presentation
D4 Project Presentation
 
Advanced Log Processing
Advanced Log ProcessingAdvanced Log Processing
Advanced Log Processing
 
Elk presentation 2#3
Elk presentation 2#3Elk presentation 2#3
Elk presentation 2#3
 
WarsawITDays_ ApacheNiFi202
WarsawITDays_ ApacheNiFi202WarsawITDays_ ApacheNiFi202
WarsawITDays_ ApacheNiFi202
 
Secure network
Secure networkSecure network
Secure network
 
CODEONTHEBEACH_Streaming Applications with Apache Pulsar
CODEONTHEBEACH_Streaming Applications with Apache PulsarCODEONTHEBEACH_Streaming Applications with Apache Pulsar
CODEONTHEBEACH_Streaming Applications with Apache Pulsar
 
Low cost multi-sensor IDS system
Low cost multi-sensor IDS systemLow cost multi-sensor IDS system
Low cost multi-sensor IDS system
 

More from All Things Open

Open Source and Public Policy
Open Source and Public PolicyOpen Source and Public Policy
Open Source and Public Policy
All Things Open
 
Weaving Microservices into a Unified GraphQL Schema with graph-quilt - Ashpak...
Weaving Microservices into a Unified GraphQL Schema with graph-quilt - Ashpak...Weaving Microservices into a Unified GraphQL Schema with graph-quilt - Ashpak...
Weaving Microservices into a Unified GraphQL Schema with graph-quilt - Ashpak...
All Things Open
 
How to Write & Deploy a Smart Contract
How to Write & Deploy a Smart ContractHow to Write & Deploy a Smart Contract
How to Write & Deploy a Smart Contract
All Things Open
 
Scaling Web Applications with Background
Scaling Web Applications with BackgroundScaling Web Applications with Background
Scaling Web Applications with Background
All Things Open
 
Build Developer Experience Teams for Open Source
Build Developer Experience Teams for Open SourceBuild Developer Experience Teams for Open Source
Build Developer Experience Teams for Open Source
All Things Open
 
Sudo – Giving access while staying in control
Sudo – Giving access while staying in controlSudo – Giving access while staying in control
Sudo – Giving access while staying in control
All Things Open
 
Fortifying the Future: Tackling Security Challenges in AI/ML Applications
Fortifying the Future: Tackling Security Challenges in AI/ML ApplicationsFortifying the Future: Tackling Security Challenges in AI/ML Applications
Fortifying the Future: Tackling Security Challenges in AI/ML Applications
All Things Open
 
Securing Cloud Resources Deployed with Control Planes on Kubernetes using Gov...
Securing Cloud Resources Deployed with Control Planes on Kubernetes using Gov...Securing Cloud Resources Deployed with Control Planes on Kubernetes using Gov...
Securing Cloud Resources Deployed with Control Planes on Kubernetes using Gov...
All Things Open
 

More from All Things Open (20)

Building Reliability - The Realities of Observability
Building Reliability - The Realities of ObservabilityBuilding Reliability - The Realities of Observability
Building Reliability - The Realities of Observability
 
Modern Database Best Practices
Modern Database Best PracticesModern Database Best Practices
Modern Database Best Practices
 
Open Source and Public Policy
Open Source and Public PolicyOpen Source and Public Policy
Open Source and Public Policy
 
Weaving Microservices into a Unified GraphQL Schema with graph-quilt - Ashpak...
Weaving Microservices into a Unified GraphQL Schema with graph-quilt - Ashpak...Weaving Microservices into a Unified GraphQL Schema with graph-quilt - Ashpak...
Weaving Microservices into a Unified GraphQL Schema with graph-quilt - Ashpak...
 
The State of Passwordless Auth on the Web - Phil Nash
The State of Passwordless Auth on the Web - Phil NashThe State of Passwordless Auth on the Web - Phil Nash
The State of Passwordless Auth on the Web - Phil Nash
 
Total ReDoS: The dangers of regex in JavaScript
Total ReDoS: The dangers of regex in JavaScriptTotal ReDoS: The dangers of regex in JavaScript
Total ReDoS: The dangers of regex in JavaScript
 
What Does Real World Mass Adoption of Decentralized Tech Look Like?
What Does Real World Mass Adoption of Decentralized Tech Look Like?What Does Real World Mass Adoption of Decentralized Tech Look Like?
What Does Real World Mass Adoption of Decentralized Tech Look Like?
 
How to Write & Deploy a Smart Contract
How to Write & Deploy a Smart ContractHow to Write & Deploy a Smart Contract
How to Write & Deploy a Smart Contract
 
Spinning Your Drones with Cadence Workflows, Apache Kafka and TensorFlow
 Spinning Your Drones with Cadence Workflows, Apache Kafka and TensorFlow Spinning Your Drones with Cadence Workflows, Apache Kafka and TensorFlow
Spinning Your Drones with Cadence Workflows, Apache Kafka and TensorFlow
 
DEI Challenges and Success
DEI Challenges and SuccessDEI Challenges and Success
DEI Challenges and Success
 
Scaling Web Applications with Background
Scaling Web Applications with BackgroundScaling Web Applications with Background
Scaling Web Applications with Background
 
Supercharging tutorials with WebAssembly
Supercharging tutorials with WebAssemblySupercharging tutorials with WebAssembly
Supercharging tutorials with WebAssembly
 
Using SQL to Find Needles in Haystacks
Using SQL to Find Needles in HaystacksUsing SQL to Find Needles in Haystacks
Using SQL to Find Needles in Haystacks
 
Configuration Security as a Game of Pursuit Intercept
Configuration Security as a Game of Pursuit InterceptConfiguration Security as a Game of Pursuit Intercept
Configuration Security as a Game of Pursuit Intercept
 
Scaling an Open Source Sponsorship Program
Scaling an Open Source Sponsorship ProgramScaling an Open Source Sponsorship Program
Scaling an Open Source Sponsorship Program
 
Build Developer Experience Teams for Open Source
Build Developer Experience Teams for Open SourceBuild Developer Experience Teams for Open Source
Build Developer Experience Teams for Open Source
 
Deploying Models at Scale with Apache Beam
Deploying Models at Scale with Apache BeamDeploying Models at Scale with Apache Beam
Deploying Models at Scale with Apache Beam
 
Sudo – Giving access while staying in control
Sudo – Giving access while staying in controlSudo – Giving access while staying in control
Sudo – Giving access while staying in control
 
Fortifying the Future: Tackling Security Challenges in AI/ML Applications
Fortifying the Future: Tackling Security Challenges in AI/ML ApplicationsFortifying the Future: Tackling Security Challenges in AI/ML Applications
Fortifying the Future: Tackling Security Challenges in AI/ML Applications
 
Securing Cloud Resources Deployed with Control Planes on Kubernetes using Gov...
Securing Cloud Resources Deployed with Control Planes on Kubernetes using Gov...Securing Cloud Resources Deployed with Control Planes on Kubernetes using Gov...
Securing Cloud Resources Deployed with Control Planes on Kubernetes using Gov...
 

Recently uploaded

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Recently uploaded (20)

Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Scaling Your Logging Infrastructure With Syslog-NG

  • 1. SCALING YOUR LOGGING INFRASTRUCTURE WITH SYSLOG-NG All Things Open 2016 Peter Czanik / Balabit
  • 2. 2 ABOUT ME  Peter Czanik from Hungary  Community Manager at Balabit: syslog-ng upstream  syslog-ng packaging, support, advocacy Balabit is an IT security company with development HQ in Budapest, Hungary Over 200 employees: the majority are engineers
  • 3. 3 syslog-ng Logging Recording events, such as: Jan 14 11:38:48 linux-0jbu sshd[7716]: Accepted publickey for root from 127.0.0.1 port 48806 ssh2 syslog-ng Enhanced logging daemon with a focus on high-performance central log collection.
  • 4. 4 WHY CENTRAL LOGGING? EASE OF USE one place to check instead of many AVAILABILITY even if the sender machine is down SECURITY logs are available even if sender machine is compromised
  • 5. 5 MAIN SYSLOG-NG ROLES collector processor filter storage (or forwarder)
  • 6. 6 ROLE: DATA COLLECTOR Collect system and application logs together: contextual data for either side A wide variety of platform-specific sources:  /dev/log & co  Journal, Sun streams Receive syslog messages over the network:  Legacy or RFC5424, UDP/TCP/TLS Logs or any kind of data from applications:  Through files, sockets, pipes, etc.  Application output
  • 7. 7 ROLE: PROCESSING Classify, normalize and structure logs with built-in parsers:  CSV-parser, DB-parser (PatternDB), JSON parser, key=value parser and more to come Rewrite messages:  For example anonymization Reformatting messages using templates:  Destination might need a specific format (ISO date, JSON, etc.) Enrich data:  GeoIP  Additional fields based on message content
  • 8. 8 ROLE: DATA FILTERING Main uses:  Discarding surplus logs (not storing debug level messages)  Message routing (login events to SIEM) Many possibilities:  Based on message content, parameters or macros  Using comparisons, wildcards, regular expressions and functions  Combining all of these with Boolean operators
  • 9. 9 ROLE: DESTINATIONS “TRADITIONAL” ● File, network, TLS, SQL, etc. “BIG DATA” ● Distributed file systems: ● Hadoop ● NoSQL databases: ● MongoDB ● Elasticsearch ● Messaging systems: ● Kafka
  • 10. 10 WHICH SYSLOG-NG VERSION IS THE FASTEST?  Project started in 1998  Version 3.3 added multi-threading in 2011  Latest stable version is 3.8, released two months ago
  • 11. 11 Kindle e-book reader on a plane :) Version 1.6 BMW i3 all electric car Version 3.4
  • 12. 12 FREE-FORM LOG MESSAGES Most log messages are: date + hostname + text Mar 11 13:37:56 linux-6965 sshd[4547]: Accepted keyboard-interactive/pam for root from 127.0.0.1 port 46048 ssh2  Text = English sentence with some variable parts  Easy to read by a human  Difficult to process them with scripts
  • 13. 13 SOLUTION: STRUCTURED LOGGING  Events represented as name-value pairs  Example: an ssh login: app=sshd user=root source_ip=192.168.123.45  syslog-ng: name-value pairs inside  Date, facility, priority, program name, pid, etc.  Parsers in syslog-ng can turn unstructured and some structured data (CSV, JSON) into name-value pairs
  • 14. 14 JSON PARSER Turns JSON-based log messages into name-value pairs {"PROGRAM":"prg00000","PRIORITY":"info","PID":"1234","MESSAGE":"seq: 0000000000, thread: 0000, runid: 1374490607, stamp: 2013-07-22T12:56:47 MESSAGE... ","HOST":"localhost","FACILITY":"auth","DATE":"Jul 22 12:56:47"}
  • 15. CSV PARSER
    Parses columnar data into fields
        parser p_apache {
            csv-parser(
                columns("APACHE.CLIENT_IP", "APACHE.IDENT_NAME", "APACHE.USER_NAME",
                        "APACHE.TIMESTAMP", "APACHE.REQUEST_URL", "APACHE.REQUEST_STATUS",
                        "APACHE.CONTENT_LENGTH", "APACHE.REFERER", "APACHE.USER_AGENT",
                        "APACHE.PROCESS_TIME", "APACHE.SERVER_NAME")
                flags(escape-double-char,strip-whitespace)
                delimiters(" ")
                quote-pairs('""[]')
            );
        };
        destination d_file {
            file("/var/log/messages-${APACHE.USER_NAME:-nouser}");
        };
        log { source(s_local); parser(p_apache); destination(d_file); };
  • 16. KEY=VALUE PARSER
    Finds key=value pairs in messages
    Introduced in version 3.7.
    Typical in firewalls, like:
        2016-03-04T07:10:19-05:00 127.0.0.1 zorp/http[3486]: core.summary(4): (svc/http#0/http/intraPLUGinter:267346): Connection summary; rule_id='51', session_start='1451980783', session_end='1451980784', client_proto='TCP', client_address='172.168.65.4', client_port='56084', client_zone='office', server_proto='TCP', server_address='173.252.120.68', server_port='443', server_zone='internet', client_local='173.252.120.68', client_local_port='443', server_local='91.120.23.97', server_local_port='46472', verdict='ACCEPTED', info=''
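    A possible way to apply the key=value parser to firewall logs such as the one above, and to store the extracted pairs as JSON, is sketched below. The parser name, the "kv." prefix, the destination name and the file path are assumptions made for this example; s_net is the network source from the sources slide.

    ```
    # Extract key=value pairs from the message and store them under the "kv." prefix.
    parser p_kv { kv-parser(prefix("kv.")); };

    # Hypothetical destination: write only the extracted pairs, formatted as JSON.
    destination d_kv_json {
        file("/var/log/firewall.json"
             template("$(format-json --key kv.*)\n"));
    };

    log { source(s_net); parser(p_kv); destination(d_kv_json); };
    ```

    Once parsed, fields such as kv.verdict or kv.client_zone can also be used in filters or in file name templates.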
  • 17. PATTERNDB PARSER
    PatternDB message parser
    Can extract useful information from unstructured messages into name-value pairs
    ● Add status fields based on message text
    ● Message classification (like LogCheck)
    Needs XML describing log messages
    Example: an ssh login failure:
    ● Parsed: app=sshd, user=root, source_ip=192.168.123.45
    ● Added: action=login, status=failure
    ● Classified as “violation”
  • 18. PARSERS WRITTEN IN RUST
    Rust parsers
    ● Rust: https://www.rust-lang.org/
    ● Supported from 3.8
    Example modules in separate repository: https://github.com/balabit/syslog-ng-rust-modules/
    ● Regexp parser
    ● Actiondb parser (similar to, but easier to use than patterndb)
    ● Correlation parser
  • 19. ANONYMIZING MESSAGES
    Many regulations restrict what can be logged
    ● PCI-DSS: credit card numbers
    ● Europe: IP addresses, user names
    Locating sensitive information:
    ● Regular expressions: slow, but also work on unknown log messages
    ● Patterndb: fast, but works only on known log messages
    Anonymizing:
    ● Overwrite it with a constant
    ● Overwrite it with a hash of the original
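    The two anonymization approaches could be sketched with rewrite rules like the following. This is an illustration under assumptions: the rule names, the deliberately simple IPv4 regular expression and the hash length are chosen for this example, and the second rule assumes that a PatternDB rule has already extracted the usracct.username field.

    ```
    # Overwrite with a constant: replace anything that looks like an IPv4 address.
    rewrite r_anon_ip {
        subst('[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}', "0.0.0.0",
              value("MESSAGE"), flags("global"));
    };

    # Overwrite with a hash: replace a parsed user name with a truncated SHA-1 digest.
    rewrite r_hash_user {
        set("$(sha1 --length 12 ${usracct.username})", value("usracct.username"));
    };
    ```

    Either rule is activated by adding rewrite(r_anon_ip); or rewrite(r_hash_user); to a log statement. Hashing keeps events from the same user correlatable without storing the name itself.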
  • 20. CONFIGURATION
    ● “Don't Panic”
    ● Simple and logical, even if it looks difficult at first
    ● Pipeline model:
      ● Many different building blocks (sources, destinations, filters, parsers, etc.)
      ● Connected into a pipeline using “log” statements
  • 21. syslog-ng.conf: global options
        @version:3.7
        @include "scl.conf"
        # this is a comment :)
        options {
            flush_lines (0);
            # [...]
            keep_hostname (yes);
        };
  • 22. syslog-ng.conf: sources
        source s_sys { system(); internal(); };
        source s_net { udp(ip(0.0.0.0) port(514)); };
  • 23. syslog-ng.conf: destinations
        destination d_mesg { file("/var/log/messages"); };
        destination d_es {
            elasticsearch(
                index("syslog-ng_${YEAR}.${MONTH}.${DAY}")
                type("test")
                cluster("syslog-ng")
                template("$(format-json --scope rfc3164 --scope nv-pairs --exclude R_DATE --key ISODATE)\n")
            );
        };
  • 24. syslog-ng.conf: filters, parsers
        filter f_nodebug { level(info..emerg); };
        filter f_messages {
            level(info..emerg) and
            not (facility(mail) or facility(authpriv) or facility(cron));
        };
        parser pattern_db {
            db-parser(file("/opt/syslog-ng/etc/patterndb.xml"));
        };
  • 25. syslog-ng.conf: logpath
        log { source(s_sys); filter(f_messages); destination(d_mesg); };
        log {
            source(s_net);
            source(s_sys);
            filter(f_nodebug);
            parser(pattern_db);
            destination(d_es);
            flags(flow-control);
        };
  • 27. SCALING SYSLOG-NG
    ● Client – Relay – Server instead of Client – Server
    ● Distribute some of the processing to Client/Relay
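    A relay in such a setup could be configured roughly as follows. This is a sketch, not a recommended production setup: the host name central.example.com, the port, the object names and the buffer sizes are all assumptions, and f_nodebug is the filter from the filters slide. The disk-buffer() option requires syslog-ng 3.8 or later.

    ```
    # Relay: receive logs from clients over TCP.
    source s_clients { network(ip(0.0.0.0) port(514) transport("tcp")); };

    # Forward to the central server; buffer on disk if the server is unreachable.
    destination d_central {
        network("central.example.com" port(514) transport("tcp")
                disk-buffer(
                    mem-buf-size(10000000)      # bytes, illustrative value
                    disk-buf-size(2000000000)   # bytes, illustrative value
                    reliable(yes)
                ));
    };

    # Drop debug messages on the relay so the central server sees less traffic.
    log { source(s_clients); filter(f_nodebug); destination(d_central); flags(flow-control); };
    ```

    Filtering on the relay is one way to move part of the processing away from the central server; parsing could be moved in the same way.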
  • 28. LOG ROUTING
    ● Based on filtering
    ● Send the right logs to the right places
    ● Message parsing can increase accuracy
      ● E-mail on root logins
    ● Can optimize SIEM / log analyzer tools
      ● Only relevant messages: cheaper licensing
      ● Throttling: evening out peaks
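    Routing only relevant messages to a SIEM, with a cap on the sending rate, might be sketched like this. The filter and destination names, the host siem.example.com, the regular expression and the rate are assumptions for illustration; s_sys is the source from the sources slide, and throttle() is assumed here to be the mechanism behind the "throttling" bullet.

    ```
    # Match successful ssh logins by root.
    filter f_root_login {
        program("sshd") and match("Accepted .* for root from" value("MESSAGE"));
    };

    # Hypothetical SIEM destination, limited to about 1000 messages per second.
    destination d_siem {
        network("siem.example.com" port(514) transport("tcp")
                throttle(1000));
    };

    log { source(s_sys); filter(f_root_login); destination(d_siem); };
    ```

    Rate limiting holds messages back during peaks, so it is usually combined with a disk buffer on the same destination to avoid losing them.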
  • 29. WHAT IS NEW IN SYSLOG-NG 3.8
    ● Disk-based buffering
    ● grouping-by(): correlation independent of patterndb
    ● Parsers written in Rust
    ● Elasticsearch 2.x support
    ● Curl (HTTP) destination
    ● Performance improvements
    ● Many more :-)
  • 30. SYSLOG-NG BENEFITS FOR LARGE ENVIRONMENTS
    ● High-performance reliable log collection
    ● Simplified architecture: Single application for both syslog and application data
    ● Easier-to-use data: Parsed and presented in a ready-to-use format
    ● Lower load on destinations: Efficient message filtering and routing
  • 31. JOINING THE COMMUNITY
    ● syslog-ng: http://syslog-ng.org/
    ● Source on GitHub: https://github.com/balabit/syslog-ng
    ● Mailing list: https://lists.balabit.hu/pipermail/syslog-ng/
    ● IRC: #syslog-ng on freenode
  • 32. QUESTIONS?
    My blog: http://czanik.blogs.balabit.com/
    My e-mail: peter.czanik@balabit.com
    Twitter: https://twitter.com/PCzanik
  • 33. SAMPLE XML
        <?xml version='1.0' encoding='UTF-8'?>
        <patterndb version='3' pub_date='2010-07-13'>
          <ruleset name='opensshd' id='2448293e-6d1c-412c-a418-a80025639511'>
            <pattern>sshd</pattern>
            <rules>
              <rule provider="patterndb" id="4dd5a329-da83-4876-a431-ddcb59c2858c" class="system">
                <patterns>
                  <pattern>Accepted @ESTRING:usracct.authmethod: @for @ESTRING:usracct.username: @from @ESTRING:usracct.device: @port @ESTRING:: @@ANYSTRING:usracct.service@</pattern>
                </patterns>
                <examples>
                  <example>
                    <test_message program="sshd">Accepted password for bazsi from 127.0.0.1 port 48650 ssh2</test_message>
                    <test_values>
                      <test_value name="usracct.username">bazsi</test_value>
                      <test_value name="usracct.authmethod">password</test_value>
                      <test_value name="usracct.device">127.0.0.1</test_value>
                      <test_value name="usracct.service">ssh2</test_value>
                    </test_values>
                  </example>
                </examples>
                <values>
                  <value name="usracct.type">login</value>
                  <value name="usracct.sessionid">$PID</value>
                  <value name="usracct.application">$PROGRAM</value>
                  <value name="secevt.verdict">ACCEPT</value>
                </values>
              </rule>