In an increasingly complex and dynamic network, it is no longer enough to be a regex ninja and store only the logs you think you might need. From network traffic to custom logs, you cannot know in advance which logs will be crucial to stopping the next attacker, and if you are not planning to spend half of your security budget on a commercial solution, we will show you a way to build your own SIEM with open source. The talk goes from how to build a powerful logging environment for your organization to scaling in the cloud and storing everything forever. We will walk through how to build such a system with open source solutions such as Elasticsearch and Hadoop, and how to create your own custom monitoring rules to monitor everything you need. The talk also covers how to secure the environment and allow restricted access to other teams, as well as how to avoid common pitfalls and meet compliance standards.
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar Budget (Juan Berner & Hernan Costante)
1. Security Monitoring for big Infrastructures
without a Million Dollar budget
or
Monitoring like the NSA (at "Precios Cuidados" prices)
#eko10
2. About us
● Juan Berner
○ @89berner
○ Hobbies = ['Movies/Series','Reading','Programming']
○ Mostly Blue Team
○ http://secureandscalable.wordpress.com/
● Hernán Costante
○ @hachedece
○ Security Monitoring & Incident Response fan
○ Open Source lover ♥
3. About MercadoLibre
● Devops culture (everyone and their mothers can access the boxes)
● Hybrid Cloud of Openstack & Others (servers being destroyed constantly)
● Infrastructure as a service
● Database as a service
● Database servers > 1K && Servers > 15K
● Daily logs > 100GB (and growing)
4. What is this talk about?
● ELK (Elasticsearch - Logstash - Kibana)
● Controlling the infrastructure that supports it
● Monitoring at scale with open source tools
5. Outline
● Introduction to Monitoring
● How it used to be
● Background
● Implementation
● Demo
● Outro
7. Monitoring helps in
● Fulfilling compliance (PCI, SOX, BACEN, HIPAA, BCRA, etc)
● Not just trusting your audits (what happens in the meantime?)
● Crucial for Incident Response
● Know how your infrastructure works (you can’t protect what you don’t know
is there)
8. Some Warnings
● This talk is not an offensive talk (no 0days
coming up)
● Being free does not mean it has no cost
● You will need to invest in training your staff to
handle the infrastructure
● Your only limit is what you can build around it
9. What we mean is
● We will talk about a LOT of open source solutions
● Every setup can be different (choose what helps your environment)
● > 30k lines of code supporting the infrastructure (ruby, python, node.js and
go mostly)
● You will do most of the support but will not be limited by a vendor
● google -> irc -> mailing lists
10. We will talk about the old security
monitoring for just a moment
11. The old monitoring paradigm
● A lot of limitations
○ Limited storage
○ Only security logs
○ Select and filter inputs…
○ Regex everywhere: lifestyle & nightmares
○ Relational databases for storage
12. The old monitoring paradigm (2)
● Commercial SIEMs
○ Expensive
○ Closed hardware & software
○ Inflexible
○ Licenses & support & professional services ($$$)
○ You are learning about a product
○ Being a Gartner Magic Quadrant Leader doesn’t
resolve security incidents
19. New security monitoring paradigm
● Query your logs across huge amounts of data at any time
● Get fast responses
● Log absolutely everything... even the network flows
● Contextualization
● Behavior analysis & historical comparisons
● Holistic visualization
● Metadata (tags)
22. How we Implement it
● ELK (Elasticsearch - Logstash - Kibana)
● Archiving with Hadoop and Block Storage
● Centralized reporting tool
● Our own system to control our infrastructure
● A custom monitoring tool
23. Some Inputs
● Server logs
● Firewalls
● User activity
● WAF
● Databases
● Netflow
● Load Balancers
● DNS
● Honeypots
● Sflow
● IDS
● IPS
● Switches
● Routers
● Applications
● Storage
● Openldap
● Cloud logs
● etc..
If it can log, you can collect it.
24. Delivery - Shipper - Broker - Tagging - Storage
Delivery
● syslog, syslog-ng, rsyslog, nxlog, lumberjack
● Centralize all of the logs in one place
● Not just for shipping, you will need to keep them
● Consider some redundancy for fail over
● Not the same as shipping
25. Delivery - Shipper - Broker - Tagging - Storage
Meet the event
An sflow event:
Oct 23 18:59:40 my-host sflow: FLOW,10.10.10.10,137,0,0020cbba0000,00003e001111,0x0800,1,1,23.23.109.234,172.10.10.10,6,0x00,45,12345,80,0x18,336,318,1600
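To make the event above easier to read, here is a minimal Python sketch that splits it into named fields. The slide does not label the columns, so the field names are an assumption based on the usual sflowtool line layout for FLOW records; `parse_sflow_line` and `FLOW_FIELDS` are hypothetical names.

```python
# Field names are an assumption (sflowtool-style FLOW layout); the slide
# itself does not label the columns.
FLOW_FIELDS = [
    "agent", "in_port", "out_port", "src_mac", "dst_mac", "eth_type",
    "in_vlan", "out_vlan", "srcip", "dstip", "ip_proto", "ip_tos",
    "ip_ttl", "srcport", "dstport", "tcp_flags", "packet_size",
    "ip_size", "sampling_rate",
]


def parse_sflow_line(line):
    """Split a syslog-wrapped sflow FLOW record into a dict of named fields."""
    header, _, payload = line.partition(" sflow: ")
    values = payload.strip().split(",")
    if values[0] != "FLOW" or len(values) != len(FLOW_FIELDS) + 1:
        raise ValueError("not a FLOW record: %r" % line)
    event = dict(zip(FLOW_FIELDS, values[1:]))
    event["syslog_header"] = header  # timestamp and host as sent by syslog
    return event


line = ("Oct 23 18:59:40 my-host sflow: FLOW,10.10.10.10,137,0,0020cbba0000,"
        "00003e001111,0x0800,1,1,23.23.109.234,172.10.10.10,6,0x00,45,12345,80,"
        "0x18,336,318,1600")
event = parse_sflow_line(line)
print(event["srcip"], event["dstip"], event["dstport"])
```

In the pipeline described in this talk the same job is done by Logstash filters rather than by custom code; the sketch only shows what information the raw line carries.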
26. Delivery - Shipper - Broker - Tagging - Storage
Shipper
We are here!
The Logstash Book Version: v1.4.2.1
27. Delivery - Shipper - Broker - Tagging - Storage
Logstash
● Great as a shipper or indexer
● Awesome community and flexibility
● Allows tagging, metrics, hundreds of inputs and outputs
● Lots of codecs for encoding/decoding input/output
● You can generate actions based on events
28. Delivery - Shipper - Broker - Tagging - Storage
Broker
We are here!
29. Delivery - Shipper - Broker - Tagging - Storage
Broker
● We use Redis, but there are other options
● Allows for a better parallelization of event indexing
● At least 2 nodes for redundancy
● Buffer in case of failure (size the ram accordingly)
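As a rough illustration of "size the RAM accordingly", the Python sketch below estimates how much memory the broker needs to absorb an indexer outage. The 100 GB/day figure comes from the earlier slide; the one-hour outage and the 1.5x overhead factor are assumptions chosen only for the example, and `broker_buffer_bytes` is a hypothetical helper.

```python
def broker_buffer_bytes(daily_log_bytes, outage_seconds, overhead=1.5):
    """Estimate the RAM needed to queue events while indexing is down.

    overhead is an assumed multiplier for per-entry broker bookkeeping and
    traffic peaks; measure your own setup before trusting it.
    """
    bytes_per_second = daily_log_bytes / 86400.0
    return bytes_per_second * outage_seconds * overhead


# 100 GB of logs per day, indexers unavailable for one hour.
needed = broker_buffer_bytes(100e9, 3600)
print("buffer needed: %.2f GB" % (needed / 1e9))
```

Average rates hide peaks, so a real sizing exercise would use the busiest hour rather than the daily mean.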
30. Delivery - Shipper - Broker - Tagging - Storage
Tagging
We are here!
31. Delivery - Shipper - Broker - Tagging - Storage
Logstash Inputs
● How to get events into Logstash
● Many different plugins to use
● Lumberjack -> Logstash default shipper
● In this case the redis input is enough
input {
  redis {
    host => "10.0.0.1"
    type => "redis-input"
    data_type => "list"
    key => "logstash"
  }
}
32. Delivery - Shipper - Broker - Tagging - Storage
Logstash Filters
● They can help you parse, tag and modify
events on the fly
● GROK => Replacing regex with names
● You can build your own custom GROK
patterns
● Other useful filters such as Metrics,
Geoip, DNS, Anonymize, Date, etc..
filter {
  grok {
    pattern => "%{SYSLOGTIMESTAMP:date}...%{HOSTNAME:srcip},%{HOSTNAME:dstip}...%{NUMBER:srcport},%{NUMBER:dstport}..."
  }
  geoip {
    source => "dstip"
    target => "dst_geo"
    fields => ["country_code2"]
  }
  dns {
    resolve => [ "@dns" ]
    action => "replace"
  }
}
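The idea behind GROK, replacing raw regex with named patterns, can be sketched in a few lines of Python. This is not how Logstash implements it; the `PATTERNS` table holds simplified stand-ins for the real pattern library, and `grok_to_regex` is a hypothetical helper written for this example.

```python
import re

# Simplified stand-ins for grok's built-in patterns (assumption: the
# definitions shipped with Logstash are more thorough than these).
PATTERNS = {
    "NUMBER": r"\d+",
    "IPV4": r"\d{1,3}(?:\.\d{1,3}){3}",
    "WORD": r"\w+",
}


def grok_to_regex(expr):
    """Expand %{PATTERN:field} references into named capture groups."""
    def expand(match):
        pattern_name, field = match.group(1), match.group(2)
        return "(?P<%s>%s)" % (field, PATTERNS[pattern_name])
    return re.compile(re.sub(r"%\{(\w+):(\w+)\}", expand, expr))


rule = grok_to_regex("%{IPV4:srcip},%{IPV4:dstip},%{NUMBER:proto}")
found = rule.search("1,1,23.23.109.234,172.10.10.10,6,0x00")
fields = found.groupdict()
print(fields)
```

The payoff is readability: the rule names what it extracts, and the regex details live in one shared table instead of being repeated in every rule.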
33. Delivery - Shipper - Broker - Tagging - Storage
Logstash Outputs
● Most famously elasticsearch
● tcp, exec, email, statsd, s3..
output {
  elasticsearch_http {
    index => "logstash-%{+yyyy-MM-dd}-%{type}"
    host => "localhost"
    flush_size => 5000
    workers => 5
  }
}
● Can be used to spawn alerts (send me an email when a user logs in)
● Different outputs based on the type is possible
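The index pattern in the output above yields one index per day and per event type. A small Python sketch of that naming scheme follows, assuming the date part is taken from the event timestamp in UTC; `index_name` and `indices_for_range` are hypothetical helpers, useful for example when a query should only touch the indices that cover a given time range.

```python
from datetime import datetime, timedelta, timezone


def index_name(timestamp, event_type):
    """Name of the daily index an event of this type would be written to."""
    day = timestamp.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return "logstash-%s-%s" % (day, event_type)


def indices_for_range(start, end, event_type):
    """All daily indices that can hold events between start and end."""
    day = start.astimezone(timezone.utc).date()
    last = end.astimezone(timezone.utc).date()
    names = []
    while day <= last:
        names.append("logstash-%s-%s" % (day.isoformat(), event_type))
        day += timedelta(days=1)
    return names


ts = datetime(2014, 10, 23, 22, 59, 40, tzinfo=timezone.utc)
print(index_name(ts, "sflow"))
print(indices_for_range(ts - timedelta(days=2), ts, "sflow"))
```

Splitting by day and type keeps each index small and makes it cheap to drop or archive old data one index at a time.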
35. Delivery - Shipper - Broker - Tagging - Storage
Storage
We are here!
36. Delivery - Shipper - Broker - Tagging - Storage
Elasticsearch
● JSON data store built on top of Apache Lucene
● Documents are divided into indices, and indices into shards
● Allows replication and scales amazingly!
● Search Billions of records in seconds
● Great support for ELK
37. Delivery - Shipper - Broker - Tagging - Storage
Elasticsearch for Bulk Indexing
● We are talking about hundreds of millions of events per day
● Daily or hourly indices, increase refresh time
● Watch out for the bulk thread pool and caches
● Give most of the ram to the jvm
● Every setup is different
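For readers who have not seen bulk indexing, the sketch below builds the newline-delimited body that the Elasticsearch `_bulk` endpoint expects: one action line followed by one document line per event. It is only an illustration (in this setup Logstash does the batching), `bulk_body` is a hypothetical helper, and the `_type` field reflects the Elasticsearch 1.x releases in use at the time of the talk.

```python
import json


def bulk_body(events, index, doc_type):
    """Build a newline-delimited _bulk request body for a batch of events."""
    lines = []
    for event in events:
        # Action line: tells Elasticsearch where the next document goes.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        # Source line: the document itself.
        lines.append(json.dumps(event))
    return "\n".join(lines) + "\n"  # the body must end with a newline


events = [
    {"srcip": "172.10.10.10", "dstip": "23.23.80.130"},
    {"srcip": "172.10.10.11", "dstip": "23.23.80.131"},
]
body = bulk_body(events, "logstash-2014-10-23-sflow", "sflow")
print(body)
```

Sending thousands of events per request, as `flush_size => 5000` does in the output shown earlier, amortizes the per-request overhead that would otherwise dominate at this volume.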
39. Delivery - Shipper - Broker - Tagging - Storage
Elasticsearch Security
● Insecure by default (slowly changing)
● Jetty or elasticsearch-http-basic plugins
● Nginx or node.js proxy in front of kibana
(and log all the requests)
● Segmentation is the best bet yet to secure
the cluster
40. Delivery - Shipper - Broker - Tagging - Storage
What Elasticsearch is not for
● Not a primary data store
● There are no transactions, you might lose some data
● Few tools to help with reporting besides kibana
● Not stable enough (yet)
41. Delivery - Shipper - Broker - Tagging - Storage
Backup
● Filesystem replicas (hardware problems)
● Filesystem snapshots (human mistakes)
● External backup of your raw logs (total disaster)
● Int/Ext backup of your ES indices (to avoid reindexing)
42. Delivery - Shipper - Broker - Tagging - Storage
Archiving
● Hadoop
○ Open source!
○ Process large amounts of data
○ Distributed process & storage
○ Highly scalable (linearly) & fault tolerant
○ SQL language (with Hive or Impala)
● Excellent to store all our data in a queryable way!
43. Visualization
● Kibana!
● User browser connects to ES
● Charts / geo / details / etc
● Click to browse logs
● Timelines
● “Google” your logs
44. Visualization (2)
● For cluster state
○ ElasticHQ (free)
○ Marvel (commercial)
45. Reporting
● Avoid crons
● Hadoop is better than Elasticsearch for reporting
48. Prepare for failure
● Skitter
○ Most components will fail sometimes
○ Don’t just alert. Fix it if possible.
○ Sometimes you can just check the end of the flow.
○ If you are not controlling it, you can’t depend on it.
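The slides do not show how Skitter works internally, so the following Python sketch only illustrates two of the points above: checking the end of the flow (is fresh data still arriving in storage?) and trying a fix before alerting. All names and the five-minute threshold are assumptions.

```python
from datetime import datetime, timedelta, timezone


def flow_is_stale(last_indexed, now, max_lag=timedelta(minutes=5)):
    """True when the newest indexed event is older than max_lag."""
    return now - last_indexed > max_lag


def check_and_repair(last_indexed, now, repair):
    """Run the repair action when the flow is stale; report if it was needed."""
    if flow_is_stale(last_indexed, now):
        repair()  # e.g. restart a shipper or indexer (hypothetical action)
        return True
    return False


now = datetime(2014, 10, 23, 19, 0, tzinfo=timezone.utc)
actions = []
repaired = check_and_repair(now - timedelta(minutes=12), now,
                            lambda: actions.append("restart-indexer"))
print(repaired, actions)
```

Checking only the last stage is cheap: if recent events are reaching storage, every earlier stage must be working too.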
49. Alerts
● Inline
○ Attaching to the logs (Logstash / Syslog-ng)
○ Less flexibility
○ As you grow, your ability to correlate will decrease
● Batch
○ “Near real time”
○ The power of elasticsearch at your disposal
○ Great correlation capabilities (has this
happened in the last 6 months?)
○ Creating rules for behaviour not actions
50. Alerts
● Weaver
○ Modular approach
○ Tie behaviour from multiple sources
○ What would a hacker do? (nmap|nc) & cat /etc/passwd = Alert
○ Reduce false positives with statistics
○ There are services that can call you!
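The "(nmap|nc) & cat /etc/passwd" rule can be sketched as a check over command-audit events: alert only when the same user on the same host shows both behaviours. Weaver's real rule format is not shown in the slides, so the event fields (`host`, `user`, `command`) and the function name are hypothetical.

```python
import re
from collections import defaultdict

RECON = re.compile(r"^(nmap|nc)\b")              # scanning / raw connections
PASSWD_READ = re.compile(r"^cat\s+/etc/passwd\b")  # reading the account list


def suspicious_users(command_events):
    """Return (host, user) pairs that ran a recon tool AND read /etc/passwd."""
    seen = defaultdict(set)
    for event in command_events:
        key = (event["host"], event["user"])
        command = event["command"].strip()
        if RECON.search(command):
            seen[key].add("recon")
        if PASSWD_READ.search(command):
            seen[key].add("passwd")
    return sorted(key for key, kinds in seen.items()
                  if kinds == {"recon", "passwd"})


commands = [
    {"host": "web01", "user": "web", "command": "nc -v 10.0.0.5 22"},
    {"host": "web01", "user": "web", "command": "cat /etc/passwd"},
    {"host": "db01", "user": "dba", "command": "cat /etc/passwd"},
]
print(suspicious_users(commands))
```

Neither command is alarming on its own, which is the point of writing rules for behaviour rather than for single actions.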
51. Example of an Alert (1)
● We look for connections to countries other than AR (Argentina) in this time window
{ "query": { "filtered": {
    "query": { "match_all": { } },
    "filter": { "and": [
      { "bool": {
          "must_not": {
            "regexp": { "country_code2": "AR" } } } },
      { "range": {
          "@timestamp": {
            "from": "2014-10-12T12:20:45-03:00",
            "to": "2014-10-12T12:26:45-03:00" } } } ] } } } }
52. Example of an Alert (2)
● Guess who we found:
{
  "_index":"logstash-2014-10-23-sflow",
  "_type":"sflow", "_id":"JKWMv9J2T767IjxyasWjZw", … "_source":{
    …
    "srcip":"172.10.10.10", "dstip":"23.23.80.130", "dns":"ekoparty.org", ...
    "dst_geo":{ "country_code2":"US" } },
  "sort":[ 1414105180000 ] }
53. Example of an Alert (3)
● We check whether this connection has happened in the last 3 months
{ "query": { "filtered": {
    "query": { "match_all": { } },
    "filter": { "and": [
      { "bool": {
          "must": [
            { "term": { "srcip": "172.10.10.10" } },
            { "term": { "dstip": "23.23.80.130" } } ] } },
      { "range": {
          "@timestamp": {
            "from": "2014-07-12T12:19:45-03:00",
            "to": "2014-10-12T12:26:45-03:00" } } } ] } } } }
54. Example of an Alert (4)
● Our result is:
[] => Nothing
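The two steps so far (find recent connections, then ask whether each one has been seen before) can be summarized in a short Python sketch. The history lookup is abstracted as a callable so the example runs without a cluster; in practice it would be a query like the one on the previous slide. All names here are hypothetical.

```python
def new_connections(recent_hits, seen_before):
    """Keep only the hits whose (srcip, dstip) pair has no history."""
    candidates = []
    for hit in recent_hits:
        if not seen_before(hit["srcip"], hit["dstip"]):
            candidates.append(hit)
    return candidates


# Stand-in for the historical lookup: a set of pairs seen in the last 3 months.
history = {("172.10.10.20", "8.8.8.8")}

recent = [
    {"srcip": "172.10.10.10", "dstip": "23.23.80.130", "dns": "ekoparty.org"},
    {"srcip": "172.10.10.20", "dstip": "8.8.8.8", "dns": "example.invalid"},
]
candidates = new_connections(recent, lambda src, dst: (src, dst) in history)
print(candidates)
```

An empty history result, as on this slide, means the connection is new and stays a candidate for an alert.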
55. Example of an Alert (5)
● We now check which users and commands were active on that server in that
timeframe, as evidence to attach to the alert
{ "query": { "filtered": {
    "query": { "match_all": { } },
    "filter": { "and": [
      { "bool": {
          "must": {
            "term": { "host": "172.10.10.10" } } } },
      { "range": {
          "@timestamp": {
            "from": "2014-10-12T12:20:45-03:00",
            "to": "2014-10-12T12:26:45-03:00" } } } ] } } } }
56. Example of an Alert (6)
● We find different users and commands, and we don’t alert, because a user from
the networking group ran a command whose argument includes the address
resolved by the dns filter:
{ ....
  "xhost": "54.191.133.118",
  "realuser": "web",
  "group": "apache",
  "command": "ls"
}
{ ....
  "xhost": "54.191.133.118",
  "realuser": "net",
  "group": "networking",
  "command": "wget http://www.ekoparty.org/charlas-2014.php?a=2014&c=green&m=176"
}
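The final decision can be sketched as a small Python function: suppress the alert when a user from a trusted group ran a command that mentions the resolved name. The event fields match the output above; the function name and the idea of a configurable list of trusted groups are assumptions, not something the slides specify.

```python
def explained_by_command(dns_name, command_events, trusted_groups=("networking",)):
    """True if a trusted user ran a command that references dns_name."""
    for event in command_events:
        if event["group"] in trusted_groups and dns_name in event["command"]:
            return True
    return False


evidence = [
    {"realuser": "web", "group": "apache", "command": "ls"},
    {"realuser": "net", "group": "networking",
     "command": "wget http://www.ekoparty.org/charlas-2014.php?a=2014&c=green&m=176"},
]
should_alert = not explained_by_command("ekoparty.org", evidence)
print(should_alert)
```

If the same command had come from the `apache` group instead, the check would fail and the alert would fire with these events attached as evidence.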