Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit Data
1. IRM Summit 2014
Customer Intelligence: Using the ELK stack (Elasticsearch, Logstash and Kibana) to analyse ForgeRock OpenAM audit data
warren.strange@forgerock.com
4. OpenDJ, OpenIDM and OpenAM produce copious amounts of audit data
Analysis of that data is left as an exercise for the reader
Many great SIEM tools exist
Desire for an open source solution for data analysis
5. What is the ELK stack?
Elasticsearch: “NoSQL” database
Logstash: log collection and transformation
Kibana: data visualizer for Elasticsearch
6. Yes, but what does ELK do?
Collect, analyse and visualize data
Any kind of data
GitHub (8 million repos), SoundCloud (30M users), The Guardian (40M documents)
Answer questions:
● Where are my users coming from?
● What is the traffic in North America vs. Europe?
● Why do I see an employee logging in from Canada?
7. Elasticsearch
● NoSQL, REST/JSON, document oriented, schemaless, “Big data” full text search engine
● Apache 2.0 license
● Sweet spot is rapid full text search / ad hoc queries
● Not a replacement for an RDBMS
● Not transactional, not ACID, etc.
● Built on the Apache Lucene project
8. Logstash
● Swiss army knife of log collection, transformation and forwarding
● JRuby based
● Large footprint :-(
● lumberjack: a Go-based collector that feeds into logstash; very lightweight, small footprint
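Not shown on the slide: on the Logstash side, lumberjack traffic is received with the lumberjack input plugin. A minimal sketch — the port number and TLS certificate paths are placeholders (the lumberjack protocol requires TLS):

input {
  lumberjack {
    # port the lumberjack agents connect to (placeholder)
    port => 5043
    # TLS is mandatory for lumberjack; these paths are placeholders
    ssl_certificate => "/etc/pki/tls/certs/logstash.crt"
    ssl_key => "/etc/pki/tls/private/logstash.key"
  }
}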
10. Logstash flow
Input source (files, database, syslog, etc.) → Filters (grep, regex, geoIP, ...) → Output (elasticsearch, file, db, syslog)
“Plugin” based architecture. Add new plugins for input, output and filters.
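The three stages map directly to the three sections of a Logstash config file. A minimal sketch with one stock plugin per stage (stdin, mutate and stdout all ship with Logstash; the "demo" tag is purely illustrative):

input {
  stdin { }                        # read events from standard input
}
filter {
  mutate { add_tag => ["demo"] }   # tag every event so it can be searched by tag later
}
output {
  stdout { codec => rubydebug }    # pretty-print each event for inspection
}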
11. Logstash example
Input source (file: amAccess.*, type: amAccess) → Filter (map IP address to GEO location) → Output (elasticsearch:9100)
Read from the OpenAM access logs, add geo-location data, and write the result to Elasticsearch.
13. input {
  file {
    type => "amAccess"
    path => "/logs/am/log/amAuthentication.*"
  }
}
Input section. Wildcards can be used in the path. Data is tagged with a type; use this to classify and search by type.
14. filter {
  if [type] == "amAccess" {
    csv {
      columns => ["time", "Data", "LoginID", "ContextID", "IPAddr", "LogLevel",
                  "Domain", "LoggedBy", "MessageID", "ModuleName", "NameID",
                  "HostName"]
      separator => "	"
    }
    date {
      match => ["time", "yyyy-MM-dd HH:mm:ss"]
    }
    geoip {
      database => "/usr/share/GeoIP/GeoIP.dat"
      source => "IPAddr"
    }
  }
}
The filter applies only to events of type amAccess. Parse the data as CSV (the OpenAM log fields are tab-separated; the separator above is a literal tab). Normalize the date to a common format. Enrich the record with GEO location.
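The geoip filter adds a geoip sub-field to each event with attributes such as geoip.country_name (and, with a city-level database, geoip.location latitude/longitude) that Kibana panels can chart directly.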
15. output {
  stdout {
    codec => rubydebug
  }
  elasticsearch { host => "localhost" }
}
Output section. Send the data to both Elasticsearch and stdout.
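With Logstash 1.4 the whole pipeline is started with something like bin/logstash agent -f openam.conf (the config file name is assumed); by default the elasticsearch output writes to daily logstash-YYYY.MM.dd indices, which is the index pattern Kibana expects.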
16. Demo Time
As seen on YouTube!
http://youtu.be/tvrYuSLuGik
2749 views!
18. Marketing Genius?
Marketing wants to know where to hold the next ForgeRock Summit: Europe, the USA, or Canada?
They ask you to find out pronto:
● What country are customers visiting the ForgeRock website from?
● How are they authenticating (ForgeRock account, or federated)?
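Both questions fall out of the indexed data: a Kibana terms panel on geoip.country_name answers the first, and the ModuleName field parsed from the access log should distinguish local logins from federated (e.g. SAML) ones — exact field values depend on your OpenAM configuration.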
19. The next IRM Summit location:
We have beer! Bring your toque!
20. Next Steps
● Delivery models: cloud or appliance?
● Interested in collaborating? Share logstash configs, Kibana reports, etc.
● Puppet/Chef/Ansible/Docker installers?