Real time analysis
and visualization
ANUBISNETWORKS LABS
PTCORESEC
Agenda
 Who are we?
 AnubisNetworks Stream
 Stream Information Processing
 Adding Valuable Information to Stream Events
Who are we?
 Tiago Martins
 AnubisNetworks
 @Gank_101
 João Gouveia
 AnubisNetworks
 @jgouv
 Tiago Henriques
 Centralway
 @Balgan
Anubis StreamForce
 Events (lots and lots of events)
 Events are “volatile” by nature
 They exist only if someone is listening
 Remember?:
“If a tree falls in a forest and no one is
around to hear it, does it make a
sound?”
Anubis StreamForce
 Enter security Big Data
“a brave new world”
(Diagram: the three Vs – Volume, Variety, Velocity – with a "we are here" marker.)
Anubis StreamForce
 Problems (and ambitions) to tackle
 The huge amount and variety of data to process
 Mechanisms to share data across multiple systems,
organizations, teams, companies..
 Common API for dealing with all this (both from a
producer and a consumer perspective)
Anubis StreamForce
 Enter the security events CEP - StreamForce
High-performance, scalable Complex Event
Processor (CEP) – 1 node (commodity hardware) = 50k
evt/second
Uses streaming technology
Follows a publish/subscribe model
Anubis StreamForce
 Data format
Events are published in JSON format
Events are consumed in JSON format
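Since events are published and consumed as JSON over a stream, a consumer has to reassemble complete JSON documents from arbitrary network chunks. A minimal sketch of such a newline-delimited parser, assuming one event per line (the helper name and framing are our illustration, not the actual StreamForce client):

```javascript
// Sketch: reassembling newline-delimited JSON events from stream chunks.
// Chunks may split an event in the middle, so keep a carry buffer.
function makeStreamParser(onEvent) {
  let buffer = "";
  return function feed(chunk) {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop(); // keep trailing partial line for the next chunk
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed.length === 0) continue;
      onEvent(JSON.parse(trimmed)); // one complete JSON event per line
    }
  };
}
```

In NodeJS this would typically be wired to an `http.get` response via `res.on('data', feed)`.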
Anubis StreamForce
 Yes, we love JSON
Anubis StreamForce
Sharing Models
(Diagram: sources – sinkholes, data-theft trojans, IP reputation, passive DNS, traps/honeypots, Twitter, MFE, OpenSource/MailSpike community – feed the Complex Event Processing engine, which drives real-time feeds and dashboards.)
Anubis CyberFeed
 Feed galore!
Sinkhole data, traps, IP reputation, etc.
 Bespoke feeds (create your own view)
 Measure, group, correlate, de-duplicate...
 High volume (usually ~6,000 events per second), with more data added frequently
Anubis CyberFeed
 Apps (demo time)
Stream Information Processing
 Collecting events from the Stream.
 Generating reports.
 Real time visualization.
Challenge
 ~6k events/s and at peak over 10k events/s.
 Let's focus on the trojans feed (banktrojan).
 Peaks at ~4k events/s
{"_origin":"banktrojan","env":{"server_name":"anam0rph.su","remote_addr":"46.247.141.66","path_info":"/in.php","request_method":"POST","http_user_agent":"Mozilla/4.0"},"data":"upqchCg4slzHEexq0JyNLlaDqX40GsCoA3Out1Ah3HaVsQj45YCqGKylXf2Pv81M9JX0","seen":1379956636,"trojanfamily":"Zeus","_provider":"lab","hostn":"lab14","_ts":1379956641}
Challenge
 Let's use the Stream to help
 Group by machine and trojan
 From peak ~4k/s to peak ~1k/s
 Filter fields.
 Geo location
 We end up with
{"env":{"remote_addr":"207.215.48.83"},"trojanfamily":"W32Expiro","_geo_env_remote_addr":{"country_code":"US","country_name":"United States","city":"Los Angeles","latitude":34.0067,"longitude":-118.3455,"asn":7132,"asn_name":"AS for SBIS-AS"}}
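Grouping by machine and trojan over a short window is what drops the rate from ~4k/s to ~1k/s. A minimal sketch of that de-duplication, assuming the event fields shown above (the helper itself is our illustration of the Stream's grouping, not its actual implementation):

```javascript
// Sketch: de-duplicate (remote_addr, trojanfamily) pairs within a time window.
// Events carry a "seen" unix timestamp in seconds, as in the sample above.
function dedupeWindow(events, windowMs) {
  const lastSeen = new Map(); // key -> last emitted timestamp (ms)
  const out = [];
  for (const ev of events) {
    const key = ev.env.remote_addr + "|" + ev.trojanfamily;
    const ts = ev.seen * 1000;
    const prev = lastSeen.get(key);
    if (prev === undefined || ts - prev >= windowMs) {
      lastSeen.set(key, ts); // start a new window for this machine+trojan
      out.push(ev);
    }
  }
  return out;
}
```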
Challenge
 How to process and store these events?
Technologies
 Applications
 NodeJS
 Server-side JavaScript platform.
 V8 JavaScript engine.
 http://nodejs.org/
Why?
 Great for prototyping.
 Fast and scalable.
 Modules for (almost) everything.
Technologies
 Databases
 MongoDB
 NoSQL Database.
 Stores JSON-style documents.
 GridFS
 http://www.mongodb.org/
Why?
 JSON from the
Stream, JSON in the
database.
 Fast and scalable.
 Redis
 Key-value storage.
 In-memory dataset.
 http://redis.io/
Why?
 Faster than MongoDB for certain operations, like keeping track of the number of infected machines.
 Very fast and scalable.
Data Collection
 Applications
 Collector
 Worker
 Processor
 Databases
 MongoDB
 Redis
(Architecture diagram: Stream → Collector → Workers → MongoDB/Redis storage, with the Processor aggregating real-time information.)
Data Collection
 Events come from the Stream.
 The Collector distributes events to Workers.
 Workers persist event information.
 The Processor aggregates information and stores it for statistical and historical analysis.
Data Collection
 MongoDB
 Real-time information on infected machines.
 Historical aggregated information.
 Redis
 Real-time counters of infected machines.
Data Collection - Collector
Collector
 Old data is periodically removed, e.g. machines that don't produce events for more than 24 hours.
 Sends events to Workers.
 Decrements counters of removed information.
 Sends warnings
 Country / ASN is no longer infected.
 Botnet X decreased Y % of its size.
Data Collection - Worker
Worker
 Creates new entries for unseen machines.
 Adds information about new trojans / domains.
 Updates the last time the machine was seen.
 Processes events and updates the Redis counters accordingly.
 Needs to check MongoDB to determine if:
 New entry – all counters incremented
 Existing entry – increment only the counters related to that trojan
 Sends warnings
 Botnet X increased Y % in its size.
 New infections seen in Country / ASN.
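The Worker's branching above can be sketched with plain Maps standing in for MongoDB (machine entries) and Redis (counters). The counter key layout mirrors the "trojan", "country" and "trojan:country" counters shown later; the exact keys and which counters bump on an existing machine are our assumptions:

```javascript
// Sketch of the Worker's bookkeeping: Map "machines" plays MongoDB,
// Map "counters" plays Redis.
function processEvent(machines, counters, ev) {
  const bump = k => counters.set(k, (counters.get(k) || 0) + 1);
  let entry = machines.get(ev.ip);
  if (!entry) {
    // New entry: all counters incremented.
    entry = { trojans: new Set([ev.trojan]), last: ev.ts };
    machines.set(ev.ip, entry);
    bump(ev.trojan);
    bump(ev.country);
    bump(ev.trojan + ":" + ev.country);
  } else if (!entry.trojans.has(ev.trojan)) {
    // Existing machine, new trojan: only the trojan-related counters.
    entry.trojans.add(ev.trojan);
    bump(ev.trojan);
    bump(ev.trojan + ":" + ev.country);
  }
  entry.last = ev.ts; // always refresh last-seen
}
```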
Data Collection - Processor
Processor
 Processor retrieves real time counters from Redis.
 Information is processed by:
 Botnet;
 ASN;
 Country;
 Botnet/Country;
 Botnet/ASN/Country;
 Total.
 Persisting information to MongoDB creates a historic
database of counters that can be queried and
analyzed.
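The roll-up the Processor performs can be sketched from the flat "trojan:country" counters: one pass produces the per-botnet, per-country and total views. The key format matches the Redis dump shown later; the aggregation code itself is our illustration:

```javascript
// Sketch: roll flat "trojan:country" counters up into per-trojan,
// per-country and total views before persisting to MongoDB.
function aggregate(flat) {
  const byTrojan = {}, byCountry = {};
  let total = 0;
  for (const [key, value] of Object.entries(flat)) {
    const [trojan, country] = key.split(":");
    const n = parseInt(value, 10); // Redis returns counters as strings
    byTrojan[trojan] = (byTrojan[trojan] || 0) + n;
    byCountry[country] = (byCountry[country] || 0) + n;
    total += n;
  }
  return { byTrojan, byCountry, total };
}
```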
Data Collection - MongoDB
 Collection for active machines in the last 24h
{
"city" : "Philippine",
"country" : "PH",
"region" : "N/A",
"geo" : {
"lat" : 16.4499,
"lng" : 120.5499
},
"created" : ISODate("2013-09-21T00:19:12.227Z"),
"domains" : [
{ "domain" : "hzmksreiuojy.nl",
"trojan" : "zeus",
"last" : ISODate("2013-09-21T09:42:56.799Z"),
"created" : ISODate("2013-09-21T00:19:12.227Z") }
],
"host" : "112.202.37.72.pldt.net",
"ip" : "112.202.37.72",
"ip_numeric" : 1892296008,
"asn" : "Philippine Long Distance Telephone Company",
"asn_code" : 9299,
"last" : ISODate("2013-09-21T09:42:56.799Z"),
"trojan" : [ "zeus" ]
}
Data Collection - MongoDB
 Collection for aggregated information (the historic counters database)
{
"_id" : ObjectId("519c0abac1172e813c004ac3"),
"0" : 744,
"1" : 745,
"3" : 748,
"4" : 748,
"5" : 746,
"6" : 745,
...
"10" : 745,
"11" : 742,
"12" : 746,
"13" : 750,
"14" : 753,
...
"metadata" : {
"country" : "CH",
"date" : "2013-05-22T00:00:00+0000",
"trojan" : "conficker_b",
"type" : "daily"
}
}
Entries for each hour are preallocated when the document is created.
If we don't preallocate, MongoDB keeps extending the documents, adding thousands of entries every hour, and becomes very slow.
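The preallocation trick amounts to writing all hourly slots up front. A minimal sketch of building such a daily counters document, matching the aggregated-collection shape above (the helper name and zero-fill are our assumptions):

```javascript
// Sketch: pre-allocate one slot per hour so MongoDB never has to grow
// the document in place as hours are filled in.
function newDailyDoc(trojan, country, date) {
  const doc = { metadata: { country, date, trojan, type: "daily" } };
  for (let h = 0; h < 24; h++) doc[String(h)] = 0; // "0".."23", one per hour
  return doc;
}
```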
Data Collection - MongoDB
 Collection for 24 hours
 4 MongoDB shard instances
 >3 million infected machines
 ~2 GB of data
 ~558 bytes per document
 Indexes by
 ip – helps inserts and updates.
 ip_numeric – enables queries by CIDRs.
 last – faster removes for expired machines.
 host – hmm, is there any .gov? 
 country, family, asn – speeds up MongoDB queries and also allows faster custom queries.
 Collection for aggregated information
 Data for 119 days (25 May to 11 July)
 >18 million entries
 ~6.5 GB of data
 ~366 bytes per object
 ~56 MB per day
 Indexes by
 metadata.country
 metadata.trojan
 metadata.date
 metadata.asn
 metadata.type, metadata.country, metadata.date, ... (all)
Data Collection - Redis
 Counters by Trojan / Country
"cutwailbt:RO": "1256",
"rbot:LA": "3",
"tdss:NP": "114",
"unknown4adapt:IR": "100",
"unknownaff:EE": "0",
"cutwail:CM": "20",
"unknownhrat3:NZ": "56",
"cutwailbt:PR": "191",
"shylock:NO": "1",
"unknownpws:BO": "3",
"unknowndgaxx:CY": "77",
"fbhijack:GH": "22",
"pushbot:IE": "2",
"carufax:US": "424"
 Counters by Trojan
"unknownwindcrat": "18",
"tdss": "79530",
"unknownsu2": "2735",
"unknowndga9": "15",
"unknowndga3": "17",
"ircbot": "19874",
"jshijack": "35570",
"adware": "294341",
"zeus": "1032890",
"jadtre": "40557",
"w32almanahe": "13435",
"festi": "1412",
"qakbot": "19907",
"cutwailbt": "38308"
 Counters by Country
"BY": "11158",
"NA": "314",
"BW": "326",
"AS": "35",
"AG": "94",
"GG": "43",
"ID": "142648",
"MQ": "194",
"IQ": "16142",
"TH": "105429",
"MY": "35410",
"MA": "15278",
"BG": "15086",
"PL": "27384"
33
Data Collection - Redis
 Redis performance on our machine
 SET: 473,036.88 requests per second
 GET: 456,412.59 requests per second
 INCR: 461,787.12 requests per second
 Time to get real-time data
 Getting all the data from Families/ASN/Counters into the NodeJS application, ready to be processed, takes around half a second
 >120,000 entries in… (very fast)
 Our current usage is
 ~3% CPU (of a 2.0 GHz core)
 ~480 MB of RAM
Data Collection - API
 But! There is one more application..
 How to easily retrieve stored data
 MongoDB's REST API is a bit limited.
 NodeJS HTTP + MongoDB + Redis
 Redis
 http://<host>/counters_countries
 ...
 MongoDB
 http://<host>/family_country
 ...
 Custom MongoDB queries
 http://<host>/ips?f.ip_numeric=95.68.149.0/22
 http://<host>/ips?f.country=PT
 http://<host>/ips?f.host=bgovb
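The `f.ip_numeric=95.68.149.0/22` query above works because machines are stored with a numeric IP, so a CIDR becomes a simple range filter on the indexed `ip_numeric` field. A sketch of that translation, assuming the field names from the slides (the helpers are illustrative, not the actual API code):

```javascript
// Sketch: turn a dotted IP into the numeric form stored in ip_numeric.
function ipToNumeric(ip) {
  return ip.split(".").reduce((acc, o) => acc * 256 + parseInt(o, 10), 0);
}

// Sketch: turn a CIDR like "95.68.149.0/22" into a MongoDB range filter
// suitable for { ip_numeric: cidrToRange(cidr) }.
function cidrToRange(cidr) {
  const [ip, bitsStr] = cidr.split("/");
  const size = Math.pow(2, 32 - parseInt(bitsStr, 10)); // addresses in block
  const base = Math.floor(ipToNumeric(ip) / size) * size; // align to block start
  return { $gte: base, $lte: base + size - 1 };
}
```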
Data Collection - Limitations
 Grouping information by machine and trojan doesn't allow us to study the real number of events per machine.
 It can still be useful to get an idea of the botnet operations or how many machines are behind a single IP (everyone is behind a router).
 A slow MongoDB impacts everything
 The Worker application needs to tolerate a slow MongoDB and discard some information as a last resort.
 Beware of slow disks! Data persistence occurs every 60 seconds (default) and can take too much time, with a real impact on performance.
 >10s to persist is usually very bad; something is wrong with the hard drives.
Data Collection - Evolution
 Warnings
 Which warnings to send? When? Thresholds?
 Aggregate data by week, month, year.
 Aggregate information in shorter intervals.
 Data Mining algorithms applied to all the collected information.
 Apply the same principles to other feeds of the Stream.
 Spam
 Twitter
 Etc..
Reports
 What's happening in country X?
 What about network 192.168.0.1/24?
 Can you send me the report of Y every day at 7 am?
 Ohh!! Remember the report I asked for last week?
 Can I get a report for ASN AnubisNetwork?
38
Reports
 HTTP API (Server)
 Schedule
 Get
 Edit
 Delete
 List schedules
 List reports
 Generator
 Checks MongoDB for work.
 Generates a CSV report or stores the JSON document for later querying.
 Sends an email with a link to the files when the report is ready.
Reports – MongoDB CSVs
 Scheduled Report
{
"__v" : 0,
"_id" : ObjectId("51d64e6d5e8fd0d145000008"),
"active" : true,
"asn_code" : "",
"country" : "PT",
"desc" : "Portugal Trojans",
"emails" : "",
"range" : "",
"repeat" : true,
"reports" : [
ObjectId("51d64e7037571bd24500000d"),
ObjectId("51d741e8bcb161366600000c"),
ObjectId("51d89367bcb161366600005f"),
ObjectId("51d9e4f9bcb16136660000ca"),
ObjectId("51db3678c3a15fc577000038"),
ObjectId("51dc87e216eea97c20000007"),
ObjectId("51ddd964a89164643b000001")
],
"run_at" : ISODate("2013-07-11T22:00:00Z"),
"scheduled_date" : ISODate("2013-07-05T04:41:17.067Z")
}
 Report
{
"__v" : 0,
"_id" : ObjectId("51d89367bcb161366600005f"),
"date" : ISODate("2013-07-06T22:00:07.015Z"),
"files" : [
ObjectId("51d89368bcb1613666000060")
],
"work" : ObjectId("51d64e6d5e8fd0d145000008")
}
 Files
 Each report has an array of files that
represents the report.
 Each file is stored in GridFS.
Reports – MongoDB JSONs
 Scheduled Report
{
"__v" : 0,
"_id" : ObjectId("51d64e6d5e8fd0d145000008"),
"active" : true,
"asn_code" : "",
"country" : "PT",
"desc" : "Portugal Trojans",
"emails" : "",
"range" : "",
"repeat" : true,
"snapshots" : [
ObjectId("521f761c0a45c3b00b000001"),
ObjectId("521fb0848275044d420d392f"),
ObjectId("52207c2f7c53a8494f010afa"),
ObjectId("5221c9df4910ba3874000001"),
ObjectId("522275724910ba3874001f66"),
ObjectId("5223c6f24910ba3874003b7a"),
ObjectId("522518734910ba3874005763")
],
"run_at" : ISODate("2013-07-11T22:00:00Z"),
"scheduled_date" : ISODate("2013-07-05T04:41:17.067Z")
}
 Snapshot
{
"_id" : ObjectId("51d89367bcb161366600005f"),
"date" : ISODate("2013-07-06T22:00:07.015Z"),
"work" : ObjectId("521f761c0a45c3b00b000001"),
"count" : 123
}
 Results
{
"machine" : {
"trojan" : [ "conficker_b" ],
"ip" : "2.80.2.53",
"host" : "Bl19-1-13.dsl.telepac.pt"
}, …
, "metadata" : {
"work" : ObjectId("521f837647b8d3ba7d000001"),
"snapshot" : ObjectId("521f837aa669d0b87d000001"),
"date" : ISODate("2013-08-29T00:00:00Z")
}
}
Reports – Evolution
 Other report formats.
 Charts?
 Other types of reports (not only botnets).
 Need to evolve the Collector first.
Globe
 How to visualize real time events from the stream?
 Where are the botnets located?
 Who's the most infected?
 How many infections?
Globe – Stream
 origin = banktrojan
 Modules
 Group
 trojanfamily
 _geo_env_remote_addr.country_name
 grouptime=5000
 Geo
 Filter fields
 trojanfamily
 Geolocation
 _geo_env_remote_addr.l*
 KPI
 trojanfamily
 _geo_env_remote_addr.country_name
 kpilimit = 10
 Requests botnets from the Stream (Stream → NodeJS → Browser).
Globe – NodeJS
 NodeJS
 HTTP
 Get JSON from Stream.
 Socket.IO
 Multiple protocol support (to bypass some proxies and handle old browsers).
 Redis
 Get real time number of infected machines.
Globe – Browser
 Browser
 Socket.IO Client
 Real time apps.
 Websockets and other
types of transport.
 WebGL
 ThreeJS
 Tween
 jQuery
 WebWorkers
 Run in the background.
 Where to place the red dots?
 Calculations from geolocation to 3D point go here.
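The geolocation-to-3D-point math the WebWorker performs can be sketched as the standard lat/lng-to-sphere conversion. The radius and axis conventions here are assumptions (a ThreeJS-style Y-up sphere), not the exact production code:

```javascript
// Sketch: map a geolocated event onto a sphere of the given radius,
// so a red dot can be placed on the WebGL globe.
function latLngToVec3(lat, lng, radius) {
  const phi = (90 - lat) * Math.PI / 180;   // polar angle from the north pole
  const theta = (lng + 180) * Math.PI / 180; // azimuth around the Y axis
  return {
    x: -radius * Math.sin(phi) * Math.cos(theta),
    y:  radius * Math.cos(phi),
    z:  radius * Math.sin(phi) * Math.sin(theta),
  };
}
```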
Globe – Evolution
 Some kind of HUD for better interaction and notifications.
 Request actions by clicking on the globe.
 Generate a report of infected machines in that area.
 Request operations in that specific area.
 Real-time warnings
 New infections
 Other types of warnings...
Adding Valuable Information to
Stream Events
 How to distribute workload to other machines?
 Adding value to the information we already have.
Minions
 Typically, the operations that would add value are expensive in terms of resources
 CPU
 Bandwidth
 A master-slave approach distributes work among distributed slaves we call Minions.
Minions
 The Master receives work from Requesters and stores it in MongoDB.
 Minions request work.
 Requesters receive real-time information on the work from the Master, or they can ask for work information at a later time.
Minions
 The Master has an API that allows custom Requesters to submit work and monitor it.
 Minions have a modular architecture
 Easily create a custom module (e.g. DNS, scanning, data mining).
 Information received from the Minions can then be processed by the Requesters and
 Sent to the Stream
 Saved in the database
 Used to update an existing database
Extras...
 So what else could we possibly do using the Stream?
 Distributed Portscanning
 Distributed DNS Resolutions
 Transmit images
 Transmit videos
 Realtime tools
 Data agnostic. Throw stuff at it and it will deal with it.
52
Portscanning
 Portscanning done right…
 It's not only about your portscanner being able to throw 1 billion packets per second.
 Location = reliability of scans.
 A distributed system for portscanning is much better. But it's not just about having it distributed; it's about optimizing what it scans.
Portscanning

Hosts up, per target range and scanning location:

Target range              Australia     China              Russia    USA        Portugal
                          (intervolve)  (ChinaVPShosting)  (NQHost)  (Ramnode)  (Zon PT)
41.63.160.0/19 (Angola)   0             0                  0         0          3 (sometimes)
5.1.96.0/21 (China)       10            70                 40        10         40
41.78.72.0/22 (Somalia)   0             0                  0         0          33
92.102.229.0/24 (Russia)  20            100                2         2          150
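"Optimizing what it scans" can mean routing each target range to the vantage point that actually sees it. A sketch of that selection, with the hosts-up measurements transcribed from the table above (the selection logic is ours, not the production scheduler):

```javascript
// Sketch: pick the scanning location that saw the most hosts up
// for each target range.
function bestVantage(results) {
  const best = {};
  for (const [range, byLoc] of Object.entries(results)) {
    let top = null;
    for (const [loc, up] of Object.entries(byLoc)) {
      if (top === null || up > byLoc[top]) top = loc;
    }
    best[range] = top;
  }
  return best;
}

// Measurements from the table above (hosts up per location).
const scanResults = {
  "41.63.160.0/19":  { Australia: 0,  China: 0,   Russia: 0,  USA: 0,  Portugal: 3 },
  "5.1.96.0/21":     { Australia: 10, China: 70,  Russia: 40, USA: 10, Portugal: 40 },
  "41.78.72.0/22":   { Australia: 0,  China: 0,   Russia: 0,  USA: 0,  Portugal: 33 },
  "92.102.229.0/24": { Australia: 20, China: 100, Russia: 2,  USA: 2,  Portugal: 150 },
};
```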
Portscanning problems...
 Doing portscanning correctly brings along certain problems.
 If you are not HD Moore or Dan Kaminsky, resource-wise you are gonna have a bad time
Portscanning problems...
 You need lots of minions in different parts of the world
 Doesn't actually require an amazing CPU or RAM if you do it correctly.
 Storing all that data...
 Querying that data...
Is it possible to have a cheap, distributed portscanning
system?
Data
Internet status...
If we're doing it... anyone else can.
Evil side?
Anubis StreamForce
 Have cool ideas? Contact us
 Access for Brucon participants:
API Endpoint:
http://brucon.cyberfeed.net:8080/stream?key=brucon2013
 Web UI Dashboard maker:
http://brucon.cyberfeed.net:8080/webgui
Lol
 Last minute testing
Questions?
Strata Presentation: One Billion Objects in 2GB: Big Data Analytics on Small ...randyguck
 
Introduction to WSO2 Data Analytics Platform
Introduction to  WSO2 Data Analytics PlatformIntroduction to  WSO2 Data Analytics Platform
Introduction to WSO2 Data Analytics PlatformSrinath Perera
 
Intelligent Monitoring
Intelligent MonitoringIntelligent Monitoring
Intelligent MonitoringIntelie
 
Exceeding Classical: Probabilistic Data Structures in Data Intensive Applicat...
Exceeding Classical: Probabilistic Data Structures in Data Intensive Applicat...Exceeding Classical: Probabilistic Data Structures in Data Intensive Applicat...
Exceeding Classical: Probabilistic Data Structures in Data Intensive Applicat...Andrii Gakhov
 
The hidden engineering behind machine learning products at Helixa
The hidden engineering behind machine learning products at HelixaThe hidden engineering behind machine learning products at Helixa
The hidden engineering behind machine learning products at HelixaAlluxio, Inc.
 
MongoDB Solution for Internet of Things and Big Data
MongoDB Solution for Internet of Things and Big DataMongoDB Solution for Internet of Things and Big Data
MongoDB Solution for Internet of Things and Big DataStefano Dindo
 
Lab pratico per la progettazione di soluzioni MongoDB in ambito Internet of T...
Lab pratico per la progettazione di soluzioni MongoDB in ambito Internet of T...Lab pratico per la progettazione di soluzioni MongoDB in ambito Internet of T...
Lab pratico per la progettazione di soluzioni MongoDB in ambito Internet of T...festival ICT 2016
 
Aggregated queries with Druid on terrabytes and petabytes of data
Aggregated queries with Druid on terrabytes and petabytes of dataAggregated queries with Druid on terrabytes and petabytes of data
Aggregated queries with Druid on terrabytes and petabytes of dataRostislav Pashuto
 

Similar a Real time analysis and visualization of security events (20)

Real-time big data analytics based on product recommendations case study
Real-time big data analytics based on product recommendations case studyReal-time big data analytics based on product recommendations case study
Real-time big data analytics based on product recommendations case study
 
MongoDB Europe 2016 - Debugging MongoDB Performance
MongoDB Europe 2016 - Debugging MongoDB PerformanceMongoDB Europe 2016 - Debugging MongoDB Performance
MongoDB Europe 2016 - Debugging MongoDB Performance
 
Building and Scaling the Internet of Things with MongoDB at Vivint
Building and Scaling the Internet of Things with MongoDB at Vivint Building and Scaling the Internet of Things with MongoDB at Vivint
Building and Scaling the Internet of Things with MongoDB at Vivint
 
MongoDB Europe 2016 - Enabling the Internet of Things at Proximus - Belgium's...
MongoDB Europe 2016 - Enabling the Internet of Things at Proximus - Belgium's...MongoDB Europe 2016 - Enabling the Internet of Things at Proximus - Belgium's...
MongoDB Europe 2016 - Enabling the Internet of Things at Proximus - Belgium's...
 
Slicing Apples with Ninja Sword: Fighting Malware at the Corporate Level (OWA...
Slicing Apples with Ninja Sword: Fighting Malware at the Corporate Level (OWA...Slicing Apples with Ninja Sword: Fighting Malware at the Corporate Level (OWA...
Slicing Apples with Ninja Sword: Fighting Malware at the Corporate Level (OWA...
 
Internet of things
Internet of thingsInternet of things
Internet of things
 
Spark Summit - Stratio Streaming
Spark Summit - Stratio Streaming Spark Summit - Stratio Streaming
Spark Summit - Stratio Streaming
 
Introduction to Streaming Analytics
Introduction to Streaming AnalyticsIntroduction to Streaming Analytics
Introduction to Streaming Analytics
 
IOOF IT System Modernisation
IOOF IT System ModernisationIOOF IT System Modernisation
IOOF IT System Modernisation
 
Implementing and Visualizing Clickstream data with MongoDB
Implementing and Visualizing Clickstream data with MongoDBImplementing and Visualizing Clickstream data with MongoDB
Implementing and Visualizing Clickstream data with MongoDB
 
Practice Fusion & MongoDB: Transitioning a 4 TB Audit Log from SQL Server to ...
Practice Fusion & MongoDB: Transitioning a 4 TB Audit Log from SQL Server to ...Practice Fusion & MongoDB: Transitioning a 4 TB Audit Log from SQL Server to ...
Practice Fusion & MongoDB: Transitioning a 4 TB Audit Log from SQL Server to ...
 
MongoDB for Time Series Data
MongoDB for Time Series DataMongoDB for Time Series Data
MongoDB for Time Series Data
 
Strata Presentation: One Billion Objects in 2GB: Big Data Analytics on Small ...
Strata Presentation: One Billion Objects in 2GB: Big Data Analytics on Small ...Strata Presentation: One Billion Objects in 2GB: Big Data Analytics on Small ...
Strata Presentation: One Billion Objects in 2GB: Big Data Analytics on Small ...
 
Introduction to WSO2 Data Analytics Platform
Introduction to  WSO2 Data Analytics PlatformIntroduction to  WSO2 Data Analytics Platform
Introduction to WSO2 Data Analytics Platform
 
Intelligent Monitoring
Intelligent MonitoringIntelligent Monitoring
Intelligent Monitoring
 
Exceeding Classical: Probabilistic Data Structures in Data Intensive Applicat...
Exceeding Classical: Probabilistic Data Structures in Data Intensive Applicat...Exceeding Classical: Probabilistic Data Structures in Data Intensive Applicat...
Exceeding Classical: Probabilistic Data Structures in Data Intensive Applicat...
 
The hidden engineering behind machine learning products at Helixa
The hidden engineering behind machine learning products at HelixaThe hidden engineering behind machine learning products at Helixa
The hidden engineering behind machine learning products at Helixa
 
MongoDB Solution for Internet of Things and Big Data
MongoDB Solution for Internet of Things and Big DataMongoDB Solution for Internet of Things and Big Data
MongoDB Solution for Internet of Things and Big Data
 
Lab pratico per la progettazione di soluzioni MongoDB in ambito Internet of T...
Lab pratico per la progettazione di soluzioni MongoDB in ambito Internet of T...Lab pratico per la progettazione di soluzioni MongoDB in ambito Internet of T...
Lab pratico per la progettazione di soluzioni MongoDB in ambito Internet of T...
 
Aggregated queries with Druid on terrabytes and petabytes of data
Aggregated queries with Druid on terrabytes and petabytes of dataAggregated queries with Druid on terrabytes and petabytes of data
Aggregated queries with Druid on terrabytes and petabytes of data
 

Más de Tiago Henriques

BSides Lisbon 2023 - AI in Cybersecurity.pdf
BSides Lisbon 2023 - AI in Cybersecurity.pdfBSides Lisbon 2023 - AI in Cybersecurity.pdf
BSides Lisbon 2023 - AI in Cybersecurity.pdfTiago Henriques
 
Pixels Camp 2017 - Stories from the trenches of building a data architecture
Pixels Camp 2017 - Stories from the trenches of building a data architecturePixels Camp 2017 - Stories from the trenches of building a data architecture
Pixels Camp 2017 - Stories from the trenches of building a data architectureTiago Henriques
 
Pixels Camp 2017 - Stranger Things the internet version
Pixels Camp 2017 - Stranger Things the internet versionPixels Camp 2017 - Stranger Things the internet version
Pixels Camp 2017 - Stranger Things the internet versionTiago Henriques
 
The state of cybersecurity in Switzerland - FinTechDay 2017
The state of cybersecurity in Switzerland - FinTechDay 2017The state of cybersecurity in Switzerland - FinTechDay 2017
The state of cybersecurity in Switzerland - FinTechDay 2017Tiago Henriques
 
Webzurich - The State of Web Security in Switzerland
Webzurich - The State of Web Security in SwitzerlandWebzurich - The State of Web Security in Switzerland
Webzurich - The State of Web Security in SwitzerlandTiago Henriques
 
BSides Lisbon - Data science, machine learning and cybersecurity
BSides Lisbon - Data science, machine learning and cybersecurity BSides Lisbon - Data science, machine learning and cybersecurity
BSides Lisbon - Data science, machine learning and cybersecurity Tiago Henriques
 
I FOR ONE WELCOME OUR NEW CYBER OVERLORDS! AN INTRODUCTION TO THE USE OF MACH...
I FOR ONE WELCOME OUR NEW CYBER OVERLORDS! AN INTRODUCTION TO THE USE OF MACH...I FOR ONE WELCOME OUR NEW CYBER OVERLORDS! AN INTRODUCTION TO THE USE OF MACH...
I FOR ONE WELCOME OUR NEW CYBER OVERLORDS! AN INTRODUCTION TO THE USE OF MACH...Tiago Henriques
 
BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015
BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015
BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015Tiago Henriques
 
Codebits 2014 - Secure Coding - Gamification and automation for the win
Codebits 2014 - Secure Coding - Gamification and automation for the winCodebits 2014 - Secure Coding - Gamification and automation for the win
Codebits 2014 - Secure Coding - Gamification and automation for the winTiago Henriques
 
Confraria 28-feb-2013 mesa redonda
Confraria 28-feb-2013 mesa redondaConfraria 28-feb-2013 mesa redonda
Confraria 28-feb-2013 mesa redondaTiago Henriques
 
How to dominate a country
How to dominate a countryHow to dominate a country
How to dominate a countryTiago Henriques
 
Country domination - Causing chaos and wrecking havoc
Country domination - Causing chaos and wrecking havocCountry domination - Causing chaos and wrecking havoc
Country domination - Causing chaos and wrecking havocTiago Henriques
 
(Mis)trusting and (ab)using ssh
(Mis)trusting and (ab)using ssh(Mis)trusting and (ab)using ssh
(Mis)trusting and (ab)using sshTiago Henriques
 
Secure coding - Balgan - Tiago Henriques
Secure coding - Balgan - Tiago HenriquesSecure coding - Balgan - Tiago Henriques
Secure coding - Balgan - Tiago HenriquesTiago Henriques
 
Vulnerability, exploit to metasploit
Vulnerability, exploit to metasploitVulnerability, exploit to metasploit
Vulnerability, exploit to metasploitTiago Henriques
 
Practical exploitation and social engineering
Practical exploitation and social engineeringPractical exploitation and social engineering
Practical exploitation and social engineeringTiago Henriques
 

Más de Tiago Henriques (20)

BSides Lisbon 2023 - AI in Cybersecurity.pdf
BSides Lisbon 2023 - AI in Cybersecurity.pdfBSides Lisbon 2023 - AI in Cybersecurity.pdf
BSides Lisbon 2023 - AI in Cybersecurity.pdf
 
Pixels Camp 2017 - Stories from the trenches of building a data architecture
Pixels Camp 2017 - Stories from the trenches of building a data architecturePixels Camp 2017 - Stories from the trenches of building a data architecture
Pixels Camp 2017 - Stories from the trenches of building a data architecture
 
Pixels Camp 2017 - Stranger Things the internet version
Pixels Camp 2017 - Stranger Things the internet versionPixels Camp 2017 - Stranger Things the internet version
Pixels Camp 2017 - Stranger Things the internet version
 
The state of cybersecurity in Switzerland - FinTechDay 2017
The state of cybersecurity in Switzerland - FinTechDay 2017The state of cybersecurity in Switzerland - FinTechDay 2017
The state of cybersecurity in Switzerland - FinTechDay 2017
 
Webzurich - The State of Web Security in Switzerland
Webzurich - The State of Web Security in SwitzerlandWebzurich - The State of Web Security in Switzerland
Webzurich - The State of Web Security in Switzerland
 
BSides Lisbon - Data science, machine learning and cybersecurity
BSides Lisbon - Data science, machine learning and cybersecurity BSides Lisbon - Data science, machine learning and cybersecurity
BSides Lisbon - Data science, machine learning and cybersecurity
 
I FOR ONE WELCOME OUR NEW CYBER OVERLORDS! AN INTRODUCTION TO THE USE OF MACH...
I FOR ONE WELCOME OUR NEW CYBER OVERLORDS! AN INTRODUCTION TO THE USE OF MACH...I FOR ONE WELCOME OUR NEW CYBER OVERLORDS! AN INTRODUCTION TO THE USE OF MACH...
I FOR ONE WELCOME OUR NEW CYBER OVERLORDS! AN INTRODUCTION TO THE USE OF MACH...
 
BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015
BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015
BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015
 
Codebits 2014 - Secure Coding - Gamification and automation for the win
Codebits 2014 - Secure Coding - Gamification and automation for the winCodebits 2014 - Secure Coding - Gamification and automation for the win
Codebits 2014 - Secure Coding - Gamification and automation for the win
 
Hardware hacking 101
Hardware hacking 101Hardware hacking 101
Hardware hacking 101
 
Workshop
WorkshopWorkshop
Workshop
 
Enei
EneiEnei
Enei
 
Confraria 28-feb-2013 mesa redonda
Confraria 28-feb-2013 mesa redondaConfraria 28-feb-2013 mesa redonda
Confraria 28-feb-2013 mesa redonda
 
Preso fcul
Preso fculPreso fcul
Preso fcul
 
How to dominate a country
How to dominate a countryHow to dominate a country
How to dominate a country
 
Country domination - Causing chaos and wrecking havoc
Country domination - Causing chaos and wrecking havocCountry domination - Causing chaos and wrecking havoc
Country domination - Causing chaos and wrecking havoc
 
(Mis)trusting and (ab)using ssh
(Mis)trusting and (ab)using ssh(Mis)trusting and (ab)using ssh
(Mis)trusting and (ab)using ssh
 
Secure coding - Balgan - Tiago Henriques
Secure coding - Balgan - Tiago HenriquesSecure coding - Balgan - Tiago Henriques
Secure coding - Balgan - Tiago Henriques
 
Vulnerability, exploit to metasploit
Vulnerability, exploit to metasploitVulnerability, exploit to metasploit
Vulnerability, exploit to metasploit
 
Practical exploitation and social engineering
Practical exploitation and social engineeringPractical exploitation and social engineering
Practical exploitation and social engineering
 

Último

Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 

Último (20)

Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 

Real time analysis and visualization of security events

  • 13. Anubis CyberFeed
 Feed galore! Sinkhole data, traps, IP reputation, etc.
 Bespoke feeds (create your own view)
 Measure, group, correlate, de-duplicate...
 High volume (usually ~6,000 events per second), with more data added frequently
  • 14. [Architecture diagram: sources (sinkholes, data-theft trojans, traps / honeypots, passive DNS, IP reputation, Twitter, MailSpike / open-source community) feed the Complex Event Processing engine, which drives real-time feeds, dashboards, and event navigation]
  • 15. Anubis CyberFeed
 Apps (demo time)
  • 16. Stream Information Processing
 Collecting events from the Stream.
 Generating reports.
 Real time visualization.
  • 17. Challenge
 ~6k events/s, with peaks over 10k events/s.
 Let's focus on the trojans feed (banktrojan).
 Peaks at ~4k events/s.
{"_origin":"banktrojan","env":{"server_name":"anam0rph.su","remote_addr":"46.247.141.66","path_info":"/in.php","request_method":"POST","http_user_agent":"Mozilla/4.0"},"data":"upqchCg4slzHEexq0JyNLlaDqX40GsCoA3Out1Ah3HaVsQj45YCqGKylXf2Pv81M9JX0","seen":1379956636,"trojanfamily":"Zeus","_provider":"lab","hostn":"lab14","_ts":1379956641}
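A minimal NodeJS sketch of handling one such event. The field names come from the banktrojan sample above; the assumption that the stream delivers one JSON object per line is mine:

```javascript
// Each stream message is one JSON object (as in the banktrojan sample).
// This sketch parses a raw line and extracts the fields a collector
// would care about: feed origin, infected IP, trojan family, timestamp.
const rawLine =
  '{"_origin":"banktrojan",' +
  '"env":{"server_name":"anam0rph.su","remote_addr":"46.247.141.66",' +
  '"path_info":"/in.php","request_method":"POST","http_user_agent":"Mozilla/4.0"},' +
  '"data":"upqchCg4slzHEexq0JyNLlaDqX40GsCoA3Out1Ah3HaVsQj45YCqGKylXf2Pv81M9JX0",' +
  '"seen":1379956636,"trojanfamily":"Zeus","_provider":"lab","hostn":"lab14","_ts":1379956641}';

function parseEvent(line) {
  const evt = JSON.parse(line);
  return {
    feed: evt._origin,
    ip: evt.env && evt.env.remote_addr,
    family: evt.trojanfamily,
    seen: new Date(evt.seen * 1000), // "seen" is a UNIX timestamp in seconds
  };
}

const parsed = parseEvent(rawLine);
console.log(parsed.feed, parsed.family, parsed.ip);
```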
  • 20. Challenge
 Let's use the Stream to help
 Group by machine and trojan
 From peak ~4k/s to peak ~1k/s
 Filter fields
 Geo location
 We end up with:
{"env":{"remote_addr":"207.215.48.83"},"trojanfamily":"W32Expiro","_geo_env_remote_addr":{"country_code":"US","country_name":"United States","city":"Los Angeles","latitude":34.0067,"longitude":-118.3455,"asn":7132,"asn_name":"AS for SBIS-AS"}}
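The grouping step above can be sketched as a small in-memory aggregation. The key choice — machine IP plus trojan family — follows the slide; treating a batch as one time window is an assumption:

```javascript
// Collapse a burst of raw events into one record per (machine, trojan)
// pair — the de-duplication that takes the feed from ~4k/s to ~1k/s.
function groupEvents(events) {
  const groups = new Map();
  for (const evt of events) {
    const key = evt.env.remote_addr + "|" + evt.trojanfamily;
    const entry = groups.get(key) || { ...evt, count: 0 };
    entry.count += 1; // duplicate hits seen in this window
    groups.set(key, entry);
  }
  return [...groups.values()];
}

const burst = [
  { env: { remote_addr: "207.215.48.83" }, trojanfamily: "W32Expiro" },
  { env: { remote_addr: "207.215.48.83" }, trojanfamily: "W32Expiro" },
  { env: { remote_addr: "46.247.141.66" }, trojanfamily: "Zeus" },
];
const grouped = groupEvents(burst);
console.log(grouped.length); // 2 unique (machine, trojan) pairs
```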
  • 21. Challenge
 How to process and store these events?
  • 22. Technologies
 Applications
 NodeJS
 Server-side Javascript platform.
 V8 Javascript Engine.
 http://nodejs.org/
Why?
 Great for prototyping.
 Fast and scalable.
 Modules for (almost) everything.
  • 23. Technologies
 Databases
 MongoDB
 NoSQL database.
 Stores JSON-style documents.
 GridFS
 http://www.mongodb.org/
Why?
 JSON from the Stream, JSON in the database.
 Fast and scalable.
 Redis
 Key-value storage.
 In-memory dataset.
 http://redis.io/
Why?
 Faster than MongoDB for certain operations, like keeping track of the number of infected machines.
 Very fast and scalable.
  • 24. Data Collection
 Applications
 Collector
 Worker
 Processor
 Databases
 MongoDB
 Redis
[Diagram: Stream → Collector → Workers → MongoDB / Redis storage; Processor aggregates information and processes real-time events]
  • 25. Data Collection
 Events come from the Stream.
 The Collector distributes events to Workers.
 Workers persist event information.
 The Processor aggregates information and stores it for statistical and historical analysis.
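The Collector's fan-out to Workers can be sketched as follows. The slides don't say how events are balanced across Workers, so plain round-robin is an assumption:

```javascript
// A Collector that hands each incoming event to the next Worker in
// turn. Real Workers would be separate processes; here they are stub
// functions that just count what they receive.
function makeCollector(workers) {
  let next = 0;
  return function dispatch(evt) {
    const worker = workers[next];
    next = (next + 1) % workers.length; // round-robin (assumed strategy)
    worker(evt);
  };
}

const received = [0, 0, 0];
const workers = received.map((_, i) => () => { received[i] += 1; });
const dispatch = makeCollector(workers);
for (let n = 0; n < 6; n++) dispatch({ n });
console.log(received); // each of the 3 workers got 2 events
```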
  • 26. Data Collection
 MongoDB
 Real time information on infected machines.
 Historical aggregated information.
 Redis
 Real time counters of infected machines.
  • 27. Data Collection - Collector
 Old data is periodically removed, i.e. machines that don't produce events for more than 24 hours.
 Sends events to Workers.
 Decrements counters of removed information.
 Sends warnings:
 Country / ASN is no longer infected.
 Botnet X decreased Y % of its size.
  • 28. Data Collection - Worker
 Creates new entries for unseen machines.
 Adds information about new trojans / domains.
 Updates the last time the machine was seen.
 Processes events and updates the Redis counters accordingly.
 Needs to check MongoDB to determine if:
 New entry – all counters incremented.
 Existing entry – only the counters related to that trojan incremented.
 Sends warnings:
 Botnet X increased Y % in size.
 New infections seen in Country / ASN.
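The Worker's new-vs-existing counter rule can be sketched like this. Redis is simulated with a plain object, and the counter key names are illustrative, not the deck's actual schema:

```javascript
// A brand-new machine bumps every relevant counter; a machine already
// tracked in MongoDB only bumps counters for a newly seen trojan;
// a pure duplicate changes nothing.
const counters = {};
const incr = (key) => { counters[key] = (counters[key] || 0) + 1; };

function handleEvent(evt, status) {
  // status: "new-machine" | "new-trojan" | "duplicate"
  if (status === "duplicate") return;
  incr(evt.family);                 // per-trojan counter
  incr(evt.family + ":" + evt.country); // per-trojan/country counter
  if (status === "new-machine") {
    incr(evt.country);              // whole-machine counters, once only
    incr("total");
  }
}

handleEvent({ family: "zeus", country: "PH" }, "new-machine");
handleEvent({ family: "tdss", country: "PH" }, "new-trojan");
handleEvent({ family: "zeus", country: "PH" }, "duplicate");
console.log(counters); // total/PH stay at 1; zeus and tdss at 1 each
```

Against a real Redis, each `incr` would be an `INCR` (or `HINCRBY`) on the corresponding key.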
  • 29. Data Collection - Processor
 The Processor retrieves real time counters from Redis.
 Information is processed by:
 Botnet;
 ASN;
 Country;
 Botnet/Country;
 Botnet/ASN/Country;
 Total.
 Persisting this information to MongoDB creates a historic database of counters that can be queried and analyzed.
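A sketch of the six roll-up dimensions listed above. The key format (`"zeus:9299:PH"` etc.) is an assumption for illustration:

```javascript
// For each (botnet, asn, country) counter pulled from Redis, the
// Processor rolls the value up along six dimensions before persisting
// the aggregates to MongoDB.
function rollupKeys(botnet, asn, country) {
  return [
    botnet,                               // by botnet
    "asn:" + asn,                         // by ASN
    "cc:" + country,                      // by country
    botnet + ":" + country,               // botnet/country
    botnet + ":" + asn + ":" + country,   // botnet/ASN/country
    "total",                              // grand total
  ];
}

const keys = rollupKeys("zeus", 9299, "PH");
console.log(keys); // 6 aggregation keys for one counter
```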
  • 30. Data Collection - MongoDB
 Collection for active machines in the last 24h:
{
  "city" : "Philippine",
  "country" : "PH",
  "region" : "N/A",
  "geo" : { "lat" : 16.4499, "lng" : 120.5499 },
  "created" : ISODate("2013-09-21T00:19:12.227Z"),
  "domains" : [
    {
      "domain" : "hzmksreiuojy.nl",
      "trojan" : "zeus",
      "last" : ISODate("2013-09-21T09:42:56.799Z"),
      "created" : ISODate("2013-09-21T00:19:12.227Z")
    }
  ],
  "host" : "112.202.37.72.pldt.net",
  "ip" : "112.202.37.72",
  "ip_numeric" : 1892296008,
  "asn" : "Philippine Long Distance Telephone Company",
  "asn_code" : 9299,
  "last" : ISODate("2013-09-21T09:42:56.799Z"),
  "trojan" : [ "zeus" ]
}
  • 31. Data Collection - MongoDB  Collection for aggregated information (the historic counters database) { "_id" : ObjectId("519c0abac1172e813c004ac3"), "0" : 744, "1" : 745, "3" : 748, "4" : 748, "5" : 746, "6" : 745, ... "10" : 745, "11" : 742, "12" : 746, "13" : 750, "14" : 753, ... "metadata" : { "country" : "CH", "date" : "2013-05-22T00:00:00+0000", "trojan" : "conficker_b", "type" : "daily" } } 31 Preallocated entries for each hour when the document is created. If we don’t, MongoDB will keep extending the documents by adding thousands of entries every hour and it becomes very slow.
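The preallocation described above can be sketched like this (field names assumed to mirror the sample document): create each day's counter document with all 24 hourly slots already present, so later updates overwrite a field in place instead of growing the document on disk.

```javascript
// Sketch: build a daily counters document with hours "0".."23"
// preallocated to zero, plus the metadata identifying the series.
function newDailyDoc(metadata) {
  const doc = { metadata: Object.assign({ type: "daily" }, metadata) };
  for (let hour = 0; hour < 24; hour++) doc[String(hour)] = 0;
  return doc;
}

// e.g. newDailyDoc({ country: "CH", trojan: "conficker_b",
//                    date: "2013-05-22T00:00:00+0000" })
```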
  • 32. Data Collection - MongoDB  Collection for 24 hours  4 MongoDB Shard instances  >3 Million infected machines  ~2 Gb of data  ~558 bytes per document.  Indexes by  ip – helps inserts and updates.  ip_numeric – enables queries by CIDRs.  last – Faster removes for expired machines.  host – Hmm, is there any .gov?   country, family, asn – Speeds MongoDB queries and also allows faster custom queries.  Collection for aggregated information  Data for 119 days (25 May to 11 July)  > 18 Million entries  ~6.5 Gb of data  ~366 bytes per object  ~56 Mb per day  Indexes by  metadata.country  metadata.trojan  metadata.date  metadata.asn  metadata.type + metadata.country + metadata.date + … (compound index on all metadata fields) 32
  • 33. Data Collection - Redis  Counters by Trojan / Country "cutwailbt:RO": "1256", "rbot:LA": "3", "tdss:NP": "114", "unknown4adapt:IR": "100", "unknownaff:EE": "0", "cutwail:CM": "20", "unknownhrat3:NZ": "56", "cutwailbt:PR": "191", "shylock:NO": "1", "unknownpws:BO": "3", "unknowndgaxx:CY": "77", "fbhijack:GH": "22", "pushbot:IE": "2", "carufax:US": "424"  Counters by Trojan "unknownwindcrat": "18", "tdss": "79530", "unknownsu2": "2735", "unknowndga9": "15", "unknowndga3": "17", "ircbot": "19874", "jshijack": "35570", "adware": "294341", "zeus": "1032890", "jadtre": "40557", "w32almanahe": "13435", "festi": "1412", "qakbot": "19907", "cutwailbt": "38308"  Counters by Country "BY": "11158", "NA": "314", "BW": "326", "AS": "35", "AG": "94", "GG": "43", "ID": "142648", "MQ": "194", "IQ": "16142", "TH": "105429", "MY": "35410", "MA": "15278", "BG": "15086", "PL": "27384" 33
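The three counter families above share one key scheme: "&lt;trojan&gt;", "&lt;country&gt;", and "&lt;trojan&gt;:&lt;country&gt;". A sketch of how a Worker would bump all three granularities for one new infection, using a Map as a stand-in for Redis INCR (with real Redis this is simply three INCR commands, e.g. INCR zeus, INCR PT, INCR zeus:PT):

```javascript
// Increment the trojan, country, and trojan:country counters
// for a single newly seen infection.
function incrInfection(counters, trojan, country) {
  for (const key of [trojan, country, `${trojan}:${country}`]) {
    counters.set(key, (counters.get(key) || 0) + 1);
  }
}
```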
  • 34. Data Collection - Redis  Redis performance on our machine  SET: 473036.88 requests per second  GET: 456412.59 requests per second  INCR: 461787.12 requests per second  Time to get real time data  Getting all the data from Families/ASN/Counters to the NodeJS application and ready to be processed in around half a second  > 120 000 entries in… (very fast..)  Our current usage is  ~ 3% CPU (of a 2.0 Ghz core)  ~ 480 Mb of RAM 34
  • 35. Data Collection - API  But! There is one more application..  How to easily retrieve stored data  MongoDB Rest API is a bit limited.  NodeJS HTTP + MongoDB + Redis  Redis  http://<host>/counters_countries  ...  MongoDB  http://<host>/family_country  ...  Custom MongoDB Queries  http://<host>/ips?f.ip_numeric=95.68.149.0/22  http://<host>/ips?f.country=PT  http://<host>/ips?f.host=bgovb 35
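A sketch of how the f.ip_numeric=95.68.149.0/22 style filter can be turned into a numeric range suitable for a MongoDB query on the ip_numeric field (names are illustrative, not the actual API code):

```javascript
// Convert "a.b.c.d/bits" into the inclusive numeric range it covers.
function cidrToRange(cidr) {
  const [base, bitsStr] = cidr.split("/");
  const bits = Number(bitsStr);
  const baseNum = base.split(".")
                      .reduce((acc, o) => acc * 256 + Number(o), 0);
  const size = 2 ** (32 - bits);
  const lo = Math.floor(baseNum / size) * size; // align to block start
  return { lo, hi: lo + size - 1 };
}

// Resulting MongoDB filter: { ip_numeric: { $gte: lo, $lte: hi } }
```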
  • 36. Data Collection - Limitations  Grouping information by machine and trojan doesn't allow us to study the real number of events per machine.  That data can be useful to get an idea of the botnet operations or how many machines are behind a single IP (everyone is behind a router).  A slow MongoDB impacts everything  The Worker application needs to tolerate a slow MongoDB and discard some information as a last resort.  Beware of slow disks! Data persistence occurs every 60 seconds (default) and can take too much time, having a real impact on performance..  >10s to persist is usually very bad; something is wrong with the hard drives.. 36
  • 37. Data Collection - Evolution  Warnings  Which warnings to send? When? Thresholds?  Aggregate data by week, month, year.  Aggregate information in shorter intervals.  Data Mining algorithms applied to all the collected information.  Apply same principles to other feeds of the Stream.  Spam  Twitter  Etc.. 37
  • 38. Reports  What's happening in country X?  What about network 192.168.0.1/24?  Can you send me the report of Y every day at 7 am?  Ohh!! Remember the report I asked for last week?  Can I get a report for ASN AnubisNetwork? 38
  • 39. Reports 39  HTTP API  Schedule  Get  Edit  Delete  List schedules  List reports  Check MongoDB for work.  Generate CSV report or store the JSON Document for later querying.  Send email with link to files when report is ready. Server Generator
  • 40. Reports – MongoDB CSVs  Scheduled Report { "__v" : 0, "_id" : ObjectId("51d64e6d5e8fd0d145000008"), "active" : true, "asn_code" : "", "country" : "PT", "desc" : "Portugal Trojans", "emails" : "", "range" : "", "repeat" : true, "reports" : [ ObjectId("51d64e7037571bd24500000d"), ObjectId("51d741e8bcb161366600000c"), ObjectId("51d89367bcb161366600005f"), ObjectId("51d9e4f9bcb16136660000ca"), ObjectId("51db3678c3a15fc577000038"), ObjectId("51dc87e216eea97c20000007"), ObjectId("51ddd964a89164643b000001") ], "run_at" : ISODate("2013-07-11T22:00:00Z"), "scheduled_date" : ISODate("2013-07-05T04:41:17.067Z") }  Report { "__v" : 0, "_id" : ObjectId("51d89367bcb161366600005f"), "date" : ISODate("2013-07-06T22:00:07.015Z"), "files" : [ ObjectId("51d89368bcb1613666000060") ], "work" : ObjectId("51d64e6d5e8fd0d145000008") }  Files  Each report has an array of files that represents the report.  Each file is stored in GridFS. 40
  • 41. Reports – MongoDB JSONs  Scheduled Report { "__v" : 0, "_id" : ObjectId("51d64e6d5e8fd0d145000008"), "active" : true, "asn_code" : "", "country" : "PT", "desc" : "Portugal Trojans", "emails" : "", "range" : "", "repeat" : true, "snapshots" : [ ObjectId("521f761c0a45c3b00b000001"), ObjectId("521fb0848275044d420d392f"), ObjectId("52207c2f7c53a8494f010afa"), ObjectId("5221c9df4910ba3874000001"), ObjectId("522275724910ba3874001f66"), ObjectId("5223c6f24910ba3874003b7a"), ObjectId("522518734910ba3874005763") ], "run_at" : ISODate("2013-07-11T22:00:00Z"), "scheduled_date" : ISODate("2013-07-05T04:41:17.067Z") }  Snapshot { "_id" : ObjectId("51d89367bcb161366600005f"), "date" : ISODate("2013-07-06T22:00:07.015Z"), "work" : ObjectId("521f761c0a45c3b00b000001"), count: 123 }  Results { "machine" : { "trojan" : [ "conficker_b" ], "ip" : "2.80.2.53", "host" : "Bl19-1-13.dsl.telepac.pt" }, … , "metadata" : { "work" : ObjectId("521f837647b8d3ba7d000001"), "snapshot" : ObjectId("521f837aa669d0b87d000001"), "date" : ISODate("2013-08-29T00:00:00Z") } } 41
  • 42. Reports – Evolution  Other report formats.  Charts?  Other types of reports (not only botnets).  Need to evolve the Collector first. 42
  • 43. Globe  How to visualize real time events from the stream?  Where are the botnets located?  Who's the most infected?  How many infections? 43
  • 44. Globe – Stream  origin = banktrojan  Modules  Group  trojanfamily  _geo_env_remote_addr.country_name  grouptime=5000  Geo  Filter fields  trojanfamily  Geolocation  _geo_env_remote_addr.l*  KPI  trojanfamily  _geo_env_remote_addr.country_name  kpilimit = 10 44 Stream NodeJS Browser  Request botnets from stream
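The Group module behaviour above can be sketched as a time-window de-duplicator: the first event for a given (trojanfamily, country) pair passes, and repeats inside the window are dropped (grouptime=5000 meaning a 5-second window). Field names in the sketch are illustrative, not the Stream's actual module API.

```javascript
// Build a de-duplicating filter over a sliding time window.
function makeGrouper(windowMs) {
  const lastSeen = new Map(); // key -> timestamp of last emitted event
  return function pass(event, now) {
    const key = `${event.trojanfamily}:${event.country}`;
    const prev = lastSeen.get(key);
    if (prev !== undefined && now - prev < windowMs) return false; // duplicate
    lastSeen.set(key, now);
    return true; // first event in this window, let it through
  };
}
```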
  • 45. Globe – NodeJS 45 Stream NodeJS Browser  NodeJS  HTTP  Get JSON from Stream.  Socket.IO  Multiple protocol support (to bypass some proxies and handle old browsers).  Redis  Get real time number of infected machines.
  • 46. Globe – Browser 46 Stream NodeJS Browser  Browser  Socket.IO Client  Real time apps.  Websockets and other types of transport.  WebGL  ThreeJS  Tween  jQuery  WebWorkers  Runs in the background.  Where to place the red dots?  Calculations from geolocation to 3D point go here.
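A sketch of the WebWorker calculation mentioned above: mapping a geolocation to a point on a sphere of radius r. This is the standard spherical-to-Cartesian conversion; the exact axis conventions vary between ThreeJS scenes, so treat the signs as an assumption.

```javascript
// Convert latitude/longitude (degrees) to an {x, y, z} point on a
// sphere of radius r, y pointing to the north pole.
function latLngToVec3(lat, lng, r) {
  const phi = (90 - lat) * Math.PI / 180;    // polar angle from the pole
  const theta = (lng + 180) * Math.PI / 180; // azimuth
  return {
    x: -r * Math.sin(phi) * Math.cos(theta),
    y:  r * Math.cos(phi),
    z:  r * Math.sin(phi) * Math.sin(theta),
  };
}
```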
  • 47. Globe – Evolution  Some kind of HUD to get better interaction and notifications.  Request actions by clicking on the globe.  Generate a report of infected machines in that area.  Request operations in that specific area.  Real time warnings  New Infections  Other types of warnings... 47
  • 48. Adding Valuable Information to Stream Events  How to distribute workload to other machines?  Adding value to the information we already have. 48
  • 49. Minions  Typically the operations that add value are expensive in terms of resources  CPU  Bandwidth  Master-slave approach that distributes work among distributed slaves we call Minions. 49 Master Minion Minion Minion Minion
  • 50. Minions 50  Master receives work from Requesters and stores the work in MongoDB.  Minions request work.  Requesters receive real time information on the work from the Master or they can ask for work information at a later time. Process / Storage Minions Master MongoDB DNS Scan Minion Minion Requesters Minion
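The pull model in the diagram can be sketched with a minimal in-memory stand-in for the MongoDB-backed queue (class and method names are illustrative): Requesters push work to the Master, Minions poll for it and report results back.

```javascript
// Minimal sketch of the Master's work queue.
class Master {
  constructor() { this.pending = []; this.results = []; }
  addWork(work)   { this.pending.push(work); }     // called by a Requester
  requestWork()   { return this.pending.shift(); } // a Minion polls for work
  submitResult(r) { this.results.push(r); }        // the Minion reports back
}

const master = new Master();
master.addWork({ module: "dns", target: "hzmksreiuojy.nl" });
const job = master.requestWork();       // a Minion picks the job up
master.submitResult({ job, ips: [] });  // ...and returns its result
```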
  • 51. Minions  The Master has an API that allows custom Requesters to ask for work and monitor the work.  Minions have a modular architecture  Easily create a custom module.  Information received from the Minions can then be processed by the Requesters and  Sent to the Stream  Saved in the database  Used to update an existing database 51 Minion DNS Scanning Data Mining
  • 52. Extras...  So what else could we possibly do using the Stream?  Distributed Portscanning  Distributed DNS Resolutions  Transmit images  Transmit videos  Realtime tools  Data agnostic. Throw stuff at it and it will deal with it. 52
  • 53. Extras...  So what else could we possibly do using the Stream?  Distributed Portscanning  Distributed DNS Resolutions  Transmit images  Transmit videos  Realtime tools  Data agnostic. Throw stuff at it and it will deal with it. 53 FOCUS FOCUS
  • 54. Portscanning  Portscanning done right…  It's not only about your portscanner being able to throw 1 billion packets per second.  Location = reliability of scans.  A distributed system for portscanning is much better. But it's not just about having it distributed. It's about optimizing what it scans. 54
  • 58. Portscanning  Hosts up per target range, by scan origin – Australia (intervolve) / China (ChinaVPShosting) / Russia (NQHost) / USA (Ramnode) / Portugal (Zon PT):
    41.63.160.0/19 (Angola): 0 / 0 / 0 / 0 / 3 (sometimes)
    5.1.96.0/21 (China): 10 / 70 / 40 / 10 / 40
    41.78.72.0/22 (Somalia): 0 / 0 / 0 / 0 / 33
    92.102.229.0/24 (Russia): 20 / 100 / 2 / 2 / 150 58
  • 59. Portscanning problems...  Doing portscanning correctly brings along certain problems.  If you are not HD Moore or Dan Kaminsky, resource-wise you are gonna have a bad time 59
  • 61. Portscanning problems...  Doing portscanning correctly brings along certain problems.  If you are not HD Moore or Dan Kaminsky, resource-wise you are gonna have a bad time  You need lots of minions in different parts of the world  It doesn't actually require an amazing CPU or RAM if you do it correctly.  Storing all that data...  Querying that data... Is it possible to have a cheap, distributed portscanning system? 61
  • 68. If we're doing it... Anyone else can. Evil side? 68
  • 69. Anubis StreamForce  Have cool ideas? Contact us  Access for Brucon participants: API Endpoint: http://brucon.cyberfeed.net:8080/stream?key=brucon2013  Web UI Dashboard maker: http://brucon.cyberfeed.net:8080/webgui 69
  • 70. Lol  Last minute testing 70

Editor's notes

  1. Internet scale. Devices, systems, firewalls, ids..
  9. Hi, I'm going to present the next section of the presentation. So, how can we collect events from the Stream? What information can we gather from those events? How can we access those events in real time?
  10. The challenge here is the large number of events per second; in total we currently have over 6000 events per second, and 4000 of these events are from a single feed called banktrojans, which is basically formed by infected machines. This is what an event from one of those machines looks like.
  11. So, basically this is what we see..
  12. And this is what we want. We want to know where our targets are, where to look.
  13. Infected machines are usually noisy and they tend to produce a big number of events. We can use the stream to help us: the group module groups events that occur within 4 minutes of each other and originate from the same machine and trojan, so we can go from 4000 to 1000 events per second; basically we receive an event for a machine and trojan, and the next events are not received because they are considered duplicates. Then we have the filter module to keep only the fields we need; for example, we only care about the IP address, ASN, Trojan, C&C domain and geolocation of the machine. How do we process and store these 1000 events per second?
  15. First, some technical information about the technologies we use. For application development, we use NodeJS, a server-side JavaScript platform built on top of the V8 engine. It's fast, scalable and has modules for almost everything. For data storage, MongoDB is a NoSQL database that is fast and scalable. It can also store JSON-style documents and files in GridFS. And then we have Redis, a key-value store that is very fast and also scalable.
  17. This is an overview of the Data Collection. We built 3 applications: Collector, Worker and Processor. We have the events coming from the Stream to the Collector. The Collector then distributes the workload to workers that process and store the information in MongoDB and Redis. The Processor will then gather information from Redis and store it in MongoDB for statistical and historical analysis.
  18. Events come from the Stream to the Collector. The Collector then distributes the workload to workers that process and store the information in MongoDB and Redis. The Processor will then gather information from Redis and store it in MongoDB for statistical and historical analysis.
  20. So the Collector talks to these 3 components. It maintains the information in MongoDB, removing information about machines that don't produce events for more than 24 hours. It decrements counters in Redis, and while maintaining this information it is possible to send warnings. Workers receive events from the Collector and can run on any machine with a connection to the collector and the database.
  21. The Worker processes and stores the event in MongoDB, creating new entries or updating information about new trojans in existing entries. It also updates the last time we saw an event for that machine. While updating MongoDB the Worker also needs to maintain the Redis counters, incrementing the values for new entries or updating counters for a new trojan on an already-seen machine. While performing this task it can also determine if there is a warning to be sent.
  22. The last component is Processor. It retrieves real time counters from Redis, processes and stores them in MongoDB aggregated by Botnet, ASN, Country, etc. This information can then be analysed and queried.
  23. Let’s now check the Databases. MongoDB collection that stores information of active machines in the last 24 hours, looks like this. It’s a JSON document with information about geolocation, IP address, Trojans, last time seen, etc. There is also a numerical representation of the IP Address that helps to query for specific network ranges.
  24. The aggregated information collection holds documents with this format. The metadata field holds information about the specific document, its type and the origin of the information, in this case country and trojan. It has an entry per hour with the number of infections; these entries need to be preallocated with zeros, so every day a new document is created for a specific metadata with all the hours at 0. If we don't do this there will be a lot of document extensions in MongoDB and it will become very slow.
  25. Some more information about these collections. The 24 hours collection is sharded between 4 MongoDB instances and in July it had information on over 3 million infected machines, which only takes 2 Gb of disk to store. The aggregated information, collected for 119 days, had over 18 million entries and occupied around 6.5 Gb of data; that's around 56 Mb per day. These were the indexes created. We need to be very careful with these because they speed up reads but slow down writes. We want fast writes for the 24 hours collection and for that reason we need to keep the indexes optimized; only the IP index is built in the foreground, all the others are built in the background. For the aggregated information collection we don't need to be as careful; we can add the indexes that will allow us to perform faster queries.
  26. Let's look at the Redis information. The counters look like this: they are concatenations of strings separated by colons, for example "cutwailbt:RO".
  27. Redis is very fast; we can retrieve all the information from the biggest set in around half a second. The insertion of data is also very fast, while using very few resources of the machine.
  28. There was also the need to access all this information on demand, so an API was created that allows retrieving or querying information in both Redis and MongoDB.
  29. So, there are a couple of limitations with these approaches. By grouping events in order to reduce the amount of events per second we are discarding information that could be studied in order to better understand what is behind those machines, for example, the number of events of a machine with a specific botnet could indicate how many machines are on that network (everyone has a router nowadays).Also MongoDB can impact everything, it is fast but needs to be used carefully. We need 3 MongoDB shards to keep the performance on acceptable levels. If we start getting 2 or 3 times the events we currently have the Workers won’t be able to persist all that information in time and will have to start discarding it at some point. The alternative to discard is to add more shards. You need to constantly monitor your hard drives, if the performance decreases, bad things will happen. Mongo won’t be able to persist the information in time and will start to slow down everything.
  30. How can we evolve this solution? We can send more warnings with the information we have, but when? What thresholds should we use? We only aggregate information by hour and day; what about weeks, months, years? What about shorter intervals? We can also apply data mining algorithms in order to retrieve more information from the data we already collect. And of course, apply these principles to other feeds like Spam or Twitter.
  31. So how do we extract information about a specific network or country? What about what happened last week?
  32. Of course we used NodeJS and built 2 applications: one that is used as an API to access and request reports, and another that checks the database for requests, generates the reports and stores them. The reports are saved in CSV or in JSON format for later querying. They are also sent by email with a URL to download the files.
  33. The collections that hold the CSV reports look like this. They have a scheduled-work collection that keeps a record of the report it is generating and the reports it has already generated. The reports keep an array of files generated and saved in MongoDB's file storage, called GridFS.
  34. Then we have the JSONs reports, that we call snapshots. The main differences are the count field in the snapshot that holds the number of infected machines in that snapshot, and the results for that snapshot which include the information about the machine and the metadata that identifies the origin of that entry.We could store an array of results in the Snapshot collection but it would be hard to use it because it would have too many entries, possibly millions and would just be useless.
  35. How could we evolve the Reports? We can store reports in other formats, generate charts with specific information for a report, and start storing other types of reports, not just for botnets.
  36. So, how can we visualize real-time events? Let's focus on the botnets again: it would be awesome if we could see the distribution of botnets throughout the world, receive warnings and monitor other information in real time. For that purpose, there is a shiny globe (demo). We can see in real time when infected machines produce events, monitor a top of countries most infected with a specific Trojan, the number of events being generated every second and the total number of infections.
  37. This information comes from the Stream; we group it by Trojan and country. We don't really want to send ALL the events to the browser because some browsers would just crash.. For that reason we also filter only the geolocation and trojan family. The information about the top infected comes from a KPI module that dynamically calculates the top in the stream.
  38. Between the Stream and the Browser we have a NodeJS application that controls the flow of events to the browser, discarding events if too many are received, and relaying the information to the Browser using the socket.io module. We also need to get the total number of infected machines from the Redis counters.
  39. At the browser end we use the socket.io client to receive the events, process those events using WebWorkers (calculation of where to place the dots) and render everything using WebGL.
  40. We can evolve the globe to create a more interactive experience where we could perform actions in realtime through the globe.We can also show warnings in the globe, for example, about new infections.
  41. How can we add valuable information to the information we already have?
  42. Typically the operations that add value are expensive: they need CPU and bandwidth. So we needed a master-slave approach that distributes the work among multiple slaves, which we call Minions.
  43. Masters receive work from the Requesters and store that work in MongoDB. Minions will then request work and send the work results to the Master. The Master then sends updates directly to the Requester of the work and also stores the results in MongoDB.
  44. The Master has an API that allows custom Requesters to ask for work and monitor the work results received from the Minions. The Minion application was built with a modular architecture in mind, so it is very easy to create a custom module. Information received by the minions can then be injected into the stream or stored in a database.
  45. Getting the full picture of an infected machine or a network involves lots of steps: Sinkholing that botnet; Portscanning the target, which gives you an idea of whether the machine is connected directly to the internet or behind a gateway, whether there are shares available, and how this machine could possibly have been compromised (MS08-067?); DNS analysis.
  46. We are going to focus on:PortscanningDNS resolutionsRealtime demos
  47. It's really cool to have a super fast scanner in a lab giving 1 quadrillion packets per second. However, this is the wrong way. The correct way: slow scan, geo-distributed. Scanning Angola from Australia = 60% of services time out and look closed. Scanning the USA from Russia or vice versa = unreliable.
  48. Combining a Model B Raspberry Pi with the PwnPi distro and a custom set of scripts makes it a Minion: a cheap device that we can use to do distributed scanning, and we can even ask others to deploy it and contribute to our system. In the near future we intend to make this image available for others that want to contribute to our system.