The security industry is talking a lot about threat intelligence: external information that a company can leverage to understand where potential threats are knocking on the door and might have already penetrated the network boundaries. Conversations with many CERTs have shown that we have to stop relying on knowledge about how attacks have been conducted in the past and start 'hunting' for signs of compromises and anomalies in our own environments.
In this presentation we explore how the decade-old field of security visualization has evolved. We show how we have applied advanced analytics and visualization to create our own threat intelligence and to investigate lateral movement in a Fortune 50 company.
Visualization. Data science. No machine learning. But pretty pictures.
Here is a blog post I wrote a bit ago about the general theme of internal threat intelligence:
http://www.darkreading.com/analytics/creating-your-own-threat-intel-through-hunting-and-visualization/a/d-id/1321225?
5. Security. Analytics. Insight.5
• Products / Tools
• Firewall - Blocks traffic based on pre-defined rules
• Web Application Firewall - Monitors for signs of known malicious activity in Web traffic
• Intrusion Prevention System - Looks for ‘signs’ of known attacks in traffic and protocol violations
• Anti Virus - Looks for ‘signs’ of known attacks on the end system
• Malware Sandbox - Runs new binaries and monitors their behavior for malicious signs
• Security Information Management - Uses pre-defined rules to correlate signs from different data streams to augment intelligence
• Vulnerability Scanning - Searches for known vulnerabilities and vulnerable software
• Rely on pattern matching and signature-based knowledge from the past
• Reactive -> always behind
• Unknown and new threats -> won’t be detected
• ‘Imperfect’ patterns and rules -> cause a lot of false positives
We Are Monitoring - What is Going Wrong?
Defense Has Been Relying On Past Knowledge
6. Security. Analytics. Insight.6
Event Funnel - How We Used To Do It
Diagram labels: data, rule-based correlation, prioritization, simple statistics, attack candidates
• What rules do you write?
• Do the vendor provided rules work for you?
• How do you define a priority 10 event?
• High false positive rate!
• Unless alerts are VERY focussed
• High false negative rate!
• Do you know what you don’t know?
7. Security. Analytics. Insight.7
Then Came Threat Intelligence
• How many hits do you really get?
• You are missing most attacks
IOCs
• How do you match these efficiently against a real-time stream?
• How do you de-duplicate and normalize these feeds?
attack
candidates
70–90% of malware samples are unique to an organization.
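The two questions on this slide (matching indicators against a stream, and de-duplicating and normalizing feeds) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the feed contents, the `dest` field name, and the normalization rules (lowercasing, undoing `[.]` and `hxxp` defanging, dropping scheme, path and trailing dot). It only shows the idea of normalizing first, then using a set for de-duplication and constant-time lookups; it is not how any particular product does it.

```python
from typing import Iterable, Iterator


def normalize_indicator(raw: str) -> str:
    """Bring a domain or IP indicator into one canonical form."""
    value = raw.strip().lower()
    # Undo common "defanging" used in shared feeds (assumed conventions).
    value = value.replace("[.]", ".").replace("hxxp://", "http://")
    # Drop a URL scheme and path so only the host part remains.
    if "://" in value:
        value = value.split("://", 1)[1]
    value = value.split("/", 1)[0]
    # A trailing dot marks a fully qualified DNS name; remove it.
    return value.rstrip(".")


def build_ioc_set(feeds: Iterable[Iterable[str]]) -> frozenset:
    """Merge several feeds; the set removes duplicates after normalization."""
    return frozenset(
        normalize_indicator(entry)
        for feed in feeds
        for entry in feed
        if entry.strip()
    )


def match_stream(events: Iterable[dict], iocs: frozenset) -> Iterator[dict]:
    """Yield events whose destination is a known indicator (one set lookup each)."""
    for event in events:
        if normalize_indicator(event["dest"]) in iocs:
            yield event


# Hypothetical feeds that list the same indicator in different spellings.
feed_a = ["Evil.example.com", "198.51.100.7", ""]
feed_b = ["evil[.]example[.]com.", "hxxp://bad.example.net/path"]
iocs = build_ioc_set([feed_a, feed_b])

# Hypothetical event stream; any iterable (including a generator) works.
events = [
    {"src": "10.0.0.5", "dest": "evil.example.com"},
    {"src": "10.0.0.6", "dest": "intranet.example.org"},
    {"src": "10.0.0.7", "dest": "198.51.100.7"},
]
hits = list(match_stream(events, iocs))
```

Because `match_stream` is a generator, it can sit on top of a real-time source without buffering the whole stream. Even so, the slide's main point stands: a clean match pipeline still only finds what the feeds already know.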
8. Security. Analytics. Insight.8
Removing the Event Funnel - Hello Data Lake
Diagram labels: any data, Big Data Lake, Rules
• Storing more, and more diverse data
• Kafka and “dynamic parsing”
• Enabling large-scale processing
• Spark, SparkStreaming, Storm, Parquet
• Using “standard” data access (SQL, REST)
• Plug in any other tool!
Diagram labels: context, IOCs
This per se is not new …
9. Security. Analytics. Insight.9
Adding Interactive - Analyst Driven Exploration
Diagram labels: any data, Big Data Lake, Rules, context, IOCs
… but first we get the human in the loop …
Hunting
• interactive visualization
• analyst driven
• machine assisted
10. Security. Analytics. Insight.10
Hunting Creates Internal Threat Intelligence
Diagram labels: any data; Big Data Lake; Rules; context; IOCs; internal TI; Novel, Advanced Attacks
… then, let’s rethink our rules …
11. Security. Analytics. Insight.11
Hunting Creates Internal Threat Intelligence
Diagram labels: any data; Big Data Lake; Rules; context; IOCs; internal TI; Patterns; Novel, Advanced Attacks; Low False Positive Alerts
… then, let’s rethink our rules … patterns anyone?
12. Security. Analytics. Insight.12
Buzzword Bingo
Diagram labels: any data, Big Data Lake, Rules, context, IOCs
… and finally, we are buzzword compliant …
behavioral monitoring
scoring
anomaly detection
machine learning
artificial intelligence
“models”
data science
Diagram labels: internal TI, Patterns
13. Security. Analytics. Insight.13
How Does All That Architecture Stuff Matter?
In the following we’ll explore how this all matters …
… but first, let’s see how visualization plays a key role here.
17. Security. Analytics. Insight.17
SELECT count(distinct protocol) FROM flows;
SELECT count(distinct port) FROM flows;
SELECT count(distinct src_network) FROM flows;
SELECT count(distinct dest_network) FROM flows;
SELECT port, count(*) FROM flows GROUP BY port;
SELECT protocol,
count(CASE WHEN flows <= 200 THEN 1 END) AS [<=200],
count(CASE WHEN flows BETWEEN 201 AND 300 THEN 1 END) AS [201 - 300],
count(CASE WHEN flows BETWEEN 301 AND 350 THEN 1 END) AS [301 - 350],
count(CASE WHEN flows >= 351 THEN 1 END) AS [>=351]
FROM flows GROUP BY protocol;
SELECT port, count(distinct src_network) FROM flows GROUP BY port;
SELECT src_network, count(distinct dest_network) FROM flows GROUP BY src_network;
SELECT src_network, count(distinct dest_network) AS dn, sum(flows) FROM flows GROUP BY src_network;
SELECT port, protocol, count(*) FROM flows GROUP BY port, protocol;
SELECT sum(flows), dest_network FROM flows GROUP BY dest_network;
…
One Graph Summarizes Dozens of Queries
Graph dimensions: port, dest_network, protocol, src_network, flows
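To make the slide's point concrete, a few queries of this kind can be run against a toy table. The sketch below uses Python's built-in `sqlite3` module. The column names follow the slide, while the rows are invented sample data and the choice of SQLite is an assumption, not the setup used in the talk. Each query answers one narrow question, which is why a single multi-dimensional graph can stand in for dozens of them.

```python
import sqlite3

# In-memory database with the columns named on the slide.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flows (protocol TEXT, port INTEGER, "
    "src_network TEXT, dest_network TEXT, flows INTEGER)"
)

# Invented sample rows, for illustration only.
rows = [
    ("tcp", 443, "Office", "Internet", 320),
    ("tcp", 443, "VPN", "Internet", 150),
    ("tcp", 22, "Office", "DMZ", 40),
    ("udp", 53, "Office", "DMZ", 500),
    ("udp", 53, "DMZ", "Internet", 250),
]
conn.executemany("INSERT INTO flows VALUES (?, ?, ?, ?, ?)", rows)

# One number per question: how many distinct ports are in use?
distinct_ports = conn.execute(
    "SELECT count(DISTINCT port) FROM flows"
).fetchone()[0]

# Total flow volume per destination network.
flows_per_dest = dict(
    conn.execute(
        "SELECT dest_network, sum(flows) FROM flows GROUP BY dest_network"
    )
)

# Fan-out: how many distinct destination networks each source talks to.
fanout = dict(
    conn.execute(
        "SELECT src_network, count(DISTINCT dest_network) "
        "FROM flows GROUP BY src_network"
    )
)

print(distinct_ports, flows_per_dest, fanout)
```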
19. Security. Analytics. Insight.19
We will have a look at a couple of components from earlier:
• Context
• Data Science
• Clustering
• Seriation - Data Science Gone Wrong
• Time-series Analysis
Analytics Components
20. Security. Analytics. Insight.20
Did You Know?
Users accessing SharePoint servers
User
SharePoint Server
data processing visualization
This graph of users accessing SharePoint servers does not immediately reveal any interesting patterns.
21. Security. Analytics. Insight.21
Did You Know - How Context Tells a Story
Using HR data as context
Remote User
San Francisco Office User
Sharepoint Server
data processing visualization
HR data
Using color to add context to the graph helps immediately identify outliers and potential problems.
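The enrichment step behind such a graph can be sketched as a simple join between access records and HR data. The user names, server names, and the shape of both data sets below are hypothetical; only the two location categories come from the slide. The output is an edge list with a location attribute that a graphing tool could map to node color.

```python
from collections import defaultdict

# Hypothetical access log: (user, SharePoint server).
access_log = [
    ("alice", "sp-sf-01"),
    ("bob", "sp-sf-01"),
    ("carol", "sp-sf-01"),
    ("carol", "sp-sf-02"),
    ("dave", "sp-sf-02"),
]

# Hypothetical HR extract: user -> work location.
hr_location = {
    "alice": "San Francisco Office",
    "bob": "San Francisco Office",
    "carol": "Remote",
    "dave": "San Francisco Office",
}


def annotate(log, locations):
    """Attach the HR location to every access edge ('Unknown' if missing)."""
    return [(user, server, locations.get(user, "Unknown")) for user, server in log]


def users_by_location(edges):
    """Group users per server and location, ready to be colored in a graph."""
    grouped = defaultdict(lambda: defaultdict(set))
    for user, server, location in edges:
        grouped[server][location].add(user)
    return grouped


edges = annotate(access_log, hr_location)
grouped = users_by_location(edges)

# Servers that are reached by at least one remote user.
servers_with_remote_users = sorted(
    server for server, by_loc in grouped.items() if by_loc.get("Remote")
)
print(servers_with_remote_users)
```

The join itself is trivial. The value comes from choosing context (here, location) that turns an undifferentiated graph into one where outliers stand out.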
22. Security. Analytics. Insight.22
• Simple stuff works!
• dc(dest), dc(d_port)
• What is normal?
• Use data science / data mining to prepare data. Then visualize the output for the human analyst.
Data Science in Security - Words of Caution
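As an illustration of the "simple stuff", the sketch below reads `dc(dest)` and `dc(d_port)` as distinct counts of destinations and destination ports per source host. That reading, the tuple layout of the flow records, and the sample data are assumptions. The resulting table is the kind of prepared output that can then be handed to a visualization for the analyst.

```python
from collections import defaultdict


def distinct_counts(flows):
    """Return {src: (distinct destinations, distinct destination ports)}."""
    dests = defaultdict(set)
    ports = defaultdict(set)
    for src, dest, d_port in flows:
        dests[src].add(dest)
        ports[src].add(d_port)
    return {src: (len(dests[src]), len(ports[src])) for src in dests}


# Invented sample: 10.0.0.9 touches many ports on one host (scan-like),
# while 10.0.0.5 behaves like a typical web client.
flows = [
    ("10.0.0.5", "93.184.216.34", 443),
    ("10.0.0.5", "93.184.216.34", 443),
    ("10.0.0.5", "198.51.100.20", 80),
]
flows += [("10.0.0.9", "10.0.0.50", port) for port in range(20, 30)]

counts = distinct_counts(flows)
for src, (n_dest, n_port) in sorted(counts.items()):
    print(src, "dc(dest) =", n_dest, "dc(d_port) =", n_port)
```

What counts as "normal" for these two numbers is exactly the open question the slide raises. The sketch deliberately applies no threshold.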
23. Security. Analytics. Insight.23
Challenges With Clustering Network Traffic
The graph shows an abstract space in which colors mark machine-identified clusters.
Hard Questions:
• What are these clusters?
• Do Web servers cluster?
• What are good clusters?
• What’s anomalous?
28. Security. Analytics. Insight.28
Hunting - Ready, Fire, Aim
• Analysts are your best and most expensive resource
• They need the right tools and data
• Speed (see earlier architecture)
• Interaction (visual!)
• Machine-assisted insight
Examples
• Exploring DNS traffic
• High business impact machine analysis
• Lateral movement
31. Security. Analytics. Insight.31
We have tried many things:
• Social Network Analysis
• Seasonality detection
• Entropy over time
• Frequent pattern mining
• Clustering
All kinds of challenges.
Simple works!
Let’s Get Mathematical
Figure: U-matrix (color scale from 4.28e−05 to 0.0921)
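Of the techniques listed on this slide, entropy over time is easy to sketch. The example below computes the Shannon entropy of one field per fixed time window. The choice of field (destination port), the window size, and the sample events are assumptions for illustration; the talk does not say which fields or windows were used.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    counts = Counter(values)
    total = sum(counts.values())
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def entropy_per_window(events, window_seconds):
    """Map each window start time to the entropy of the values seen in it."""
    buckets = defaultdict(list)
    for timestamp, value in events:
        buckets[timestamp // window_seconds].append(value)
    return {
        window * window_seconds: shannon_entropy(values)
        for window, values in sorted(buckets.items())
    }


# Invented events: (timestamp in seconds, destination port).
events = [
    (0, 80), (10, 80), (20, 80), (30, 80),      # one port only
    (60, 80), (70, 443), (80, 22), (90, 53),    # four different ports
]
series = entropy_per_window(events, window_seconds=60)
print(series)
```

A sudden rise in the series means the field became more diverse in that window. Whether that is an attack or a backup job is still a question for the analyst.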
33. Security. Analytics. Insight.33
Lateral Movement - Cross Network Communications
Challenges
• Scale
• You will find one of everything
• Defining white-lists and keeping them up to date (i.e., network and asset hygiene)
Network zones in the graph: VPN, DMZ, Office, GIA, Unknown, Internet, AWS
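The white-list idea can be sketched as follows: map each address to a network zone, then report flows whose zone pair is not explicitly allowed. The address ranges, the allowed pairs, and the sample flows are hypothetical, and only three of the zone names from the slide are modeled. Anything outside the known ranges falls into "Unknown".

```python
import ipaddress

# Hypothetical zone definitions; real ones come from asset inventory.
ZONES = {
    "VPN": ipaddress.ip_network("10.8.0.0/16"),
    "DMZ": ipaddress.ip_network("192.168.100.0/24"),
    "Office": ipaddress.ip_network("10.1.0.0/16"),
}

# Hypothetical white-list of allowed (source zone, destination zone) pairs.
ALLOWED = {("Office", "DMZ"), ("VPN", "Office"), ("Office", "Office")}


def zone_of(ip):
    """Return the name of the zone containing ip, or 'Unknown'."""
    address = ipaddress.ip_address(ip)
    for name, network in ZONES.items():
        if address in network:
            return name
    return "Unknown"


def unexpected_flows(flows):
    """Return (src, dst, src_zone, dst_zone) for flows not on the white-list."""
    findings = []
    for src, dst in flows:
        pair = (zone_of(src), zone_of(dst))
        if pair not in ALLOWED:
            findings.append((src, dst) + pair)
    return findings


flows = [
    ("10.1.2.3", "192.168.100.10"),   # Office -> DMZ, allowed
    ("192.168.100.10", "10.1.2.3"),   # DMZ -> Office, not on the list
    ("10.8.0.4", "172.16.0.9"),       # VPN -> Unknown
]
findings = unexpected_flows(flows)
for finding in findings:
    print(finding)
```

At scale the hard part is not this check but the challenges named on the slide: the zone map and white-list go stale quickly, and "one of everything" shows up in the findings.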
35. Security. Analytics. Insight.35
BlackHat Workshop
Visual Analytics
Delivering Actionable Security Intelligence
July 30, 31 & August 1, 2 - Las Vegas, USA
big data | analytics | visualization
http://secviz.org
36. Security. Analytics. Insight.36
After some exploration …
Further resources:
raffael.marty@pixlcloud.com
http://slideshare.net/zrlram
http://secviz.org and @secviz