Performing network security analytics
1. Performing Network & Security Analytics with Hadoop
Travis Dawson
Director of Product Management Narus, Inc
Hadoop Summit 2012
2. Agenda
§ Who am I, What do I do
§ What is Network & Security Analytics
§ Using Hadoop in Network & Security Analytics
§ What becomes possible with Big Data Analytics
§ Putting it all together
§ Lessons Learned
Narus
|
2
3. Who am I, What do I do
§ Geek
§ Director of Product Management, Narus Inc
– Narus Inc, a wholly owned subsidiary of Boeing
– Build High Performance Network Intelligence Systems
– I herd cats and make PowerPoints all day
– Occasionally think about product requirements
§ Principal Member Technical Staff, Sprint
– Sprint Advanced Technology Labs
– Wireline/Wireless Network Architecture, Design, Security
– I broke stuff
4. What is Network & Security Analytics
A type of voodoo, but with computers
The (black) art of finding malicious or problematic
sessions in a mountain of network traffic
§ Multiple approaches
– Signatures/Blacklists
– Behavior
– Algorithmic
– Ouija Board, Live Chicken, Full Moon, etc.
§ Single Goal
– Identify malicious or problematic traffic before it causes
substantial harm to your network or your assets.
5. Network & Security Analytics
What’s working against you
The enemy is ever-changing and infinitely intelligent
§ New attack vectors are more difficult to detect than ever
– Polymorphic, Randomized
– APTs are real
– Zero-Days
– Protocol, Application, OS
§ Traditional Methods are ineffective
– Payloads ever changing
– Simply too many new and existing
§ Higher link speeds make deeper analysis harder
– A 10 Gb/s link maxes out at ~15M packets per second (minimum-size packets)
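The ~15M figure can be checked with a little arithmetic. The sketch below assumes standard Ethernet framing (64-byte minimum frame, 8 bytes of preamble and start-of-frame delimiter, 12 bytes of inter-frame gap); the deck does not state which framing it assumed, and the function names are illustrative only.

```python
# Worst-case packet rate on an Ethernet link, i.e. every frame is minimum size.
# Framing constants are standard Ethernet values, assumed here for illustration.
MIN_FRAME_BYTES = 64        # minimum Ethernet frame, including the FCS
PREAMBLE_SFD_BYTES = 8      # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12   # mandatory idle time between frames


def max_packets_per_second(link_bits_per_second: float) -> float:
    """Upper bound on packets per second when every frame is minimum size."""
    bits_on_wire = (MIN_FRAME_BYTES + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bits_per_second / bits_on_wire


def per_packet_budget_ns(link_bits_per_second: float) -> float:
    """Time available to inspect one packet, in nanoseconds, at that rate."""
    return 1e9 / max_packets_per_second(link_bits_per_second)


if __name__ == "__main__":
    print(f"{max_packets_per_second(10e9) / 1e6:.2f} Mpps")
    print(f"{per_packet_budget_ns(10e9):.1f} ns per packet")
```

At 10 Gb/s this works out to roughly 14.9M packets per second, which leaves about 67 ns of processing time per packet for any inline analysis.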
6. What is Network & Security Analytics
Finding the Needle in a stack of Needles
§ Where to look
– Which stack of Needles do I need to look at
§ What are you looking for
– Do you know?
– Are you guessing?
– Do you know what you are NOT looking for?
§ How to find something that is not ‘right’
– What is ‘right’, what is ‘not-right’, what is ‘wrong’?
– What is the difference?
– What is ‘normal’ vs what is ‘right’?
– How much data do you need?
7. What is Network & Security Analytics
Solving the Network & Security Analytics Problem
Multiple Methods, Multiple Algorithms, Multiple
Passes Per Analytic
§ You need a lot of data to determine what is ‘not-right’
– More data == More accurate results
§ You need to run sophisticated algorithms across the data
– Use new algorithms to find something ‘not-right’
– Not always easy
§ You need multiple passes on the data
– One Algorithm feeds the next Algorithm
– Focus on the workflow, how an analyst would work.
8. Breaking out of the SQL Prison
A quick rant
§ SQL has been around since the 1970s
– So have I!
– Great for solving ‘known’ problems
§ Unable to perform the deep analytics required
– No combination of SELECT, JOIN, UDF will get you what you
need at times
– Unstructured data is a nightmare and now more common
§ However, use of one tool does not mean you can’t use
another tool as well
– SQL and Hadoop can live very happily together
– The right tool for the right job, or more precisely:
• The right tool for the right PART of the job
9. Network & Security Analytic
Using Hadoop to solve the hard problems
§ Amount of Data
– 1 week to 1 month+ of data: hundreds of billions of sessions, hundreds
of TB of data, ingesting dozens of data types and millions of
sessions per hour
§ Algorithms
– Looking for sessions that look something like this thing or maybe
unlike this other thing. You can do that right???
§ Unstructured
– We have no idea what we are going to get in terms of
information
§ Price per Analytic Hour
– How much does it cost to run this analytic in a set amount of
time
10. Network & Security Analytic
A Simple Workflow Example
Find a Polymorphic BotNet/Worm infection vector
§ Find the suspected infected hosts
– Clustering/Behavior/Signatures to find possible bots and worms
§ Find the Command & Control
– From list of suspects, who are the most popular ‘servers’
§ Find ALL of the possible infections
– From C&C servers, what hosts were communicated with
– Cluster and group similar hosts to find even more
§ Find the Infection Vector
– From all the suspect hosts, cluster hosts by common Application
‘features’ and traffic patterns
You need a LOT of data and it’s non-deterministic
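The two middle steps of this workflow (find the Command & Control, then find all possible infections) can be sketched in a few lines. This is a minimal in-memory illustration, not the Hadoop jobs described in the talk: the `Session` record, the function names, and the sample addresses are all hypothetical, and the clustering steps (finding the initial suspects and the infection vector) are left out.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Set


class Session(NamedTuple):
    """Hypothetical, heavily simplified session record."""
    src: str  # host that initiated the session
    dst: str  # host that was contacted


def find_cc_candidates(sessions: Iterable[Session], suspects: Set[str],
                       top_n: int = 1) -> List[str]:
    """Rank 'servers' by how many distinct suspect hosts contacted them."""
    contacts = {(s.src, s.dst) for s in sessions if s.src in suspects}
    popularity = Counter(dst for _, dst in contacts)
    return [dst for dst, _ in popularity.most_common(top_n)]


def expand_infections(sessions: Iterable[Session], cc_servers: Iterable[str]) -> Set[str]:
    """Return every host that exchanged traffic with a C&C candidate."""
    cc = set(cc_servers)
    hosts = set()
    for s in sessions:
        if s.dst in cc:
            hosts.add(s.src)
        if s.src in cc:
            hosts.add(s.dst)
    return hosts


if __name__ == "__main__":
    sessions = [
        Session("10.0.0.1", "203.0.113.9"),
        Session("10.0.0.2", "203.0.113.9"),
        Session("10.0.0.2", "198.51.100.7"),
        Session("10.0.0.3", "203.0.113.9"),
        Session("10.0.0.4", "198.51.100.7"),
    ]
    suspects = {"10.0.0.1", "10.0.0.2"}  # output of the first step, assumed given
    cc = find_cc_candidates(sessions, suspects)
    print(cc, sorted(expand_infections(sessions, cc)))
```

Each step consumes the previous step's output, which is why the full workflow needs multiple passes over the data. In the sample data, host 10.0.0.3 was not an initial suspect but surfaces once the C&C candidate is known.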
11. Network & Security Analytic
Workflow details
What Makes This Work
§ Hadoop Tools/Methods Used
– Entropy, FFT, Behavior Jobs
– Mahout (Clustering and Machine Learning)
– Custom Clustering (Hourglass Co-Clustering)
– Custom Correlation
§ Other Tools Used
– Streaming Classification/Statistics Engine
– RDBMS
– Visualization Front End
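As an illustration of what an entropy job might compute, the sketch below calculates the Shannon entropy of the destination ports each source host contacts. The deck does not say which fields the entropy jobs ran over, so the choice of destination ports, the record layout, and the function names are assumptions; the grouping step stands in for the shuffle phase of a MapReduce job.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Hashable, Iterable, Tuple


def shannon_entropy(values: Iterable[Hashable]) -> float:
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # sum of p * log2(1/p), written with counts to avoid a negative zero
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def entropy_per_host(records: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """records are (src_host, dst_port) pairs; returns entropy per source host."""
    grouped = defaultdict(list)  # plays the role of the shuffle/group-by-key step
    for host, port in records:
        grouped[host].append(port)
    return {host: shannon_entropy(ports) for host, ports in grouped.items()}


if __name__ == "__main__":
    records = [("10.0.0.1", 80)] * 4 + [("10.0.0.2", p) for p in (22, 80, 443, 8080)]
    print(entropy_per_host(records))
```

A host that always contacts the same port scores 0 bits, while one that spreads its sessions evenly over four ports scores 2 bits; unusually high or low values are one possible signal of scanning or beaconing behavior.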
12. Network & Security Analytic
In real life
Many tools enabling each other
[Diagram: five needs, left to right: ‘I need to capture the traffic’, ‘I know what I am looking for’, ‘I don’t know what I am looking for’, ‘I need to organize the findings logically’, ‘I need to view the findings’. Labels in the diagram: Capture, Packets, Metadata, Streaming Analysis, Shallow, Deep, Hadoop Analysis, Datasets, RDBMS, Summary, Views.]
13. Lessons learned
How we learned to make it all work
§ Don’t use a hammer when you need a scalpel
– It just doesn’t work, don’t force it.
– If there is a better way of doing it, use that way
§ Hadoop does a lot of things really well
– Complicated algorithms over vast amounts of data
– Unstructured Data
§ Hadoop does some things really poorly
– Low Latency results for visualization
– Simple Statistics and some groupings
§ Use Hadoop in conjunction with other tools
– Use the best tool for the job.
– Break the job into pieces and evaluate the tools for each piece
14. Conclusion
Hadoop as a platform for Network Security Analytics
§ Hadoop has allowed us to solve problems for our
customers that were previously unsolvable in a
reasonable amount of time
§ New algorithms and analytics were made possible by
Hadoop
§ By using Hadoop in conjunction with our Streaming
Engine and an RDBMS, we were able to create a system
that performed better than just the sum of its parts.
§ We are now able to scale to larger datasets and extract
even better insights than before
§ No longer confined by any tool, we leverage the power of
Hadoop to solve many of our problems