Big Data for CyberSecurity

© 2013 IBM Corporation, May 14, 2013
Anand Ranganathan
Research Staff Member, TJ Watson Research Center
<arangana@us.ibm.com>

Agenda
• Cyber Threats
• IBM Big Data Suite
• Big Data Analytics for CyberSecurity
  – Monitor network behaviors to detect known and unknown cyber-threats in enterprises
  – Detect Denial of Service attacks in large ISPs
  – Detect data leakage from organizations

Cyber-Threats Are Becoming More Sophisticated

2011: Year of the Targeted Attack
[Figure: 2011 Sampling of Security Incidents by Attack Type, Time and Impact. Timeline axis: Jan to Dec. Each circle is a breached organization labeled by sector (e.g. online gaming, central government, IT security, banking, defense, consumer electronics, entertainment, police, consulting, telecommunications, insurance). Attack types: SQL Injection, URL Tampering, Spear Phishing, 3rd Party Software, DDoS, SecureID, Trojan Software, Unknown. Size of circle estimates relative impact of breach in terms of cost to business. Conjecture of relative breach impact is based on publicly disclosed information regarding leaked records and financial losses.]
Source: IBM X-Force® Research 2011 Trend and Risk Report

2012: The explosion of breaches continues!
[Figure: 2012 Sampling of Security Incidents by Attack Type, Time and Impact. Conjecture of relative breach impact is based on publicly disclosed information regarding leaked records and financial losses.]
Source: IBM X-Force® Research 2012 Trend and Risk Report

Common Cyber Security Risks and Potential Impacts

Common Security Risks
• Denial of Service: an attack that prevents or impairs the use of networks, systems, or applications by exhausting resources
• Malware infection: a virus, worm, Trojan horse, or other code-based malicious entity that successfully infects a host
• Targeted, advanced attack, also known as an advanced persistent threat (APT), which is designed to be undetectable
• Loss or theft of technology (laptops, memory sticks, PDAs) that contains sensitive data; inadvertent disclosure of data
• Defacement: a person gains logical or physical access without permission and defaces a Web application

Potential Impacts
• Loss of Customers
• Impact to Brand
• Sensitive Data Disclosure
• Stolen Intellectual Property
• Loss of Data & Productivity
• Personal and National Security

Botnets
• A botnet is a network of compromised computers controlled by a botmaster, ranging in size from hundreds to millions of hosts
• Purpose: denial of service attacks, spam delivery, theft of credentials and data, compromise of control systems, etc.
• Hosts are infected through downloads from malicious websites, emailed executables, web, memory sticks, PDFs, …
• Bots receive updates and commands from the Command and Control (C&C) node, and these communications are becoming more sophisticated

Botnet Communication
Bots need to talk:
• Bots receive updates and commands from the C&C node
• They use a command and control structure over IRC, HTTP, SSL, Twitter, IM, or custom-built solutions
• Botnet communications are becoming more sophisticated and harder to track
  – peer-to-peer, distributed vs. hierarchical control structure
  – fast fluxing, domain name generation
[Figure: centralized C&C topology compared with P2P topology]

A Typical Threat Example
[Figure: numbered steps of a typical infection. Actors: Spammer, Mail-Client, Victim, Malicious Web server, Domain Name Server, Command & Control. Steps shown include: click on a link in the mail client, follow link, malicious Web server sends or reflects exploit code with the web page, install malware, C&C / updater domain name lookup returning the C&C / updater IP address, contact updater by IP address (C&C), remotely control malware, execute (spam, ...).]

[Figure: the typical threat example annotated with monitoring points: a) Monitor DNS, b) Monitor NetFlow, c) Monitor Port & Protocol Usage, d) Monitor Web Traffic]

Typical Solution Architecture
[Figure: data sources (DNS, NetFlow, …) feed System S, which provides the operating system, transport, and data fabric layers on heterogeneous hardware (x86 boxes, x86 blades, Cell blades, FPGA blades). Unsupervised real-time analytics and supervised learning run on top, together with dashboarding / visualization. Numbered flows: Real-time Results (Tickets, Monitoring); Collect Results + Evidence; Trends, History; Adapted Analytics Models.]
• Cybersecurity analytics
• Real-time processing of massive data streams
• Advanced data mining and trend analytics
• New and incremental model learning
Platform components: PureData System for Analytics, BigInsights

IBM Big Data Suite
[Figure: IBM Big Data Platform.
Analytic Applications: BI / Reporting, Exploration / Visualization, Functional App, Industry App, Predictive Analytics, Content Analytics.
IBM Big Data Platform: Visualization & Discovery, Application Development, Systems Management, Accelerators, Hadoop System, Stream Computing, Data Warehouse, Information Integration & Governance.]

IBM InfoSphere Streams
A Platform for Real Time Analytics on BIG Data
[Figure: traditional and non-traditional data sources feed powerful analytics with real-time delivery, at millions of events per second and microsecond latency. Example applications: algo trading, telco churn prediction, smart grid, cyber security, government / law enforcement, ICU monitoring, environment monitoring.]
• Volume: terabytes per second, petabytes per day
• Variety: all kinds of data, all kinds of analytics
• Velocity: insights in microseconds
• Agility: dynamically responsive, rapid application development

How Streams Works
• continuous ingestion
• continuous analysis
• achieve scale
  – by partitioning applications into components
  – by distributing across stream-connected hardware nodes
• infrastructure provides services for
  – scheduling analytics across h/w nodes
  – establishing streaming connectivity
  – …
• where appropriate, elements can be “fused” together for lower communication latencies
[Figure: example operators: Transform, Filter, Classify, Correlate, Annotate]
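
The idea of partitioning an application into chained operators can be sketched in plain Python. This is only an illustration and not InfoSphere Streams code: the operator names (`filter_op`, `transform_op`, `annotate_op`), the event fields, and the 20-character threshold are all assumptions made for the example. Chaining generators inside one process is loosely comparable to "fusing" operators, since tuples are handed over without network communication.

```python
from typing import Callable, Dict, Iterable, Iterator

Event = Dict[str, object]  # hypothetical tuple format: one dict per event


def filter_op(stream: Iterable[Event], predicate: Callable[[Event], bool]) -> Iterator[Event]:
    """Pass through only the events that satisfy the predicate."""
    for event in stream:
        if predicate(event):
            yield event


def transform_op(stream: Iterable[Event], fn: Callable[[Event], Event]) -> Iterator[Event]:
    """Replace each event with a transformed copy."""
    for event in stream:
        yield fn(event)


def annotate_op(stream: Iterable[Event], key: str, fn: Callable[[Event], object]) -> Iterator[Event]:
    """Add one derived attribute to each event."""
    for event in stream:
        yield {**event, key: fn(event)}


def build_pipeline(source: Iterable[Event]) -> Iterator[Event]:
    """Filter -> Transform -> Annotate, evaluated lazily one event at a time."""
    dns_only = filter_op(source, lambda e: e["dst_port"] == 53)
    normalized = transform_op(dns_only, lambda e: {**e, "name": str(e["name"]).lower()})
    # Assumed toy rule: flag names longer than 20 characters.
    return annotate_op(normalized, "long_name", lambda e: len(str(e["name"])) > 20)


events = [
    {"dst_port": 53, "name": "Example.COM"},
    {"dst_port": 80, "name": "ignored.example"},
    {"dst_port": 53, "name": "a-very-long-generated-name.example"},
]
results = list(build_pipeline(events))
```

Because each operator only consumes and produces a stream, the same components could in principle be placed on different nodes and connected by a transport layer, which is the scaling approach the slide describes.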

Security Appliances (Firewalls, IDS, IPS, SIEMs) vs Big Data

                      IBM QRadar Security Intelligence Platform    IBM Big Data Platform
Security use cases    Turnkey                                      Custom
User interface        All-in-one console                           Purpose-built applications
Data sources          450+ preconfigured (and growing)             Everything else
Data volume           100+ terabyte range                          Petabyte range
Real-time analysis    Seconds                                      Milliseconds
Analytics             Pre-built, primarily rule-based              Custom, learning
Required expertise    Average: security practitioners              Skilled: data scientists and analysts

IBM Big Data Platform here means InfoSphere BigInsights, Streams and PureData for Analytics.

Organizations have a growing need to identify and protect
against threats by building insights from broader and
larger data sets

Traditional Security Analytics
[Figure: conventional setup. Between the monitored network and the rest of the world sit DNS, DHCP, firewall, and inline IDS/IPS components. Detection is limited to signatures within individual data streams.]

Streaming Analytics
[Figure: real-time streaming analytics setup. Between the monitored network and the rest of the world (Internet) sit DNS, DHCP, firewall, and inline IDS/IPS components, which detect signatures within individual data streams. Their data, including IDS/IPS alerts, feeds Real-Time Cyber Security Analytics.]
• Real-Time Cyber Security Analytics detects behaviors by correlating across diverse and massive data streams via Analytics in Motion
• Models are learnt offline with Analytics on Data at Rest

Streaming Analytics for Fast-flux Botnets
[Figure: streaming pipeline with the following stages]
• Input: DNS response records
• Parallel FastFlux Analytics operators emit candidate names / IPs with confidence values
• An Aggregator combines the candidates into suspected fast-flux domain names and suspected fast-flux IP addresses
• Join with DNS queries (with internal querying host IP addresses)
• Join with DHCP traffic (IP to MAC to System/Owner)
• Join with host logs, IPS alerts, NetFlow, …
• Output: fast-fluxing bot alerts
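
A minimal sketch of this pipeline in Python follows. It is not the analytics used in the talk: the scoring heuristic (many distinct IP addresses per name combined with short TTLs), the thresholds, and all names such as `fastflux_candidates` and `bot_alerts` are assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Set, Tuple


class DnsResponse(NamedTuple):
    name: str  # queried domain name
    ip: str    # address returned in the answer
    ttl: int   # time to live in seconds


def fastflux_candidates(
    responses: Iterable[DnsResponse], min_ips: int = 5, max_ttl: int = 300
) -> Dict[str, float]:
    """Return {domain: confidence} for names that resolve to many distinct IPs.

    Assumed heuristic: confidence is the fraction of answers with a short TTL.
    """
    ips: Dict[str, Set[str]] = defaultdict(set)
    short_ttl: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for r in responses:
        ips[r.name].add(r.ip)
        total[r.name] += 1
        if r.ttl <= max_ttl:
            short_ttl[r.name] += 1
    return {
        name: short_ttl[name] / total[name]
        for name, addrs in ips.items()
        if len(addrs) >= min_ips
    }


def bot_alerts(
    candidates: Dict[str, float],
    queries: Iterable[Tuple[str, str]],  # (internal host IP, queried name)
    dhcp_leases: Dict[str, str],         # internal host IP -> system/owner
    min_confidence: float = 0.8,
) -> List[Tuple[str, str, float]]:
    """Join suspected names with DNS queries and DHCP data to name the hosts."""
    alerts = []
    for host_ip, name in queries:
        confidence = candidates.get(name, 0.0)
        if confidence >= min_confidence:
            alerts.append((dhcp_leases.get(host_ip, "unknown"), name, confidence))
    return alerts


responses = [DnsResponse("flux.example", f"203.0.113.{i}", 60) for i in range(6)]
responses.append(DnsResponse("www.example", "198.51.100.7", 3600))
candidates = fastflux_candidates(responses)
alerts = bot_alerts(
    candidates,
    queries=[("10.0.0.5", "flux.example"), ("10.0.0.9", "www.example")],
    dhcp_leases={"10.0.0.5": "laptop-alice"},
)
```

In a streaming deployment the dictionaries would be replaced by windowed state, and the candidate scoring would run in parallel over partitions of the DNS response stream, as the figure suggests.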

Use Case 2 - Detect Distributed Denial of Service Attacks in ISPs
• DDoS attacks are often launched by botnets to flood a target server
• Attackers often use techniques that amplify the flooding
  – E.g. DNS amplification attacks
• Very hard to detect and prevent in time
  – Need to monitor 100s of Gbps
  – Need to monitor millions of DNS requests per second
• Use InfoSphere Streams to run analytics that detect DDoS attacks
  – Look for anomalies in DNS server requests
  – Scale to Internet-level traffic rates
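
One simple way to look for anomalies in DNS request volume is to compare each interval's request count with an exponentially weighted moving average. The sketch below is an assumption about how such a check could look, not the method used in the talk; the function name `rate_anomalies` and the parameters `alpha`, `threshold`, and `warmup` are hypothetical.

```python
import math
from typing import List, Sequence


def rate_anomalies(
    counts: Sequence[float], alpha: float = 0.2, threshold: float = 4.0, warmup: int = 5
) -> List[int]:
    """Return indexes of intervals whose request count spikes above the baseline.

    The baseline is an exponentially weighted moving mean and variance.
    """
    mean = 0.0
    var = 0.0
    anomalies: List[int] = []
    for i, x in enumerate(counts):
        if i == 0:
            mean = float(x)  # seed the baseline with the first interval
            continue
        deviation = x - mean
        # Floor the standard deviation at 1.0 so a very flat baseline
        # does not turn tiny fluctuations into alerts.
        std = max(math.sqrt(var), 1.0)
        if i >= warmup and deviation > threshold * std:
            anomalies.append(i)
            continue  # keep attack traffic out of the baseline
        mean += alpha * deviation
        var = (1 - alpha) * (var + alpha * deviation * deviation)
    return anomalies


# Requests per interval seen by one DNS server (made-up numbers).
counts = [100, 102, 98, 101, 99, 100, 103, 97, 5000, 100]
spikes = rate_anomalies(counts)
```

The state per server is two floating-point numbers, so this kind of check can be kept per DNS server or per client prefix even at high request rates.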

Use Case 3 - Detect Data-Leakage from organizations
• Determine what information employees (or bots) are sending out of the company
  – Look at all the information flowing out of the company to the outside world
  – Determine whether it contains any confidential or sensitive information
• Monitor what information employees (or bots) are seeing or accessing
  – Determine whether they are accessing sensitive information (even if they have the rights to access it)
  – Determine whether their access patterns are suddenly changing
    • E.g. an employee who suddenly accesses much more information than they (or others in the same role) typically access may intend to sell this information outside or leave the company
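
A sudden change in access patterns can be approximated by comparing today's access count for each user with that user's own history. This is a hedged sketch under assumed inputs (daily document-access counts per user); the function `unusual_access` and the parameters `factor` and `min_days` are hypothetical, and a comparison against peers in the same role, which the slide also mentions, is not included.

```python
from statistics import mean, pstdev
from typing import Dict, List


def unusual_access(
    history: Dict[str, List[int]],
    today: Dict[str, int],
    factor: float = 3.0,
    min_days: int = 5,
) -> List[str]:
    """Flag users whose count today far exceeds their own historical baseline."""
    flagged = []
    for user, counts in history.items():
        if len(counts) < min_days:
            continue  # not enough history to form a baseline
        baseline = mean(counts)
        # Floor the spread at 1.0 so very regular users are not over-flagged.
        spread = max(pstdev(counts), 1.0)
        if today.get(user, 0) > baseline + factor * spread:
            flagged.append(user)
    return sorted(flagged)


history = {
    "alice": [10, 12, 9, 11, 10],
    "bob": [40, 38, 45, 41, 39],
}
flagged = unusual_access(history, today={"alice": 90, "bob": 44})
```

A flag from such a check is only a starting point for review, since legitimate reasons such as a new project can also change a user's access volume.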

DNS Amplification Attack
Key characteristics:
1) Targeted attack victimizing hosts and servers
2) The DNS service provider becomes a participant and is unavailable during the attack
3) Attack attribution is hard
