Traffic analysis for Planning, Peering and Security by Julie Liu
1. Traffic Analysis for Peering and Security
Utilizing xFlow Technologies
August 2014
Julie Liu
2. Agenda
What is xFlow Technology?
What value can xFlow offer xSPs?
Traffic Visibility for Peering Analysis
Infrastructure Security
3. What is xFlow Technology?
(A quick foreword, in case you are not familiar with it…)
Definition of a Flow
A unidirectional set of packets that arrive at a router on the same
interface, have the same source/destination IP addresses, Layer
4 protocol, TCP/UDP source/destination ports, and the same
ToS byte in the IP headers
A technology that gathers information on forwarded packets in
router/switch caches and exports it to collectors
[Figure: a client request and the server response count as TWO flows for ONE TCP connection; content streamed from server to client counts as ONE flow for ONE UDP stream]
Flow Cache Table
• Active Timeout
• Inactive timeout
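As a rough illustration of the flow definition and the two cache timeouts, the Python sketch below keys a cache on the seven fields listed in the definition and expires entries on either timeout. The class names, the timeout values (60 s active, 15 s inactive) and the addresses are assumptions chosen for illustration; real routers implement their caches in their own way.

```python
from dataclasses import dataclass
from typing import NamedTuple


class FlowKey(NamedTuple):
    """The seven fields that identify a unidirectional flow."""
    in_if: int
    src_ip: str
    dst_ip: str
    proto: int
    src_port: int
    dst_port: int
    tos: int


@dataclass
class FlowEntry:
    first_seen: float
    last_seen: float
    packets: int = 0
    octets: int = 0


class FlowCache:
    def __init__(self, active_timeout=60.0, inactive_timeout=15.0):
        # Timeout values are illustrative assumptions, not vendor defaults.
        self.active_timeout = active_timeout
        self.inactive_timeout = inactive_timeout
        self.entries = {}
        self.exported = []  # records that would be sent to a collector

    def observe(self, key, size, now):
        """Account one forwarded packet against its flow entry."""
        self.expire(now)
        entry = self.entries.get(key)
        if entry is None:
            entry = self.entries[key] = FlowEntry(now, now)
        entry.packets += 1
        entry.octets += size
        entry.last_seen = now

    def expire(self, now):
        """Export entries that were idle too long or active too long."""
        for key, entry in list(self.entries.items()):
            idle = now - entry.last_seen >= self.inactive_timeout
            long_lived = now - entry.first_seen >= self.active_timeout
            if idle or long_lived:
                self.exported.append((key, entry))
                del self.entries[key]


cache = FlowCache()
request = FlowKey(1, "10.0.0.1", "192.0.2.10", 6, 40000, 80, 0)
response = FlowKey(2, "192.0.2.10", "10.0.0.1", 6, 80, 40000, 0)
cache.observe(request, 100, now=0.0)
cache.observe(response, 1500, now=0.1)
cache.observe(request, 100, now=0.2)
flows_in_cache = len(cache.entries)  # one TCP connection, two flows
cache.expire(now=20.0)               # both entries idle past the inactive timeout
```

The request and the response differ in interface, addresses and ports, so they land in separate entries, which is why one TCP connection shows up as two flows.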
4. How Can xFlow Benefit xSPs?
[Diagram: xFlow Collection & Analysis at the centre, feeding two areas]
Traffic Matrix Visibility: Capacity Planning, Traffic Engineering, Peering Analysis
Security Protection: Anomaly Detection, In-cloud Mitigation
6. Why Traffic Matrix Visibility?
Traffic Matrix Visibility
The amount of data transmitted between every pair of network
"instances" (router-level, PoP-level, network-level)
Provides end-to-end, network-wide traffic visibility, in contrast to
individual link load statistics
Traffic Matrix Visibility for what purposes?
Capacity planning (build capacity where needed)
Traffic engineering (steer traffic where capacity is available)
Peering Analysis (support peering decisions, TE at the border)
Better understand traffic patterns (what is normal or abnormal)
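A traffic matrix in this sense can be sketched as a simple aggregation over flow records. The example below assumes each record has already been resolved to an ingress and egress PoP; the PoP names and octet counts are made up for illustration.

```python
from collections import defaultdict


def build_traffic_matrix(flow_records):
    """Sum octets for every (ingress, egress) pair of network instances."""
    matrix = defaultdict(int)
    for ingress, egress, octets in flow_records:
        matrix[(ingress, egress)] += octets
    return dict(matrix)


# Hypothetical flow records: (ingress PoP, egress PoP, octets)
records = [
    ("PoP-A", "PoP-B", 1200),
    ("PoP-A", "PoP-C", 300),
    ("PoP-A", "PoP-B", 800),
    ("PoP-B", "PoP-A", 500),
]
matrix = build_traffic_matrix(records)
```

Unlike per-link load statistics, each cell here says how much traffic entered at one instance and left at another, which is the input that capacity planning and traffic engineering need.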
7. Challenges of xFlow Only
Flow duplicates
Collect from where?
There are usually multiple Flow measurement sources along a data path
However, collecting from several xFlow sources on the same data path
duplicates the Flow data, so traffic toward a network instance is
counted more than once
Network topology data is needed!
Count each flow once, on the network boundary of the instance to be
monitored
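The "count once on the boundary" rule can be sketched as a filter on the exporting router and input interface. The router names, interface indexes and the boundary set below are hypothetical; in practice the boundary set would be derived from network topology data.

```python
def count_all(records):
    """Naive total: every exporter on the path contributes the same flow."""
    return sum(octets for _exporter, _in_if, octets in records)


def count_at_boundary(records, boundary_interfaces):
    """Count a flow only where it crosses the monitored instance's boundary."""
    return sum(
        octets
        for exporter, in_if, octets in records
        if (exporter, in_if) in boundary_interfaces
    )


# One 1000-octet flow reported by three routers on its path,
# plus a second 400-octet flow seen only at the edge.
records = [
    ("edge1", 10, 1000),
    ("core1", 3, 1000),
    ("edge2", 7, 1000),
    ("edge1", 10, 400),
]
boundary = {("edge1", 10)}  # assumed: the only external-facing interface

naive_total = count_all(records)
boundary_total = count_at_boundary(records, boundary)
```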
8. Challenges of xFlow Only
From point measurements to path measurements
xFlow only
Can tell you where traffic is going now
Provides some simple information about origin-AS or peer-AS
Not only the peer or origin AS: the transit ASes also matter!
Embed a (passive) BGP peer on the Flow Collector to correlate
Flow data with all the BGP attributes (path, communities, etc.)
Use the full AS Path information to determine where traffic is
going to and coming from, and how existing transit/peering is used
BGP carries the topology (i.e. path) information, which helps extend
the local measurement view across the entire Internet
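A minimal sketch of this route-flow correlation, assuming the collector holds a table of prefixes and AS paths learned over its passive BGP session: each flow's destination is matched to the longest covering prefix, and its octets are credited to the peer AS (first hop), the transit ASes (middle of the path) and the origin AS (last hop). The prefixes and AS numbers are documentation values, and the table layout is an assumption for illustration.

```python
import ipaddress
from collections import Counter

# Hypothetical BGP table: prefix -> AS path as seen from the home network.
rib = {
    ipaddress.ip_network("198.51.100.0/24"): [64500, 64501, 64502],
    ipaddress.ip_network("198.51.0.0/16"): [64500, 64510],
}


def lookup_as_path(ip, rib):
    """Longest-prefix match of an address against the BGP table."""
    addr = ipaddress.ip_address(ip)
    matches = [net for net in rib if addr in net]
    if not matches:
        return None
    return rib[max(matches, key=lambda net: net.prefixlen)]


def account_by_as(flows, rib):
    """Credit each flow's octets to the peer, transit and origin ASes."""
    usage = {"peer": Counter(), "transit": Counter(), "origin": Counter()}
    for dst_ip, octets in flows:
        path = lookup_as_path(dst_ip, rib)
        if not path:
            continue
        usage["peer"][path[0]] += octets
        usage["origin"][path[-1]] += octets
        for asn in path[1:-1]:
            usage["transit"][asn] += octets
    return usage


# Hypothetical outbound flows: (destination IP, octets)
flows = [("198.51.100.7", 1000), ("198.51.7.7", 500)]
usage = account_by_as(flows, rib)
top_origins = usage["origin"].most_common(2)
```

With flow data alone only the first and last AS would be visible at best; the AS path is what exposes AS 64501 as a transit network for part of the traffic.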
9. Peering Analysis
What is Peering? (Just a quick reminder…)
What is Peering?
The Internet is a collection of many individual networks (ASes)
that interconnect with each other under the common goal
of ensuring global reachability between any two points
There are three primary roles in this interconnection:
Transit Provider – Typically someone you pay money to, who has
the responsibility of routing your packets to/from the entire Internet
Transit Customer – Typically someone who pays you money, with
the expectation that you will route their packets to/from the entire
Internet
Peer – Two networks that get together and agree to exchange
traffic between each other's networks, typically for free
10. Peering Analysis
Peering and its benefits…
One major benefit of Peering
Reduced operating costs
Peering traffic is “free”. If you no longer pay a transit provider to
deliver some portion of your traffic, it reduces your transit bills
[Diagram: Providers A, B and C with their customers, including one multi-homed customer; link legend: Peering, Transit]
11. Peering Analysis
Why traffic visibility for Peering?
To decide if you should peer with a new network
To convince other networks to peer with you
To manage traffic engineering to other networks
To defend your network against depeering actions
To make intelligent transit purchasing decisions
12. Peering Analysis
Peering traffic requirements
Traffic Volume
A peer may be required to exchange a certain minimum amount
of traffic to be considered
Traffic Ratios
Inbound vs. outbound traffic ratio
Traffic is “hot potato” routed (i.e. get it off your network ASAP)
Traffic pushed from Network A gets hauled primarily by
Network B, and vice versa
If the ratio is 1:1, both peers share backhaul costs equally
Others: PoP requirements, interconnect locations, routing stability,
operations requirements, business concerns…
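The volume and ratio requirements lend themselves to a quick check against measured traffic. The sketch below is one possible way to express it; the function name, the thresholds (1,000 Mbps minimum, 2.5:1 maximum ratio) and the traffic figures are hypothetical, since actual peering policies differ per network.

```python
def evaluate_peer(in_mbps, out_mbps, min_total_mbps, max_ratio):
    """Check measured traffic against a peering policy's volume and ratio rules."""
    total = in_mbps + out_mbps
    low, high = sorted((in_mbps, out_mbps))
    # A one-directional exchange has no finite ratio.
    ratio = float("inf") if low == 0 else high / low
    return {
        "total_mbps": total,
        "ratio": ratio,
        "meets_volume": total >= min_total_mbps,
        "meets_ratio": ratio <= max_ratio,
    }


# Hypothetical measurement and policy values.
result = evaluate_peer(in_mbps=400, out_mbps=800,
                       min_total_mbps=1000, max_ratio=2.5)
```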
13. Peering Analysis
Peering evaluation questions
"A Business Case for Peering," William B. Norton
Does the AS send me about as much traffic as I send to it?
How much of the traffic originates from the potential peer?
Does the volume of traffic justify a direct peering effort?
How much traffic is transited through the potential peer?
[Diagram: AS-level topology between Home and the Internet (AS1, AS2, AS3, AS4, AS21, AS23, AS100, AS101, ASC), with ASes marked as Peer AS, Origin AS or Transit AS]
14. Peering Analysis
Route-flow fusion analysis answers this…
"A Business Case for Peering," William B. Norton
Source-sink/transit traffic distribution
Top-N ASNs sourcing, sinking or transiting traffic with me
In/Out traffic ratio
15. Peering Analysis
Peering cost analysis
In theory, peering is “free”, right?
In practice, the overhead associated with peering can exceed the
transit costs it saves if the peered traffic volume is not large enough
How much does it save or cost?
Which transit provider(s)?
How much transit traffic can be offloaded?
Internet Transit Price (US$)
Transit A | $1.6 per Mbps
Transit B | $1.8 per Mbps
Transit C | $1.2 per Mbps
[Diagram: Home reaches the Internet through Transit A, Transit B and Transit C; AS21 is the peer candidate, shown alongside AS4, AS23, AS100 and AS101]
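The save-or-cost question can be put into a small calculation: the transit spend avoided on each provider, minus the overhead of running the peering. The per-Mbps prices come from the slide; the offloaded volumes, the peering cost and the assumption that all figures cover the same billing period are hypothetical.

```python
# Prices from the slide, in US$ per Mbps.
TRANSIT_PRICE_USD_PER_MBPS = {
    "Transit A": 1.6,
    "Transit B": 1.8,
    "Transit C": 1.2,
}


def peering_net_saving(offload_mbps, peering_cost_usd):
    """Transit cost avoided by offloading traffic, minus peering overhead."""
    transit_saved = sum(
        TRANSIT_PRICE_USD_PER_MBPS[provider] * mbps
        for provider, mbps in offload_mbps.items()
    )
    return transit_saved - peering_cost_usd


# Hypothetical: traffic toward the peer candidate currently carried by transit.
offload = {"Transit A": 500, "Transit B": 200}
net = peering_net_saving(offload, peering_cost_usd=900)
```

A negative result would mean the peering overhead outweighs the transit it replaces, which is the case the slide warns about when the peered volume is small.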
17. Infrastructure Security Threats
DDoS attacks
DDoS attacks are launched from compromised systems (bots)
DDoS attack traffic consumes SP network capacity
DDoS attack traffic saturates in-line security devices
DDoS attack traffic targets applications and services
[Diagram: Bots on the Internet send traffic through the Service Provider Network toward a Victim in the Enterprise or IDC; legend: Anomaly traffic, Normal traffic]
Why do traditional in-line security solutions fail to prevent infrastructure security threats?
Volumetric attacks must be removed in the cloud
Traditional security products (stateful in-line solutions) are themselves easy targets of such attacks
Deployment costs
Single point of failure and latency
18. Infrastructure Security
A Flow-based solution
Flow-based solution building blocks
Flow-based Learning
• Network-wide: collects xFlow data from various router locations and correlates the data into a comprehensive network model
• Dynamic Behaviour Analysis: during peacetime, the system creates a network-wide view of the traffic patterns and learns thresholds that represent what is 'normal'
Flow-based Detection
• Detection Engines: compare the collected real-time Flow data against the thresholds
• Once significant threshold violations are identified, the system sends alarms and enables cloud-based mitigation actions
Cloud-based Mitigation
• Cloud-based mitigation action options:
- Remote Triggered Black Hole (RTBH)
- BGP FlowSpec
- OOP Traffic Cleaning
19. Flow-based Learning & Detection
The idea
Flow-based Network Behavior Anomaly Detection
(NBAD)
DOES:
Analyze Flow data (IP header info, byte/pkt count) from routers
Detect anomalies by observing network traffic behaviors: knowing
what is normal, and hence identifying the abnormal when it happens
DOESN'T:
Analyze L7 packet contents from raw packets
Detect anomalies by matching content signatures: knowing what is
bad, and then picking the bad out from the good
First-line protection for the network infrastructure
Trades DPI precision for carrier-grade scalability and performance
20. Flow-based Learning & Detection
Network behavior analysis examples
What's Normal? | What can be Abnormal? | Example
A server accepts requests from clients | Over 5,000 SYN requests per second, lasting over 3 minutes | TCP SYN Flooding
A client connects to few destination hosts / ports | Over 100 connection requests per second to destination hosts / ports | Port Scan / IP Scan
Various packet sizes | Fixed packet size (e.g. UDP/1434, packet size = 404) | SQL Slammer
The source address ≠ the destination address | The source address is the same as the destination address | LAND Attack
The traffic rate for this network scope is usually around 150M bps | Over 180M bps traffic rate appears in this network scope | Zero-day attack (generic traffic floods)
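Two of the examples above translate into very small checks on flow data. The sketch below shows a sustained-rate rule (the SYN flood case, using the 5,000 per second and 3 minute figures from the table) and the LAND attack comparison. The one-sample-per-minute input format and the function names are assumptions for illustration, not the behaviour of any particular product.

```python
def sustained_violation(samples, threshold, min_duration):
    """True if the rate stays above threshold for longer than min_duration.

    samples: list of (timestamp_sec, rate) tuples, sorted by time.
    """
    start = None
    for timestamp, rate in samples:
        if rate > threshold:
            if start is None:
                start = timestamp
            if timestamp - start > min_duration:
                return True
        else:
            start = None  # the violation was interrupted
    return False


def is_land_attack(src_ip, dst_ip):
    """LAND attack fingerprint: source address equals destination address."""
    return src_ip == dst_ip


# Hypothetical SYN rates per second toward one server, sampled every minute.
flood = [(0, 6000), (60, 6500), (120, 7000), (180, 6800), (240, 6900)]
burst = [(0, 6000), (60, 6500), (120, 100), (180, 6800), (240, 6900)]

flood_detected = sustained_violation(flood, threshold=5000, min_duration=180)
burst_detected = sustained_violation(burst, threshold=5000, min_duration=180)
```

The duration condition is what separates a sustained flood from a short burst that dips back to normal.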
21. Flow-based Learning & Detection
The mechanisms
Flow-based NBAD Mechanism | Type | Detection Engine Examples
Fingerprint-based | Protocol anomaly | TCP Flag Null, IP Fragment, IP Protocol Null, Land Attack, Ping of death, TCP XMAS attack…
Fingerprint-based | Flood attack | ICMP Flooding, UDP Flooding, TCP SYN Flooding, TCP RST Flooding, TCP ACK Flooding…
Fingerprint-based | Specific behaviour attack | IP Scan, Port Scan, DNS Flooding, e-Mail Spam, Trojan Heloag, MS Blaster, Sasser, Code Red, SQL Slammer…
Baseline Heuristic | Baseline deviation | Zero-day attacks (generic traffic floods)
[Chart: traffic level from 1 Jul to 31 Jul, learned from peacetime Flow data samples. "Baseline": what are the "normal" traffic rates?]
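The deck does not say how the baseline thresholds are learned. One common heuristic, used here purely as an assumption, is to take the mean of peacetime samples plus a multiple of their standard deviation. The sample values are invented to sit around the 150M bps level mentioned on the earlier slide.

```python
import statistics


def learn_threshold(peacetime_mbps, k=3.0):
    """Assumed heuristic: mean plus k standard deviations of peacetime rates."""
    mean = statistics.fmean(peacetime_mbps)
    spread = statistics.pstdev(peacetime_mbps)
    return mean + k * spread


def is_anomalous(rate_mbps, threshold):
    return rate_mbps > threshold


# Hypothetical peacetime traffic rates (Mbps) for one network scope.
peacetime = [148, 150, 152, 149, 151]
threshold = learn_threshold(peacetime)

flood_flag = is_anomalous(180, threshold)
normal_flag = is_anomalous(153, threshold)
```

Because the threshold is derived from observed traffic rather than from a signature, this kind of check can flag generic floods that no fingerprint describes.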
22. Infrastructure Security
Cloud-based mitigation with RTBH
Remotely triggered black hole filtering at the SP edge
BGP prefix announced with next-hop set to a pre-defined black hole route
All traffic to the victim is discarded
[Diagram: Bots on the Internet, the Service Provider Network with RTBH applied at its edge routers, and the Victim in the Enterprise or IDC; legend: Anomaly traffic, Normal traffic, BGP announcement]
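The RTBH mechanism can be modelled in a few lines: the victim's host route is announced with a next-hop that every edge router has been pre-configured to discard. The sketch below is a toy forwarding model, not router configuration; the next-hop address, prefixes and function names are assumptions for illustration.

```python
import ipaddress

# Assumed: edge routers hold a static route sending this next-hop to discard.
BLACKHOLE_NEXT_HOP = "192.0.2.1"


def rtbh_announcement(victim_ip):
    """Host route for the victim, pointing at the black hole next-hop."""
    return {
        "prefix": ipaddress.ip_network(victim_ip + "/32"),
        "next_hop": BLACKHOLE_NEXT_HOP,
    }


def edge_forward(dst_ip, bgp_routes):
    """Forwarding decision at an SP edge router (longest-prefix match)."""
    dst = ipaddress.ip_address(dst_ip)
    matches = [route for route in bgp_routes if dst in route["prefix"]]
    if not matches:
        return "no route"
    best = max(matches, key=lambda route: route["prefix"].prefixlen)
    return "discard" if best["next_hop"] == BLACKHOLE_NEXT_HOP else "forward"


routes = [{"prefix": ipaddress.ip_network("203.0.113.0/24"),
           "next_hop": "198.51.100.1"}]
before = edge_forward("203.0.113.10", routes)

routes.append(rtbh_announcement("203.0.113.10"))
after = edge_forward("203.0.113.10", routes)
neighbour = edge_forward("203.0.113.11", routes)
```

The model makes the trade-off visible: once the host route is in place, legitimate traffic to the victim is dropped together with the attack, while neighbouring addresses are unaffected.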
23. Infrastructure Security
Cloud-based mitigation w/ BGP FlowSpec
Traffic recognized as suspicious is filtered at the SP network edge
Only filtered traffic is delivered to the enterprise/IDC network
BGP FlowSpec distributes traffic filter lists to routers
RFC 5575: selectively drop traffic flows based on L3/L4 information
[Diagram: Bots on the Internet, the Service Provider Network with FlowSpec filters applied at its edge routers, and the Victim in the Enterprise or IDC; legend: Anomaly traffic, Normal traffic, BGP FlowSpec]
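The selective nature of FlowSpec filtering can be sketched as rule matching on L3/L4 fields. The class below is a simplified model with only three match fields and a single action; it does not reproduce the RFC 5575 encoding, and the names and example values are assumptions for illustration.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FlowSpecRule:
    """Simplified L3/L4 filter; None means the field is not constrained."""
    dst_prefix: str
    protocol: Optional[int] = None
    dst_port: Optional[int] = None
    action: str = "discard"

    def matches(self, dst_ip, protocol, dst_port):
        network = ipaddress.ip_network(self.dst_prefix)
        if ipaddress.ip_address(dst_ip) not in network:
            return False
        if self.protocol is not None and protocol != self.protocol:
            return False
        if self.dst_port is not None and dst_port != self.dst_port:
            return False
        return True


def apply_rules(rules, dst_ip, protocol, dst_port):
    """Return the action of the first matching rule, else forward."""
    for rule in rules:
        if rule.matches(dst_ip, protocol, dst_port):
            return rule.action
    return "forward"


# Hypothetical rule: drop UDP/53 floods aimed at the victim host.
rules = [FlowSpecRule("203.0.113.10/32", protocol=17, dst_port=53)]

attack = apply_rules(rules, "203.0.113.10", protocol=17, dst_port=53)
web = apply_rules(rules, "203.0.113.10", protocol=6, dst_port=443)
```

In contrast to a black hole route, only the flows that match the filter are dropped, so the victim's other services stay reachable.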
24. Infrastructure Security
Cloud-based mitigation w/ OOP cleaning
Suspicious traffic is diverted at the SP network edge
Divert victim prefix traffic via BGP
DPI-capable mitigation appliance (application-layer attack, asymmetric detection)
Cleaned traffic is tunnelled back
No impact on other traffic to other networks
The "Cleaning Centre" is typically a shared resource in the network infrastructure, which reduces deployment costs
[Diagram: Bots on the Internet, the Service Provider Network with a Cleaning Centre, and the Victim in the Enterprise or IDC; legend: Malicious traffic, Benign traffic, BGP announcement]
26. Flow Technology
Network Resource Impact Issues
NetFlow data volume? 1K FPS ≒ 338K bps of NetFlow traffic
However, estimating flows per second from a given network traffic rate
(bps) is a much more complex task!
Typically 1~4% of the link rate
Leverage data reduction techniques:
Partial coverage (i.e. a few PoPs, selective boundaries)
Tune the active & inactive timeouts
Flow Sampling
In addition to the data volume, 'full NetFlow' may place a heavy burden
on router memory and CPU. Therefore sampled xFlow is preferred…
Flow/sec | Pkt/sec | Byte/sec | bps
1,000 | 33 | 49,500 | 338.37K
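The first three columns of the table can be reproduced with a short calculation, assuming 30 flow records per export packet and 1,500 bytes per packet. Both values are inferred from the table's own ratios (1,000 / 33 ≈ 30 and 49,500 / 33 = 1,500) rather than stated in the deck.

```python
# Assumptions inferred from the table's ratios, not stated in the deck.
RECORDS_PER_PACKET = 30
BYTES_PER_PACKET = 1500


def export_load(flows_per_sec):
    """Estimate export packets/sec, bytes/sec and bits/sec for a flow rate."""
    packets = flows_per_sec // RECORDS_PER_PACKET
    octets = packets * BYTES_PER_PACKET
    return packets, octets, octets * 8


packets, octets, bits = export_load(1000)
```

Under these assumptions the packet and byte columns match the table exactly. A plain bytes-times-eight conversion gives 396,000 bit/s, somewhat above the 338.37K shown, so the slide's last column presumably rests on a conversion the deck does not spell out.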
27. Flow Technology
Flow Sampling
Alleviates the performance penalty incurred by
turning on xFlow on routers
Allows users to sample one out of every “N” IP packets being forwarded
(the interval “N” is user-configurable)
Substantially decreases the CPU utilization needed to account for
Flow packets
CPU utilization varies, depending on the sampling rate and the router model
Example:
Cisco 12000 Series Router handling 65K flows
“Full-flow” mode required 24% more CPU; the same router using 1:100
sampling required only 3% additional CPU
Cisco 7500 Router
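The 1-in-N idea can be sketched as follows: only every N-th packet is accounted, and the collector multiplies the sampled counts by N to estimate the real totals. The deterministic selection and the uniform packet size are simplifying assumptions; routers may also sample randomly, and the estimate is only approximate for real traffic mixes.

```python
def sample_one_in_n(packet_sizes, n):
    """Keep every n-th packet (deterministic 1-in-N sampling)."""
    return packet_sizes[n - 1::n]


def estimate_totals(sampled_sizes, n):
    """Scale sampled packet and byte counts back up by the sampling interval."""
    return len(sampled_sizes) * n, sum(sampled_sizes) * n


# Hypothetical traffic: 1,000 packets of 100 bytes each.
packets = [100] * 1000
sampled = sample_one_in_n(packets, n=100)
est_packets, est_octets = estimate_totals(sampled, n=100)
```

The router only touches 10 of the 1,000 packets here, which is where the CPU saving comes from, while the scaled-up totals still describe the full traffic volume.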
29. References
Yann Berthier, "NetFlow to Guard the Infrastructure," NANOG 39, 2007
Thomas Telkamp, "Best Practices for Determining the Traffic Matrix in IP Networks V 3.0," NANOG 39, 2007
Richard A. Steenbergen, "A Guide to Peering on the Internet," NANOG 51, 2011
William B. Norton, "A Business Case for Peering in 2010," http://drpeering.net/white-papers/A-Business-Case-For-Peering.php
RFC 5575, "Dissemination of Flow Specification Rules"
Leonardo Serodio, "Traffic Diversion Techniques for DDoS Mitigation using BGP Flowspec," NANOG 58, 2013
Cisco Systems Inc., "NetFlow Performance Analysis," 2007