1. TCP Issues in Virtualized Datacenter Networks
Hemanth Kumar Mantri
Department of Computer Science 1 of 27
2. Selected Papers
• The TCP Outcast Problem: Exposing
Unfairness in Data Center Networks.
– NSDI’12
• vSnoop: Improving TCP Throughput in
Virtualized Environments via Ack Offload.
– ACM/IEEE SC, 2010
3. Background and Motivation
• Data center is a shared environment
– Multi Tenancy
• Virtualization: A key enabler of cloud
computing
– Amazon EC2
• Resource sharing
– CPU/Memory are strictly shared
– Network sharing largely laissez-faire
4. Data Center Networks
• Flows compete via TCP
• Ideally, TCP should achieve true fairness
– All flows get equal share of link capacity
• In practice, TCP exhibits RTT-bias
– Throughput is inversely proportional to RTT
• 2 Major Issues
– Unfairness (in general)
– Low Throughput (in virtualized environments)
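The RTT bias above can be illustrated with the simplified Mathis throughput model, throughput ≈ MSS / (RTT · √p). The sketch below is illustrative; the MSS, RTT, and loss-rate values are assumed, not taken from the papers.

```python
from math import sqrt

def mathis_throughput(mss_bytes, rtt_s, loss_rate):
    """Approximate steady-state TCP throughput (bytes/s) under the
    simplified Mathis model: MSS / (RTT * sqrt(p))."""
    return mss_bytes / (rtt_s * sqrt(loss_rate))

# Two flows with the same MSS and loss rate but different RTTs.
short_rtt = mathis_throughput(1460, 0.001, 0.01)  # 1 ms RTT
long_rtt = mathis_throughput(1460, 0.010, 0.01)   # 10 ms RTT

# Throughput is inversely proportional to RTT: the 1 ms flow
# gets 10x the bandwidth share of the 10 ms flow.
print(short_rtt / long_rtt)
```

With equal loss rates, the ratio of throughputs is exactly the inverse ratio of RTTs, which is the RTT bias the slide refers to.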
8. Further Investigation
[Figure: instantaneous and average per-flow throughput; the 2-hop flow is consistently starved!!]
TCP Outcast Problem
• Some flows are ‘outcast’ and receive very low
throughput compared to others
• Almost an order of magnitude reduction in some
cases
9. Experiments
• Same RTTs
• Same Hop Length
• Unsynchronized Flows
• Introduce Background Traffic
• Vary Switch Buffer Size
• Vary TCP
– Reno, MP-TCP, BIC, CUBIC + SACK
• Unfairness persists!
12. Reason: Port Blackout
1. Packets are roughly the same size
2. Similar inter-arrival rates (predictable timing)
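The blackout mechanism can be sketched with a toy tail-drop queue: when one port's synchronized burst arrives just ahead of the other's, every tail drop in that round lands on the trailing port. Port names, burst sizes, and the queue capacity below are all illustrative.

```python
from collections import deque

def tail_drop_enqueue(queue, capacity, arrivals):
    """Enqueue packets in arrival order; tail-drop when the queue is full.
    Returns the number of drops per input port."""
    drops = {}
    for port in arrivals:
        if len(queue) < capacity:
            queue.append(port)
        else:
            drops[port] = drops.get(port, 0) + 1
    return drops

# Port A's burst of 6 packets arrives just ahead of port B's burst of 6.
queue = deque()
drops = tail_drop_enqueue(queue, capacity=8, arrivals=["A"] * 6 + ["B"] * 6)

# The queue fills with all of A's packets plus 2 of B's; every
# subsequent tail drop hits port B -- a brief "blackout" of that port.
print(drops)  # {'B': 4}
```

Because packet sizes and inter-arrival times are similar (the two conditions above), this ordering repeats round after round, so the drops keep landing on the same port's flows.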
13. Port Blackout
• Can occur on any input port
• Happens for small intervals of time
• Has a far more severe effect when only a few
flows are affected
– Experiments showed that the same number of
packet drops hurts throughput much more when
concentrated on a few flows than when spread
across many concurrent flows.
15. Solutions?
• Stochastic Fair Queuing (SFQ)
– Explicitly enforce fairness among flows
– Expensive for commodity switches
• Equal-Length Routing
– All flows are forced to go through the core
– Better interleaving of packets alleviates port blackout
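SFQ's core idea can be sketched in a few lines: hash each flow to one of a small number of queues and serve the queues round-robin, so one port's burst cannot monopolize tail drops. This is an illustrative sketch (using a modulo in place of a perturbed hash, and made-up flow IDs), not a switch implementation.

```python
from collections import deque

class SFQSketch:
    """Minimal Stochastic Fair Queuing sketch: flows map to queues,
    and queues are served round-robin."""
    def __init__(self, n_queues=4):
        self.queues = [deque() for _ in range(n_queues)]
        self.next_q = 0

    def enqueue(self, flow_id, pkt):
        # Real SFQ uses a hash, perturbed periodically to break collisions.
        self.queues[flow_id % len(self.queues)].append(pkt)

    def dequeue(self):
        # Visit queues round-robin; return the first packet found.
        for _ in range(len(self.queues)):
            q = self.queues[self.next_q]
            self.next_q = (self.next_q + 1) % len(self.queues)
            if q:
                return q.popleft()
        return None

sfq = SFQSketch()
for pkt in ["f0-a", "f0-b", "f0-c"]:
    sfq.enqueue(0, pkt)        # a heavy flow
sfq.enqueue(1, "f1-a")         # a light flow
# Round-robin service interleaves the light flow despite the heavy one:
print([sfq.dequeue() for _ in range(4)])  # ['f0-a', 'f1-a', 'f0-b', 'f0-c']
```

The per-flow isolation is what makes SFQ effective against the outcast problem, and also what makes it expensive to implement at line rate in commodity switches.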
16. VM Consolidation
• Multiple VMs hosted by one physical host
• Multiple VMs sharing the same core
– Flexibility, scalability, and economy
[Figure: VM 1 to VM 4 running on a virtualization layer over shared hardware]
Observation:
VM consolidation negatively
impacts network performance!
20. Impact on TCP Throughput
[Figure: TCP throughput for connections to dom0 (+) vs. to a VM (x)]
Connection to the VM is much
slower than to dom0!
21. Solution: vSnoop
• Alleviates the negative effect of VM scheduling on
TCP throughput
• Implemented within the driver domain to
accelerate TCP connections
• Does not require any modifications to the VM
• Does not violate end-to-end TCP semantics
• Applicable across a wide range of VMMs
– Xen, VMware, KVM, etc.
22. TCP Connection to a VM
[Timing diagram: the sender's SYN reaches the driver domain immediately, but VM1, VM2, and VM3 are scheduled round-robin, so the SYN,ACK waits in VM1's buffer until VM1 is next scheduled; the connection's RTT is inflated by the VM scheduling latency]
Sender establishes a TCP
connection to VM1
23. Key Idea: Acknowledgement Offload
[Timing diagram: with vSnoop, the driver domain sends the SYN,ACK on VM1's behalf and buffers incoming data in a shared buffer, hiding the VM scheduling latency from the sender]
Faster progress during
TCP slow start
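The slow-start benefit can be estimated with a back-of-the-envelope model: cwnd doubles each RTT, so transfer time scales linearly with the effective RTT, which without vSnoop includes the VM scheduling latency. The latency figures below are illustrative assumptions, not measurements from the paper.

```python
def slow_start_time(n_segments, rtt_s):
    """Time to cumulatively send n_segments in slow start,
    with cwnd = 1, 2, 4, ... segments per RTT round."""
    sent, cwnd, rounds = 0, 1, 0
    while sent < n_segments:
        sent += cwnd
        cwnd *= 2
        rounds += 1
    return rounds * rtt_s

# With 3 VMs sharing a core, an ACK can wait a full scheduling round,
# so the effective RTT might be ~30 ms instead of sub-millisecond.
print(slow_start_time(100, 0.030))   # 7 rounds * 30 ms = 0.21 s
print(slow_start_time(100, 0.0005))  # 7 rounds * 0.5 ms = 0.0035 s
```

The round count is identical in both cases; only the per-round delay changes, which is why early acknowledgements from the driver domain translate directly into faster connection progress, especially for short transfers.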
24. Challenges
• Challenge 1: Out-of-order/special packets (SYN, FIN)
– Solution: Let the VM handle these packets
• Challenge 2: Packet loss after vSnoop
– Solution: Let vSnoop acknowledge only if there is room in the buffer
• Challenge 3: ACKs generated by the VM
– Solution: Suppress/rewrite ACKs already generated by vSnoop
27. Thank You!
• References
– http://friends.cs.purdue.edu/dokuwiki/doku.php
– https://www.usenix.org/conference/nsdi12/tech-schedule/technical-sessions
• Most animations and pictures are taken from
the authors’ original slides and NSDI’12
conference talk.
29. Conditions for Outcast
• Switches use the tail-drop queue
management discipline
• A large set of flows and a small set of
flows arriving at two different input ports
compete for a bottleneck output port at a
switch
30. Why does Unfairness Matter?
• Multi Tenant Clouds
– Some tenants get better performance than
others
• MapReduce apps
– Straggler problems
– One delayed flow affects overall job
completion
31. State Machine Maintained Per-Flow
[State diagram, three states reached from Start:
• Active (online): send early acknowledgements for in-order packets; an out-of-order packet moves the flow to Unexpected Sequence, and a full buffer moves it to No Buffer
• Unexpected Sequence: don't acknowledge; pass out-of-order packets to the VM; an in-order packet with buffer space available returns the flow to Active
• No Buffer (offline): don't acknowledge; once buffer space becomes available, an in-order packet returns the flow to Active]
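The per-flow state machine above can be sketched in code as follows. State names and actions follow the slide; the class and method interface are hypothetical, for illustration only.

```python
class VSnoopFlow:
    """Sketch of vSnoop's per-flow state machine: Active (online)
    early-ACKs in-order packets, Unexpected Sequence passes packets to
    the VM unacknowledged, No Buffer (offline) stops acknowledging
    until buffer space frees up."""
    ACTIVE, UNEXPECTED_SEQ, NO_BUFFER = "active", "unexpected_seq", "no_buffer"

    def __init__(self):
        self.state = self.ACTIVE

    def on_packet(self, in_order, buffer_space):
        if not in_order:
            self.state = self.UNEXPECTED_SEQ
            return "pass_to_vm"    # don't acknowledge out-of-order packets
        if not buffer_space:
            self.state = self.NO_BUFFER
            return "no_ack"        # only ACK when the buffer has room
        self.state = self.ACTIVE   # in-order + buffer space: back online
        return "early_ack"

flow = VSnoopFlow()
print(flow.on_packet(in_order=True, buffer_space=True))    # early_ack
print(flow.on_packet(in_order=False, buffer_space=True))   # pass_to_vm
print(flow.on_packet(in_order=True, buffer_space=False))   # no_ack
```

Keeping the decision this simple is what lets vSnoop run in the driver domain's fast path without violating end-to-end TCP semantics: anything it cannot safely acknowledge, it simply forwards to the VM.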
32. vSnoop’s Impact on TCP Flows
• Slow Start
– Early acknowledgements help progress
connections faster
– Most significant benefit for short transfers that are
more prevalent in data centers
• Congestion Avoidance and Fast Retransmit
– Large flows in the steady state can also benefit
from vSnoop
– Benefit is not as large as for slow start