Data Center Network Multipathing
Peregrine: An All-Layer-2 Container Computer Network
Tzi-cker Chiueh*§, Cheng-Chun Tu*§, Yu-Cheng Wang§, Pai-Wei Wang§, Kai-Wen Li§, Yu-Ming Huang§
*Industrial Technology Research Institute, Taiwan
§Computer Science Department, Stony Brook University
IEEE Cloud 2012

Leveraging Performance of Multiroot Data Center Networks by Reactive Reroute
Adrian S.-W. Tam, Kang Xi, H. Jonathan Chao
Department of Electrical and Computer Engineering, Polytechnic Institute of New York University
2010 18th IEEE Symposium on High Performance Interconnects



              Presenter: Jason, Tsung-Cheng, HOU
              Advisor: Wanjiun Liao
              May 17th, 2012
Motivation
• Summarize features of the popular multi-root
  Clos / fat-tree data center topology
  Taking ITRI’s prototype as an example
• Survey existing solutions to multipathing
• Recap Jin-Jia Chang’s presentation on QCN
• Present another solution to multipathing
• Compare several multipathing methods
Agenda
•    Multi-Root Clos / Fat-Tree Topology
•    Surveyed Solutions to Multipathing
•    802.1Qau – QCN
•    QCN and Reactive Reroute
•    Comparison of Multipathing Methods
    Peregrine: An All-Layer-2 Container Computer Network
    Tzi-cker Chiueh*§, Cheng-Chun Tu*§, Yu-Cheng Wang§, Pai-Wei Wang§, Kai-Wen Li§, Yu-Ming Huang§
    *Industrial Technology Research Institute, Taiwan
    §Computer Science Department, Stony Brook University
    IEEE Cloud 2012
Multi-Root Clos / Fat-Tree
• Adopted by various publications
    – VL2, PortLand, BCube, Elastic Tree, Peregrine
• Scale-out, cheap commodity switches
• Paths traverse a fixed maximum number of switches / hops
    – If packets never bounce back up, there is no routing loop
• Nearly full bisection bandwidth, multipathing, symmetric
• Possibly a tremendous number of routing table entries
• Up and down paths are handled differently
• High link rate but limited capability: buffer, CPU, ...
High rate but limited capability
• All-L2 Ethernet switches
• Up to 1 GE or 10 GE links, dozens of ports
• Limited buffer, on the order of 100 KB
• Limited CPU capability, a processing bottleneck
• Limited flow table entries, at most a few dozen K
• Optimized for fast table lookups
• Take Peregrine for example
    – ITRI’s industrial, commodity production prototype
    – Others are mostly experimental or high-end
Topology: Folded Clos
[Figure: folded Clos topology. Labels: a rack; 12 racks; a container; cross container]
Within One Rack
[Figure]
Within One Container
[Figure. Annotation: 5-to-5 per rack, but only 4 ports]
DS and RAS
• Directory Server
  – Address association, mgmt, and reuse
  – Performs IP-MAC lookup, mappings
  – Updates mappings to end hosts
• Route Algorithm Server
  – Collects entries of the traffic matrix
  – Runs load-balancing algorithms, based on TM
  – Distributes routing entries to switches, updates the DS
• Described within one container only; cross-container operation is unclear
• Scalability and VM mobility are unclear
  (the paper only refers to something like Mobile IP)
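The Directory Server's role (address association, IP-MAC lookup, reuse) can be pictured with a small sketch. This is only an illustration under assumptions: the class name `DirectoryServer` and its methods `register`, `lookup`, and `release` are hypothetical and are not taken from the Peregrine paper, which the slides say leaves the mechanism largely unspecified.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DirectoryServer:
    """Toy IP-to-MAC directory (hypothetical; not Peregrine's implementation)."""
    ip_to_mac: Dict[str, str] = field(default_factory=dict)

    def register(self, ip: str, mac: str) -> None:
        # Record (or overwrite) the association for a host or VM.
        self.ip_to_mac[ip] = mac

    def lookup(self, ip: str) -> Optional[str]:
        # Answer an end host's IP-to-MAC query; None if the IP is unknown.
        return self.ip_to_mac.get(ip)

    def release(self, ip: str) -> None:
        # Drop the association so the address can be reused later.
        self.ip_to_mac.pop(ip, None)


ds = DirectoryServer()
ds.register("10.0.0.5", "02:00:00:00:00:05")
print(ds.lookup("10.0.0.5"))
```

A single in-memory table like this also makes the slide's scalability question concrete: every lookup and update goes through one component.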
Routing, Balancing, and Tolerance
[Figure]
Logical Architecture
[Figure]
Dual-Mode Forwarding
[Figure]
Switching to Backup
[Figure]
ITRI Container Computer Prototype
• 6.096m shipping container
• 12 server racks, 12 storage racks
• All-L2 network, commodity switches
• “Folded” Clos topology
• Directory Server, Route Algorithm Server
• Unclear: Load-balancing algo., VM mobility,
  DS-RAS scalability, cross-container
• In the future: OpenFlow, OpenStack
  (Currently not using OpenFlow to connect
  switches… how? unclear)
Discussions
• Spanning tree for multipathing and load balancing:
  simple, but limited flexibility
• How to plug and play? Is it scalable?
  – A new switch leads to reconfiguration
  – Does VM migration affect the TM and direct routes?
• DS-RAS: a simple version of a controller,
  but its mechanism and performance are unclear
• Seems to be trying to combine various
  advantages: address mapping, ST
  multipathing, converged network, folded Clos
Agenda
•   Multi-Root Clos / Fat-Tree Topology
•   Surveyed Solutions to Multipathing
•   802.1Qau – QCN
•   QCN and Reactive Reroute
•   Comparison of Multipathing Methods




Multipathing
• VLB:
  – Traffic splits to intermediate points
  – Automatically balances load
  – Ideal in theory, but subject to PKT reordering
• ECMP-hashing
  – The choice of hash function makes a big difference
  – A flow always sticks to one path during transmission
• Hedera:
  – Flow-to-core mapping, flow scheduling
  – Requires global information, higher complexity
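The contrast between hashed ECMP and VLB-style splitting can be sketched as follows. This is a minimal illustration, not any switch's actual implementation: the function names, the five-tuple layout, and the use of CRC32 as the hash are assumptions, and the VLB variant shown is the per-packet form that the reordering remark above implies.

```python
import random
import zlib
from typing import List, Tuple

# src IP, dst IP, src port, dst port, protocol (assumed header fields)
FiveTuple = Tuple[str, str, int, int, str]


def ecmp_path(flow: FiveTuple, paths: List[str]) -> str:
    """Hash the flow's header fields so every packet of a flow takes one path."""
    key = "|".join(map(str, flow)).encode()
    return paths[zlib.crc32(key) % len(paths)]


def vlb_path(paths: List[str], rng: random.Random) -> str:
    """Per-packet VLB: send each packet via a randomly chosen intermediate."""
    return rng.choice(paths)


cores = ["core0", "core1", "core2", "core3"]
flow = ("10.0.1.2", "10.0.3.4", 40000, 80, "tcp")
rng = random.Random(0)

# ECMP: the same flow always maps to the same core, so no reordering.
print([ecmp_path(flow, cores) for _ in range(3)])
# VLB: consecutive packets of one flow may use different cores.
print([vlb_path(cores, rng) for _ in range(3)])
```

Because the ECMP choice depends only on the header hash, two large flows that collide on one path stay collided, which is the gap that flow scheduling (Hedera) and reactive rerouting try to close.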
Multipathing
• Spanning Tree / VLAN (SPAIN):
  – Near-static, pre-computation required, but simple
  – Re-computes when topology changes
  – Segmentation of resources, limited flexibility
• Multipath TCP:
  – One flow, many parallel paths
  – VLAN-based routing in the publication (like SPAIN)
  – Shifts traffic to less congested paths
  – A new transport mechanism, adaptive
  – Still with segmentation of resources
Multipathing References
• M. Kodialam, T. V. Lakshman, S. Sengupta, “Efficient and Robust Routing of Highly
  Variable Traffic”, HotNets, 2004.
• R. Zhang-Shen and N. McKeown, “Designing a Predictable Internet Backbone Network”,
  Third Workshop on Hot Topics in Networks (HotNets-III), November 2004.
• A. Greenberg et al., “VL2: A Scalable and Flexible Data Center Network”, ACM SIGCOMM
  2009.
• R. N. Mysore, A. Pamboris, N. Farrington, N. Huang, P. Miri, S. Radhakrishnan,
  V. Subramanya, and A. Vahdat, “PortLand: A Scalable, Fault-Tolerant Layer 2 Data
  Center Network Fabric”, ACM SIGCOMM, 2009.
• M. Al-Fares et al., “Hedera: Dynamic Flow Scheduling for Data Center Networks”,
  USENIX NSDI, 2010.
• J. Mudigonda, P. Yalagandula, M. Al-Fares, and J. C. Mogul, “SPAIN: COTS Data-Center
  Ethernet for Multipathing over Arbitrary Topologies”, USENIX NSDI, April 2010.
• C. Raiciu, C. Pluntke, S. Barre, A. Greenhalgh, D. Wischik, and M. Handley, “Data center
  networking with multipath TCP”, HotNets, 2010.
Agenda
•    Multi-Root Clos / Fat-Tree Topology
•    Surveyed Solutions to Multipathing
•    802.1Qau – QCN
•    QCN and Reactive Reroute
•    Comparison of Multipathing Methods
    Data center transport mechanisms: Congestion control theory and IEEE
    standardization
    M. Alizadeh, B. Atikoglu, A. Kabbani, A. Lakshmikantha, R. Pan, B. Prabhakar, and M. Seaman,
    Communication, Control, and Computing, 2008 46th Annual Allerton Conference on

    AF-QCN: Approximate fairness with quantized congestion notification for
    multitenanted data centers
    A. Kabbani, M. Alizadeh, M. Yasuda, R. Pan, and B. Prabhakar,
    High Performance Interconnects (HOTI), 2010 IEEE 18th Annual Symposium on
Data Center Bridging Task Group
• Converged network
  – LAN: no priority control,
     addressed by 802.1Qbb: Priority-based Flow Control
  – FCoE (SAN): no congestion control,
     addressed by 802.1Qau: Quantized Congestion Notification
• Need to survey more on converged networks
  – Respective features and requirements
  – Could be a very important trend
QCN
• CP: Congestion Point
  – A switch monitors its queue: Q, Qeq, Qold
  – Samples packets and sends an Fb msg to the RP
  – Fb is a combination of queue excess and rate excess
  – Targets no PKT loss
• RP: Reaction Point
  – A host with a Rate Limiter (RL), Counter, and Timer
  – Probes for more BW, similar to AIMD
  – Decreases its rate according to the Fb msg
  – Counter and Timer both control the RL
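The CP and RP behaviour listed above can be sketched in a few lines. This is a simplified illustration of the commonly described QCN rule, Fb = -(Qoff + w * Qdelta) with a multiplicative decrease at the RP. The constants `W`, `GD`, and `FB_MAX` are assumed values, and the class and method names are hypothetical. Sampling, quantization details, and the active-increase phase are left out.

```python
from dataclasses import dataclass

W = 2.0          # weight on the queue-growth term (assumed value)
GD = 1.0 / 128   # decrease gain (assumed value)
FB_MAX = 63      # cap on |Fb|, assuming 6-bit quantization


def congestion_feedback(q: float, q_eq: float, q_old: float) -> float:
    """CP side: combine queue excess and rate excess into one number."""
    q_off = q - q_eq      # how far the queue is above the set point Qeq
    q_delta = q - q_old   # queue growth since the last sample (rate excess)
    return -(q_off + W * q_delta)


@dataclass
class ReactionPoint:
    current_rate: float  # rate enforced by the rate limiter
    target_rate: float   # rate the limiter recovers toward

    def on_feedback(self, fb: float) -> None:
        # Only negative Fb signals congestion; decrease multiplicatively.
        if fb >= 0:
            return
        magnitude = min(abs(fb), FB_MAX)
        self.target_rate = self.current_rate
        self.current_rate *= 1.0 - GD * magnitude

    def recover(self) -> None:
        # One recovery step, triggered by the byte counter or the timer:
        # move halfway back toward the rate held before the decrease.
        self.current_rate = (self.current_rate + self.target_rate) / 2.0


rp = ReactionPoint(current_rate=10.0, target_rate=10.0)
rp.on_feedback(congestion_feedback(q=30, q_eq=20, q_old=25))
print(rp.current_rate)
rp.recover()
print(rp.current_rate)
```

The decrease is driven by the switch's message, while the recovery is driven locally by the counter and timer, which is why both are said to control the RL.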
QCN
[Figure]
QCN
[Figure]
AF-QCN
[Figure]
Modify Fb Msg to Imply More
[Figure]
Agenda
•    Multi-Root Clos / Fat-Tree Topology
•    Surveyed Solutions to Multipathing
•    802.1Qau – QCN
•    QCN and Reactive Reroute
•    Comparison of Multipathing Methods
    Leveraging Performance of Multiroot Data Center Networks by Reactive Reroute
    Adrian S.-W. Tam, Kang Xi, H. Jonathan Chao
    Department of Electrical and Computer Engineering, Polytechnic Institute of New York University
Exploit Multipath Property
• Use QCN to further leverage path redundancy
  – Per-flow CN adjusts BW: Spectral
  – Relocates flows among paths: Spatial
  – Both mitigate congestion
• Multiroot, Clos / fat-tree topology
  – Upward: could be randomized or rerouted
  – Downward: destination based, deterministic
• Hashed ECMP: distributes the flow population
• Flow reroute: balances congested links
Reactive Reroute
• Edge switches count received QCNs per port
  – Only edge switches reroute; the authors consider this sufficient
  – Only for upward PKTs, not for downward
• Reroutes flows that are both elephant and congested,
  detected by counting QCNs in a short period
• Three reroute methods:
  – Uniform random
  – Min. prob. of congestion (conditional prob.)
  – Weighted combination of the above two
• Freezes a rerouted flow to avoid flapping
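The edge-switch logic above can be sketched roughly as follows, using the uniform-random reroute method only. This is a sketch under assumptions, not the authors' pseudocode: the class `EdgeSwitch`, the per-flow counting, and the values of `WINDOW`, `THRESHOLD`, and `FREEZE` are all hypothetical.

```python
import random
from collections import defaultdict, deque
from typing import Deque, Dict, List, Optional

WINDOW = 0.001    # seconds counted as the "short period" (hypothetical)
THRESHOLD = 3     # QCNs within WINDOW that mark a flow congested (hypothetical)
FREEZE = 0.010    # seconds a rerouted flow stays on its new uplink (hypothetical)


class EdgeSwitch:
    def __init__(self, uplinks: List[int], seed: int = 0) -> None:
        self.uplinks = uplinks
        self.flow_port: Dict[str, int] = {}
        self.qcn_times: Dict[str, Deque[float]] = defaultdict(deque)
        self.frozen_until: Dict[str, float] = {}
        self.rng = random.Random(seed)

    def on_qcn(self, flow: str, now: float) -> Optional[int]:
        """Record a QCN for an upward flow; return the new uplink if rerouted."""
        times = self.qcn_times[flow]
        times.append(now)
        # Keep only QCNs that arrived within the short period.
        while times and now - times[0] > WINDOW:
            times.popleft()
        if len(times) < THRESHOLD:
            return None  # too few QCNs: not treated as a congested elephant
        if now < self.frozen_until.get(flow, 0.0):
            return None  # recently rerouted: frozen to avoid flapping
        old = self.flow_port.setdefault(flow, self.uplinks[0])
        choices = [p for p in self.uplinks if p != old]
        if not choices:
            return None  # no alternative uplink exists
        new = self.rng.choice(choices)  # uniform-random reroute
        self.flow_port[flow] = new
        self.frozen_until[flow] = now + FREEZE
        times.clear()
        return new


sw = EdgeSwitch(uplinks=[0, 1, 2, 3])
sw.flow_port["flowA"] = 0
for t in (0.0000, 0.0002, 0.0004):
    print(t, sw.on_qcn("flowA", t))
```

The minimum-probability and weighted methods would replace only the `rng.choice` line with a choice informed by per-port QCN counts; the counting and freezing logic would stay the same.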
Algorithm Pseudo Code
[Figure: pseudocode. Annotation: only when within a short period]
NS-3 Simulation
[Figure]
• Simulation for 1 second
• Also a TCP simulation
Throughput and Latency
[Figure]
Outlier Latency
[Figure]
• Very large flows are throttled by L2 congestion
  control and thus see large latency
• 60% are within 1 ms, but on average it takes 15 ms!
Discussion
• Why is Min. reroute always worse?
  – Some flows’ paths overlap in the beginning
  – Edge switches have no global information
  – They receive QCNs from the same (port, agg),
    which leads to synchronized reroutes
• Operate a centralized controller instead?
  – Authors argue that the gain is very small
  – But they do not present more on the “outliers”
  – The flows with the longest latencies are the larger ones
  – The larger flows could be some vital connections
Discussion
• L2 congestion control protects TCP over UDP
• No PKT loss, almost no incast problem
• Out-of-order problem is more severe for UDP
• However, because the switch buffer is tightly
  monitored, the number of out-of-order PKTs is
  bounded by 5nr/s
  (n: buffer size, r: sending rate, s: link rate)
• Freezing a rerouted flow also limits reordering
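To get a feel for the reordering bound, the expression 5nr/s can be evaluated directly. The sketch below assumes that n is measured in packets and that r and s share the same unit. The function name and the example numbers are hypothetical and are not taken from the paper.

```python
def max_out_of_order_packets(n: float, r: float, s: float) -> float:
    """Evaluate the bound 5 * n * r / s quoted on the slide."""
    if s <= 0:
        raise ValueError("link rate must be positive")
    return 5.0 * n * r / s


# Illustrative numbers only: a 100-packet buffer, a flow sending at
# 1 Gb/s, and a 10 Gb/s link.
print(max_out_of_order_packets(n=100, r=1e9, s=10e9))
```

The bound scales with the ratio r/s, so a flow that QCN has already throttled to a small share of the link can contribute only a few out-of-order packets.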


Agenda
•    Multi-Root Clos / Fat-Tree Topology
•    Surveyed Solutions to Multipathing
•    802.1Qau – QCN
•    QCN and Reactive Reroute
•    Comparison of Multipathing Methods
    Comparative Evaluation of CEE-based Switch Adaptive Routing
    Daniel Crisan, Mitch Gusat, Cyriel Minkenberg,
    2nd Workshop on Data Center - Converged and Virtual Ethernet Switching (DC CAVES),
    2010




Multipathing Methods
• Deterministic, static, or preconfigured
  – Single fixed path
  – VLAN-based, multiple fixed paths, ST-per-VLAN
• Oblivious, randomized
  – Hashed by headers
  – Split to intermediaries
• Reactive, switch adaptive routing
• Controller-enabled centralized scheduling


Comparison
• Deterministic, static, or preconfigured
  – Simple, no re-ordering
• Oblivious, randomized, good when…
  – Single prio., symmetric traffic
• Reactive, switch adaptive routing, realistic…
  – Multiple prio., asymmetric
• Controller-enabled centralized scheduling
  – Large input set, higher complexity
  – Controller hard to implement, high cost low gain?
• Convergence and virtualization are trends
Discussion
• Data center traffic patterns are evolving and
  unknown a priori in many cases
• Justifies multiple routing / balancing schemes
  Currently no single killer solution
• Should be able to switch between modes
  Reactive-Adaptive and Randomized
• Role of controller still to be optimized
  – Could be useful for critical flows / situations
  – Detects and reacts in a slower manner
  – Not ideal for dynamic, fast reaction
Reference
•   Tzi-cker Chiueh, Cheng-Chun Tu, Yu-Cheng Wang, Pai-Wei Wang, Kai-Wen Li, Yu-Ming Huang,
    “Peregrine: An All-Layer-2 Container Computer Network”, IEEE Cloud 2012
•   M. Alizadeh, B. Atikoglu, A. Kabbani, A. Lakshmikantha, R. Pan, B. Prabhakar, and M. Seaman, “Data
    center transport mechanisms: Congestion control theory and IEEE standardization”,
    Communication, Control, and Computing, 2008 46th Annual Allerton Conference on
•   A. Kabbani, M. Alizadeh, M. Yasuda, R. Pan, and B. Prabhakar, “AF-QCN: Approximate fairness with
    quantized congestion notification for multitenanted data centers”, High Performance
    Interconnects (HOTI), 2010 IEEE 18th Annual Symposium on
•   Adrian S.-W. Tam, Kang Xi, H. Jonathan Chao, “Leveraging Performance of Multiroot Data Center
    Networks by Reactive Reroute”, 2010 18th IEEE Symposium on High Performance Interconnects
•   Daniel Crisan, Mitch Gusat, Cyriel Minkenberg, “Comparative Evaluation of CEE-based Switch
    Adaptive Routing”, 2nd Workshop on Data Center - Converged and Virtual Ethernet Switching (DC
    CAVES), 2010

Más contenido relacionado

La actualidad más candente

integrated and diffrentiated services
 integrated and diffrentiated services integrated and diffrentiated services
integrated and diffrentiated servicesRishabh Gupta
 
AusNOG 2019: TCP and BBR
AusNOG 2019: TCP and BBRAusNOG 2019: TCP and BBR
AusNOG 2019: TCP and BBRAPNIC
 
Introduction to Cloud Data Center and Network Issues
Introduction to Cloud Data Center and Network IssuesIntroduction to Cloud Data Center and Network Issues
Introduction to Cloud Data Center and Network IssuesJason TC HOU (侯宗成)
 
Traffic Characterization
Traffic CharacterizationTraffic Characterization
Traffic CharacterizationIsmail Mukiibi
 
Qos Quality of services
Qos   Quality of services Qos   Quality of services
Qos Quality of services HayderThary
 
Training Slides: Basics 102: Introduction to Tungsten Clustering
Training Slides: Basics 102: Introduction to Tungsten ClusteringTraining Slides: Basics 102: Introduction to Tungsten Clustering
Training Slides: Basics 102: Introduction to Tungsten ClusteringContinuent
 
Traffic Engineering in Software-Defined Networks
Traffic Engineering in Software-Defined NetworksTraffic Engineering in Software-Defined Networks
Traffic Engineering in Software-Defined NetworksHai Dinh Tuan
 
A Platform for Data Intensive Services Enabled by Next Generation Dynamic Opt...
A Platform for Data Intensive Services Enabled by Next Generation Dynamic Opt...A Platform for Data Intensive Services Enabled by Next Generation Dynamic Opt...
A Platform for Data Intensive Services Enabled by Next Generation Dynamic Opt...Tal Lavian Ph.D.
 
A load balancing model based on cloud partitioning for the public cloud. ppt
A  load balancing model based on cloud partitioning for the public cloud. ppt A  load balancing model based on cloud partitioning for the public cloud. ppt
A load balancing model based on cloud partitioning for the public cloud. ppt Lavanya Vigrahala
 

La actualidad más candente (20)

Qo s 09-integrated and red
Qo s 09-integrated and redQo s 09-integrated and red
Qo s 09-integrated and red
 
integrated and diffrentiated services
 integrated and diffrentiated services integrated and diffrentiated services
integrated and diffrentiated services
 
Lecture 7
 Lecture 7 Lecture 7
Lecture 7
 
Link_NwkingforDevOps
Link_NwkingforDevOpsLink_NwkingforDevOps
Link_NwkingforDevOps
 
AusNOG 2019: TCP and BBR
AusNOG 2019: TCP and BBRAusNOG 2019: TCP and BBR
AusNOG 2019: TCP and BBR
 
Introduction to Cloud Data Center and Network Issues
Introduction to Cloud Data Center and Network IssuesIntroduction to Cloud Data Center and Network Issues
Introduction to Cloud Data Center and Network Issues
 
Traffic Characterization
Traffic CharacterizationTraffic Characterization
Traffic Characterization
 
Quality of service
Quality of serviceQuality of service
Quality of service
 
C2C communication
C2C communicationC2C communication
C2C communication
 
Qos Quality of services
Qos   Quality of services Qos   Quality of services
Qos Quality of services
 
XenApp Load Balancing
XenApp Load BalancingXenApp Load Balancing
XenApp Load Balancing
 
Training Slides: Basics 102: Introduction to Tungsten Clustering
Training Slides: Basics 102: Introduction to Tungsten ClusteringTraining Slides: Basics 102: Introduction to Tungsten Clustering
Training Slides: Basics 102: Introduction to Tungsten Clustering
 
Traffic Engineering in Software-Defined Networks
Traffic Engineering in Software-Defined NetworksTraffic Engineering in Software-Defined Networks
Traffic Engineering in Software-Defined Networks
 
Quality of Service
Quality  of  ServiceQuality  of  Service
Quality of Service
 
Advanced networking - scheduling and QoS part 1
Advanced networking - scheduling and QoS part 1Advanced networking - scheduling and QoS part 1
Advanced networking - scheduling and QoS part 1
 
A Platform for Data Intensive Services Enabled by Next Generation Dynamic Opt...
A Platform for Data Intensive Services Enabled by Next Generation Dynamic Opt...A Platform for Data Intensive Services Enabled by Next Generation Dynamic Opt...
A Platform for Data Intensive Services Enabled by Next Generation Dynamic Opt...
 
A load balancing model based on cloud partitioning for the public cloud. ppt
A  load balancing model based on cloud partitioning for the public cloud. ppt A  load balancing model based on cloud partitioning for the public cloud. ppt
A load balancing model based on cloud partitioning for the public cloud. ppt
 
Quality of service
Quality of serviceQuality of service
Quality of service
 
Alternative metrics
Alternative metricsAlternative metrics
Alternative metrics
 
Load Balancing Server
Load Balancing ServerLoad Balancing Server
Load Balancing Server
 

Destacado

Data Center 3.0 Star Trek
Data Center 3.0 Star TrekData Center 3.0 Star Trek
Data Center 3.0 Star TrekBill Petro
 
Presentation11
Presentation11Presentation11
Presentation11KellyCheah
 
Wireless sensor open flow
Wireless sensor open flowWireless sensor open flow
Wireless sensor open flowKellyCheah
 
SDN & OPTICAL FLOW STEERING FOR NETWORK FUNCTION VIRTUALIZATION
SDN & OPTICAL FLOW STEERING FOR NETWORK FUNCTION VIRTUALIZATIONSDN & OPTICAL FLOW STEERING FOR NETWORK FUNCTION VIRTUALIZATION
SDN & OPTICAL FLOW STEERING FOR NETWORK FUNCTION VIRTUALIZATIONOpen Networking Summits
 
All Things Open SDN, NFV and Open Daylight
All Things Open SDN, NFV and Open Daylight All Things Open SDN, NFV and Open Daylight
All Things Open SDN, NFV and Open Daylight Mark Hinkle
 
ONS content extraction
ONS content extractionONS content extraction
ONS content extractionKellyCheah
 
presentationGAATT
presentationGAATTpresentationGAATT
presentationGAATTKellyCheah
 
OPNFV Webinar – No Time to Wait: Accelerating NFV Time to Market Through Open...
OPNFV Webinar – No Time to Wait: Accelerating NFV Time to Market Through Open...OPNFV Webinar – No Time to Wait: Accelerating NFV Time to Market Through Open...
OPNFV Webinar – No Time to Wait: Accelerating NFV Time to Market Through Open...Open Networking Summits
 
Implementing SDN Testbed(ONOS & OpenVirteX)
Implementing SDN Testbed(ONOS & OpenVirteX)Implementing SDN Testbed(ONOS & OpenVirteX)
Implementing SDN Testbed(ONOS & OpenVirteX)sangyun han
 
Spreading NFV through the Network: the ETSI NFV use cases
Spreading NFV through the Network: the ETSI NFV use casesSpreading NFV through the Network: the ETSI NFV use cases
Spreading NFV through the Network: the ETSI NFV use casesOpen Networking Summits
 
Operationalizing BGP in the SDDC
Operationalizing BGP in the SDDCOperationalizing BGP in the SDDC
Operationalizing BGP in the SDDCCumulus Networks
 
Deploying Hyperscale SDN and NFV in Next-Generation Data Centers
Deploying Hyperscale SDN and NFV in Next-Generation Data CentersDeploying Hyperscale SDN and NFV in Next-Generation Data Centers
Deploying Hyperscale SDN and NFV in Next-Generation Data CentersRadisys Corporation
 
Onos overview meetup sdn paris - redux
Onos overview  meetup sdn paris - reduxOnos overview  meetup sdn paris - redux
Onos overview meetup sdn paris - reduxSDN_Paris
 
Summit 16: Open-O Mini-Summit - Open Source Evolution for Carriers
Summit 16: Open-O Mini-Summit - Open Source Evolution for CarriersSummit 16: Open-O Mini-Summit - Open Source Evolution for Carriers
Summit 16: Open-O Mini-Summit - Open Source Evolution for CarriersOPNFV
 
SF Ceph Users Jan. 2014
SF Ceph Users Jan. 2014SF Ceph Users Jan. 2014
SF Ceph Users Jan. 2014Kyle Bader
 
Summit 16: Open-O Mini-Summit - OPNFV & Open-O
Summit 16: Open-O Mini-Summit - OPNFV & Open-OSummit 16: Open-O Mini-Summit - OPNFV & Open-O
Summit 16: Open-O Mini-Summit - OPNFV & Open-OOPNFV
 
ONOS build 2016 Sharing
ONOS build 2016 SharingONOS build 2016 Sharing
ONOS build 2016 SharingChun Ming Ou
 

Destacado (20)

Data Center 3.0 Star Trek
Data Center 3.0 Star TrekData Center 3.0 Star Trek
Data Center 3.0 Star Trek
 
User-Defined Network Cloud
User-Defined Network CloudUser-Defined Network Cloud
User-Defined Network Cloud
 
Presentation11
Presentation11Presentation11
Presentation11
 
Wireless sensor open flow
Wireless sensor open flowWireless sensor open flow
Wireless sensor open flow
 
SDN & OPTICAL FLOW STEERING FOR NETWORK FUNCTION VIRTUALIZATION
SDN & OPTICAL FLOW STEERING FOR NETWORK FUNCTION VIRTUALIZATIONSDN & OPTICAL FLOW STEERING FOR NETWORK FUNCTION VIRTUALIZATION
SDN & OPTICAL FLOW STEERING FOR NETWORK FUNCTION VIRTUALIZATION
 
All Things Open SDN, NFV and Open Daylight
All Things Open SDN, NFV and Open Daylight All Things Open SDN, NFV and Open Daylight
All Things Open SDN, NFV and Open Daylight
 
Presentation1
Presentation1Presentation1
Presentation1
 
App 的隱形殺手 - 留存率
App 的隱形殺手 - 留存率App 的隱形殺手 - 留存率
App 的隱形殺手 - 留存率
 
ONS content extraction
ONS content extractionONS content extraction
ONS content extraction
 
presentationGAATT
presentationGAATTpresentationGAATT
presentationGAATT
 
OPNFV Webinar – No Time to Wait: Accelerating NFV Time to Market Through Open...
OPNFV Webinar – No Time to Wait: Accelerating NFV Time to Market Through Open...OPNFV Webinar – No Time to Wait: Accelerating NFV Time to Market Through Open...
OPNFV Webinar – No Time to Wait: Accelerating NFV Time to Market Through Open...
 
Implementing SDN Testbed(ONOS & OpenVirteX)
Implementing SDN Testbed(ONOS & OpenVirteX)Implementing SDN Testbed(ONOS & OpenVirteX)
Implementing SDN Testbed(ONOS & OpenVirteX)
 
Spreading NFV through the Network: the ETSI NFV use cases
Spreading NFV through the Network: the ETSI NFV use casesSpreading NFV through the Network: the ETSI NFV use cases
Spreading NFV through the Network: the ETSI NFV use cases
 
Operationalizing BGP in the SDDC
Operationalizing BGP in the SDDCOperationalizing BGP in the SDDC
Operationalizing BGP in the SDDC
 
Deploying Hyperscale SDN and NFV in Next-Generation Data Centers
Deploying Hyperscale SDN and NFV in Next-Generation Data CentersDeploying Hyperscale SDN and NFV in Next-Generation Data Centers
Deploying Hyperscale SDN and NFV in Next-Generation Data Centers
 
Onos overview meetup sdn paris - redux
Onos overview  meetup sdn paris - reduxOnos overview  meetup sdn paris - redux
Onos overview meetup sdn paris - redux
 
Summit 16: Open-O Mini-Summit - Open Source Evolution for Carriers
Summit 16: Open-O Mini-Summit - Open Source Evolution for CarriersSummit 16: Open-O Mini-Summit - Open Source Evolution for Carriers
Summit 16: Open-O Mini-Summit - Open Source Evolution for Carriers
 
SF Ceph Users Jan. 2014
SF Ceph Users Jan. 2014SF Ceph Users Jan. 2014
SF Ceph Users Jan. 2014
 
Summit 16: Open-O Mini-Summit - OPNFV & Open-O
Summit 16: Open-O Mini-Summit - OPNFV & Open-OSummit 16: Open-O Mini-Summit - OPNFV & Open-O
Summit 16: Open-O Mini-Summit - OPNFV & Open-O
 
ONOS build 2016 Sharing
ONOS build 2016 SharingONOS build 2016 Sharing
ONOS build 2016 Sharing
 

Similar a Data Center Network Multipathing

Introduction to backwards learning algorithm
Introduction to backwards learning algorithmIntroduction to backwards learning algorithm
Introduction to backwards learning algorithmRoshan Karunarathna
 
Rapid Survey on Routing in Data Centers
Rapid Survey on Routing in Data CentersRapid Survey on Routing in Data Centers
Rapid Survey on Routing in Data CentersGhazal Tashakor
 
Energy Efficient Routing Approaches in Ad-hoc Networks
                Energy Efficient Routing Approaches in Ad-hoc Networks                Energy Efficient Routing Approaches in Ad-hoc Networks
Energy Efficient Routing Approaches in Ad-hoc NetworksKishan Patel
 
Network-aware Data Management for High Throughput Flows Akamai, Cambridge, ...
Network-aware Data Management for High Throughput Flows   Akamai, Cambridge, ...Network-aware Data Management for High Throughput Flows   Akamai, Cambridge, ...
Network-aware Data Management for High Throughput Flows Akamai, Cambridge, ...balmanme
 
Data center network architectures v1.3
Data center network architectures v1.3Data center network architectures v1.3
Data center network architectures v1.3Jeong, Wookjae
 
CS553_ST7_Ch15-LANOverview (1).ppt
CS553_ST7_Ch15-LANOverview (1).pptCS553_ST7_Ch15-LANOverview (1).ppt
CS553_ST7_Ch15-LANOverview (1).pptMekiPetitSeg
 
CS553_ST7_Ch15-LANOverview.ppt
CS553_ST7_Ch15-LANOverview.pptCS553_ST7_Ch15-LANOverview.ppt
CS553_ST7_Ch15-LANOverview.pptSmitNiks
 
CS553_ST7_Ch15-LANOverview.ppt
CS553_ST7_Ch15-LANOverview.pptCS553_ST7_Ch15-LANOverview.ppt
CS553_ST7_Ch15-LANOverview.pptssuser2cc0d4
 
Traffic Optimization in Multi-Layered WANs using SDN
Traffic Optimization in Multi-Layered WANs using SDN Traffic Optimization in Multi-Layered WANs using SDN
Traffic Optimization in Multi-Layered WANs using SDN Infinera
 
Cable Metro Packet Optical Transport
Cable Metro  Packet Optical TransportCable Metro  Packet Optical Transport
Cable Metro Packet Optical TransportJuniper Networks
 
A Scalable, Commodity Data Center Network Architecture
A Scalable, Commodity Data Center Network ArchitectureA Scalable, Commodity Data Center Network Architecture
A Scalable, Commodity Data Center Network ArchitectureGunawan Jusuf
 
24-ad-hoc.ppt
24-ad-hoc.ppt24-ad-hoc.ppt
24-ad-hoc.pptsumadi26
 
Network-aware Data Management for Large Scale Distributed Applications, IBM R...
Network-aware Data Management for Large Scale Distributed Applications, IBM R...Network-aware Data Management for Large Scale Distributed Applications, IBM R...
Network-aware Data Management for Large Scale Distributed Applications, IBM R...balmanme
 
Dcn invited ecoc2018_short
Dcn invited ecoc2018_shortDcn invited ecoc2018_short
Dcn invited ecoc2018_shortShuangyi Yan
 
S.t rajan cjb0912010 ft12
S.t rajan cjb0912010 ft12S.t rajan cjb0912010 ft12
S.t rajan cjb0912010 ft12RAJAN ST
 
Introduction to Data Center Network Architecture
Introduction to Data Center Network ArchitectureIntroduction to Data Center Network Architecture
Introduction to Data Center Network ArchitectureAnkita Mahajan
 
Dcnintroduction 141010054657-conversion-gate01
Dcnintroduction 141010054657-conversion-gate01Dcnintroduction 141010054657-conversion-gate01
Dcnintroduction 141010054657-conversion-gate01yibeltal yideg
 

Similar a Data Center Network Multipathing (20)

Introduction to backwards learning algorithm
Introduction to backwards learning algorithmIntroduction to backwards learning algorithm
Introduction to backwards learning algorithm
 
Rapid Survey on Routing in Data Centers
Rapid Survey on Routing in Data CentersRapid Survey on Routing in Data Centers
Rapid Survey on Routing in Data Centers
 
Energy Efficient Routing Approaches in Ad-hoc Networks
                Energy Efficient Routing Approaches in Ad-hoc Networks                Energy Efficient Routing Approaches in Ad-hoc Networks
Energy Efficient Routing Approaches in Ad-hoc Networks
 
Network-aware Data Management for High Throughput Flows Akamai, Cambridge, ...
Network-aware Data Management for High Throughput Flows   Akamai, Cambridge, ...Network-aware Data Management for High Throughput Flows   Akamai, Cambridge, ...
Network-aware Data Management for High Throughput Flows Akamai, Cambridge, ...
 
Data center network architectures v1.3
Data center network architectures v1.3Data center network architectures v1.3
Data center network architectures v1.3
 
Advanced networking scheduling and QoS part 2
Advanced networking scheduling and QoS part 2Advanced networking scheduling and QoS part 2
Advanced networking scheduling and QoS part 2
 
CS553_ST7_Ch15-LANOverview (1).ppt
CS553_ST7_Ch15-LANOverview (1).pptCS553_ST7_Ch15-LANOverview (1).ppt
CS553_ST7_Ch15-LANOverview (1).ppt
 
CS553_ST7_Ch15-LANOverview.ppt
CS553_ST7_Ch15-LANOverview.pptCS553_ST7_Ch15-LANOverview.ppt
CS553_ST7_Ch15-LANOverview.ppt
 
CS553_ST7_Ch15-LANOverview.ppt
CS553_ST7_Ch15-LANOverview.pptCS553_ST7_Ch15-LANOverview.ppt
CS553_ST7_Ch15-LANOverview.ppt
 
Traffic Optimization in Multi-Layered WANs using SDN
Traffic Optimization in Multi-Layered WANs using SDN Traffic Optimization in Multi-Layered WANs using SDN
Traffic Optimization in Multi-Layered WANs using SDN
 
Cable Metro Packet Optical Transport
Cable Metro  Packet Optical TransportCable Metro  Packet Optical Transport
Cable Metro Packet Optical Transport
 
A Scalable, Commodity Data Center Network Architecture
A Scalable, Commodity Data Center Network ArchitectureA Scalable, Commodity Data Center Network Architecture
A Scalable, Commodity Data Center Network Architecture
 
24-ad-hoc.ppt
24-ad-hoc.ppt24-ad-hoc.ppt
24-ad-hoc.ppt
 
Network-aware Data Management for Large Scale Distributed Applications, IBM R...
Network-aware Data Management for Large Scale Distributed Applications, IBM R...Network-aware Data Management for Large Scale Distributed Applications, IBM R...
Network-aware Data Management for Large Scale Distributed Applications, IBM R...
 
Lan overview
Lan overviewLan overview
Lan overview
 
Dcn invited ecoc2018_short
Dcn invited ecoc2018_shortDcn invited ecoc2018_short
Dcn invited ecoc2018_short
 
S.t rajan cjb0912010 ft12
S.t rajan cjb0912010 ft12S.t rajan cjb0912010 ft12
S.t rajan cjb0912010 ft12
 
Thesis-Final-slide
Thesis-Final-slideThesis-Final-slide
Thesis-Final-slide
 
Introduction to Data Center Network Architecture
Introduction to Data Center Network ArchitectureIntroduction to Data Center Network Architecture
Introduction to Data Center Network Architecture
 
Dcnintroduction 141010054657-conversion-gate01
Dcnintroduction 141010054657-conversion-gate01Dcnintroduction 141010054657-conversion-gate01
Dcnintroduction 141010054657-conversion-gate01
 

Data Center Network Multipathing

  • 1. Data Center Network Multipathing Peregrine: An All-Layer-2 Container Computer Network Tzi-cker Chiueh*§, Cheng-Chun Tu*§, Yu-Cheng Wang§, Pai-Wei Wang§, Kai-Wen Li§, Yu-Ming Huang§ *Industrial Technology Research Institute, Taiwan §Computer Science Department, Stony Brook University IEEE Cloud 2012 Leveraging Performance of Multiroot Data Center Networks by Reactive Reroute Adrian S.-W. Tam, Kang Xi, H. Jonathan Chao Department of Electrical and Computer Engineering, Polytechnic Institute of New York University 2010 18th IEEE Symposium on High Performance Interconnects Presenter: Jason, Tsung-Cheng, HOU Advisor: Wanjiun Liao May 17th, 2012
  • 2. Motivation • Summarize features of the popular multi-root Clos / fat-tree data center topology, taking ITRI’s prototype as an example • Survey solutions to multipathing • Recap Jin-Jia Chang’s presentation on QCN • Present another solution to multipathing • Compare several multipathing methods
  • 3. Agenda • Multi-Root Clos / Fat-Tree Topology • Surveyed Solutions to Multipathing • 802.1Qau – QCN • QCN and Reactive Reroute • Comparison of Multipathing Methods Peregrine: An All-Layer-2 Container Computer Network Tzi-cker Chiueh*§, Cheng-Chun Tu*§, Yu-Cheng Wang§, Pai-Wei Wang§, Kai-Wen Li§, Yu-Ming Huang§ *Industrial Technology Research Institute, Taiwan §Computer Science Department, Stony Brook University IEEE Cloud 2012
  • 4. Multi-Root Clos / Fat-Tree • Adopted by various publications – VL2, PortLand, BCube, ElasticTree, Peregrine • Scale-out, cheap commodity switches • Every path crosses a fixed maximum number of switches / hops – If no bouncing, no routing loop • Nearly full bisection bandwidth, multipathing, symmetric • Possibly a tremendous number of routing table entries • Up and down paths are handled differently • High rate but limited capability (buffer, CPU, ...)
  • 5. High rate but limited capability • All-L2 Ethernet switches • Up to 1 GE or 10 GE links, dozens of ports • Limited buffer, hundreds of KB • Limited CPU ability, a processing bottleneck • Limited flow table entries, at most tens of thousands • Optimized for fast table lookups • Take Peregrine as an example – ITRI’s industrial, commodity production prototype – Others are mostly experimental or high-end
  • 6. Topology: Folded Clos across containers (figure labels: a rack; 12 racks; a container)
  • 8. Within One Container (figure annotation: 5-to-5 per rack, but only 4 ports)
  • 9. DS and RAS • Directory Server (DS) – Address association, management, and reuse – Performs IP-MAC lookups and mappings – Updates mappings to end hosts • Route Algorithm Server (RAS) – Collects entries of the traffic matrix (TM) – Runs load-balancing algorithms based on the TM – Distributes routing entries to switches, updates the DS • Described within one container; cross-container operation unclear • Scalability unclear, VM mobility unclear (the paper only refers to something like Mobile IP)
  • 10. Routing, Balancing, and Tolerance
  • 14. ITRI Container Computer Prototype • 6.096 m shipping container • 12 server racks, 12 storage racks • All-L2 network, commodity switches • “Folded” Clos topology • Directory Server, Route Algorithm Server • Unclear: load-balancing algorithm, VM mobility, DS-RAS scalability, cross-container operation • In the future: OpenFlow, OpenStack (currently not using OpenFlow to connect switches; how this works is unclear)
  • 15. Discussions • Spanning tree for multipathing and load-balancing: simple but limited flexibility • How to plug and play? Scalable? – A new switch leads to reconfiguration – Does VM migration affect the TM and direct routes? • DS-RAS: a simple version of a controller, but mechanism and performance are unclear • Seems to be trying to combine various advantages: address mapping, ST multipathing, converged network, folded Clos
  • 16. Agenda • Multi-Root Clos / Fat-Tree Topology • Surveyed Solutions to Multipathing • 802.1Qau – QCN • QCN and Reactive Reroute • Comparison of Multipathing Methods
  • 17. Multipathing • VLB (Valiant Load Balancing): – Traffic is split across intermediate points – Automatically balances load – Ideally great, but subject to packet reordering • ECMP hashing: – Different hashing functions make a big difference – A flow always sticks to one path during transmission • Hedera: – Flow-to-core mapping, flow scheduling – Requires global information, higher complexity
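The "flow always sticks to one path" property of ECMP hashing can be illustrated with a minimal sketch. The function name `ecmp_uplink`, the CRC32 hash, and the sample addresses are assumptions made for illustration; real switches use vendor-specific hash functions over header fields.

```python
import zlib


def ecmp_uplink(src_ip, dst_ip, src_port, dst_port, proto, num_uplinks):
    """Pick an uplink index by hashing the flow's 5-tuple (illustrative only)."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    # CRC32 stands in for whatever hash a real switch implements.
    return zlib.crc32(key) % num_uplinks


# Hypothetical flow: every packet carries the same 5-tuple,
# so every packet is hashed to the same uplink.
flow = ("10.0.1.2", "10.0.3.4", 40000, 80, 6)
first = ecmp_uplink(*flow, num_uplinks=4)
same_path = all(ecmp_uplink(*flow, num_uplinks=4) == first for _ in range(100))
```

Because the choice depends only on header fields, packets of one flow are never reordered, but two large flows that hash to the same uplink stay collided for their whole lifetime.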
  • 18. Multipathing • Spanning Tree / VLAN (SPAIN): – Near-static, pre-computation required, but simple – Re-computes when the topology changes – Segmentation of resources, limited flexibility • Multipath TCP: – One flow, many parallel paths – VLAN-based routing in the publication (like SPAIN) – Shifts traffic to less congested paths – A new transport mechanism, adaptive – Still with segmentation of resources
  • 19. Multipathing References • M. Kodialam, T. V. Lakshman, S. Sengupta, “Efficient and Robust Routing of Highly Variable Traffic”, HotNets, 2004. • R. Zhang-Shen and N. McKeown, “Designing a Predictable Internet Backbone Network”, Third Workshop on Hot Topics in Networks (HotNets-III), November 2004. • A. Greenberg et al., “VL2: A Scalable and Flexible Data Center Network”, ACM SIGCOMM 2009. • R. N. Mysore, A. Pamboris, N. Farrington, N. Huang, P. Miri, S. Radhakrishnan, V. Subramanya, and A. Vahdat, “PortLand: A Scalable, Fault-Tolerant Layer 2 Data Center Network Fabric”, in Proceedings of ACM SIGCOMM, 2009. • M. Al-Fares et al., “Hedera: Dynamic Flow Scheduling for Data Center Networks”, USENIX NSDI 2010. • J. Mudigonda, P. Yalagandula, M. Al-Fares, and J. C. Mogul, “SPAIN: COTS Data-Center Ethernet for Multipathing over Arbitrary Topologies”, in USENIX NSDI, April 2010. • C. Raiciu, C. Pluntke, S. Barre, A. Greenhalgh, D. Wischik, and M. Handley, “Data center networking with multipath TCP”, in HotNets, 2010.
  • 20. Agenda • Multi-Root Clos / Fat-Tree Topology • Surveyed Solutions to Multipathing • 802.1Qau – QCN • QCN and Reactive Reroute • Comparison of Multipathing Methods Data center transport mechanisms: Congestion control theory and IEEE standardization, M. Alizadeh, B. Atikoglu, A. Kabbani, A. Lakshmikantha, R. Pan, B. Prabhakar, and M. Seaman, Communication, Control, and Computing, 2008 46th Annual Allerton Conference on. AF-QCN: Approximate fairness with quantized congestion notification for multitenanted data centers, A. Kabbani, M. Alizadeh, M. Yasuda, R. Pan, and B. Prabhakar, in High Performance Interconnects (HOTI), 2010, IEEE 18th Annual Symposium on
  • 21. Data Center Bridging Task Group • Converged network – LAN: no priority control, addressed by 802.1Qbb: Priority-based Flow Control – FCoE (SAN): no congestion control, addressed by 802.1Qau: Quantized Congestion Notification • Need to survey more on converged networks – Respective features and requirements – Could be a very important trend
  • 22. QCN • CP: Congestion Point – A switch monitors its queue: Q, Qeq, Qold – Samples packets and sends Fb messages to the RP – Fb is a combination of queue excess and rate excess – Targets no packet loss • RP: Reaction Point – A host with a Rate Limiter (RL), Counter, and Timer – Probes for more bandwidth, like AIMD – Decreases its rate according to the Fb message – The Counter and Timer both control the RL
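The CP and RP roles above can be sketched as follows, using the feedback form from the QCN literature, Fb = -(Qoff + w·Qdelta) with Qoff = Q - Qeq and Qdelta = Q - Qold, and a multiplicative decrease at the RP. The constant values, the function names, and the simple clamp standing in for Fb quantization are assumptions for illustration, not values taken from the slides.

```python
W = 2.0         # weight on the queue-growth term (assumed value)
Q_EQ = 26       # target (equilibrium) queue length, arbitrary units (assumed)
FB_MAX = 63     # largest feedback magnitude after clamping (assumed)
GD = 1.0 / 128  # decrease gain, chosen so GD * FB_MAX stays just under 1/2


def congestion_point_fb(q, q_old, q_eq=Q_EQ, w=W):
    """Feedback computed by the switch (CP) when it samples a packet."""
    q_off = q - q_eq      # queue excess over the target
    q_delta = q - q_old   # queue growth since the last sample (rate excess)
    return -(q_off + w * q_delta)


def reaction_point_decrease(current_rate, fb, gd=GD, fb_max=FB_MAX):
    """Rate applied by the host's rate limiter (RP) on receiving Fb."""
    if fb >= 0:
        return current_rate  # no congestion: the CP would send nothing
    magnitude = min(abs(fb), fb_max)
    return current_rate * (1 - gd * magnitude)


fb = congestion_point_fb(q=40, q_old=30)           # -(14 + 2 * 10) = -34.0
new_rate = reaction_point_decrease(10.0, fb)       # 10 * (1 - 34/128) = 7.34375
calm_fb = congestion_point_fb(q=20, q_old=26)      # positive: queue below target and shrinking
```

The rate-increase side (the Counter and Timer driving recovery) is left out of this sketch.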
  • 23. QCN
  • 24. QCN
  • 25. AF-QCN
  • 26. Modify Fb Msg to Imply More
  • 27. Agenda • Multi-Root Clos / Fat-Tree Topology • Surveyed Solutions to Multipathing • 802.1Qau – QCN • QCN and Reactive Reroute • Comparison of Multipathing Methods Leveraging Performance of Multiroot Data Center Networks by Reactive Reroute, Adrian S.-W. Tam, Kang Xi, H. Jonathan Chao, Department of Electrical and Computer Engineering, Polytechnic Institute of New York University
  • 28. Exploit Multipath Property • Use QCN to further leverage redundancy – Per-flow congestion notification adjusts bandwidth: spectral – Relocating flows among paths: spatial – Both mitigate congestion • Multiroot, Clos / fat-tree topology – Upward: could be randomized or rerouted – Downward: destination-based, deterministic • Hashed ECMP: distributes the flow population • Flow reroute: balances congested links
  • 29. Reactive Reroute • Edge switches count received QCN messages per port – Only edge switches reroute, which is considered sufficient – Only for upward packets, not for downward • Reroutes flows that are both elephant and congested, detected by counting QCNs in a short period • Three reroute methods: – Uniform random – Minimum probability of congestion (conditional probability) – Weighted combination of the above two • Freezes a rerouted flow to avoid flapping
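A minimal sketch of the edge-switch behaviour described above, covering only the uniform-random reroute method. The class name, the threshold, window, and freeze values, and the per-flow bookkeeping are all hypothetical; the paper's actual pseudo code is on the next slide and is not reproduced here.

```python
import random


class EdgeSwitchRerouter:
    """Counts QCN messages per flow and reroutes upward flows (sketch)."""

    def __init__(self, num_uplinks, threshold=3, window=0.001, freeze=0.01, rng=None):
        self.num_uplinks = num_uplinks  # must be at least 2 to have an alternative
        self.threshold = threshold      # QCNs within the window that trigger a reroute
        self.window = window            # "short period", in seconds (assumed)
        self.freeze = freeze            # hold time after a reroute, in seconds (assumed)
        self.rng = rng or random.Random()
        self.qcn_times = {}             # flow -> recent QCN arrival times
        self.uplink = {}                # flow -> current uplink index
        self.frozen_until = {}          # flow -> time before which it may not move

    def assign(self, flow, uplink):
        self.uplink[flow] = uplink

    def on_qcn(self, flow, now):
        """Record one QCN for a flow and return the uplink it should use."""
        recent = [t for t in self.qcn_times.get(flow, []) if now - t <= self.window]
        recent.append(now)
        self.qcn_times[flow] = recent
        if len(recent) < self.threshold:
            return self.uplink[flow]    # not congested enough yet
        if now < self.frozen_until.get(flow, 0.0):
            return self.uplink[flow]    # recently rerouted: frozen to avoid flapping
        others = [u for u in range(self.num_uplinks) if u != self.uplink[flow]]
        self.uplink[flow] = self.rng.choice(others)  # uniform random method
        self.frozen_until[flow] = now + self.freeze
        self.qcn_times[flow] = []
        return self.uplink[flow]


sw = EdgeSwitchRerouter(num_uplinks=4, rng=random.Random(0))
sw.assign("flow-a", 0)
path = [sw.on_qcn("flow-a", t) for t in (0.0, 0.0002, 0.0004)]
```

Here a flow that keeps attracting QCNs within the window is treated as both large and congested, which mirrors the detection rule on the slide; the minimum-probability and weighted methods would replace the `rng.choice` line.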
  • 30. Algorithm Pseudo Code (figure annotation: only when within a short period)
  • 31. NS-3 Simulation • Simulation for 1 second • Also a TCP simulation
  • 33. Outlier Latency • Very large flows are throttled by L2 congestion control and thus see large latency • 60% finish within 1 ms, but on average it takes 15 ms!
  • 34. Discussion • Why is the minimum-probability reroute always worse? – Some flows’ paths overlap in the beginning – Edge switches have no global information – They receive QCNs from the same (port, agg): synchronized reroute • Operate a centralized controller? – The authors argue that the gain is very small – But they do not say more about the “outliers” – The flows with the longest latencies are the larger ones – The larger flows could be vital connections
  • 35. Discussion • L2 congestion control protects TCP over UDP • No packet loss, almost no incast problem • The out-of-order problem is more severe for UDP • However, because the switch buffer is tightly monitored, the number of out-of-order packets is bounded by 5nr/s (n: buffer size, r: sending rate, s: link rate) • Freezing a rerouted flow also limits reordering
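As a worked illustration of the 5nr/s bound, take purely hypothetical values that do not come from the paper: n = 100 KB of buffer, a sending rate r = 1 Gb/s, and a link rate s = 10 Gb/s. The result is expressed in the same unit as n.

```latex
\[
\frac{5\,n\,r}{s}
  = \frac{5 \times 100\ \text{KB} \times 1\ \text{Gb/s}}{10\ \text{Gb/s}}
  = 50\ \text{KB}
\]
```

Under these assumed values, at most about 50 KB of a flow's traffic could arrive out of order, and the bound shrinks as the flow's rate r falls relative to the link rate s.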
  • 36. Agenda • Multi-Root Clos / Fat-Tree Topology • Surveyed Solutions to Multipathing • 802.1Qau – QCN • QCN and Reactive Reroute • Comparison of Multipathing Methods Comparative Evaluation of CEE-based Switch Adaptive Routing, Daniel Crisan, Mitch Gusat, Cyriel Minkenberg, 2nd Workshop on Data Center - Converged and Virtual Ethernet Switching (DC CAVES), 2010
  • 37. Multipathing Methods • Deterministic, static, or preconfigured – Single fixed path – VLAN-based, multiple fixed paths, ST-per-VLAN • Oblivious, randomized – Hashed by headers – Split to intermediaries • Reactive, switch adaptive routing • Controller-enabled centralized scheduling
  • 38. Comparison • Deterministic, static, or preconfigured – Simple, no reordering • Oblivious, randomized: good with… – Single priority, symmetric traffic • Reactive, switch adaptive routing: realistic with… – Multiple priorities, asymmetric traffic • Controller-enabled centralized scheduling – Large input set, higher complexity – Controller hard to implement; high cost, low gain? • Convergence and virtualization are trends
  • 39. Discussion • Data center traffic patterns are evolving and in many cases unknown a priori • This justifies multiple routing / balancing schemes; currently there is no single killer solution • Should be able to switch between modes: reactive-adaptive and randomized • The role of the controller is still to be optimized – Could be useful for critical flows / situations – Detects and reacts in a slower manner – Not ideal for dynamic, fast reaction
  • 40. Reference • Tzi-cker Chiueh, Cheng-Chun Tu, Yu-Cheng Wang, Pai-Wei Wang, Kai-Wen Li, Yu-Ming Huang, “Peregrine: An All-Layer-2 Container Computer Network”, IEEE Cloud 2012 • M. Alizadeh, B. Atikoglu, A. Kabbani, A. Lakshmikantha, R. Pan, B. Prabhakar, and M. Seaman, “Data center transport mechanisms: Congestion control theory and IEEE standardization”, Communication, Control, and Computing, 2008 46th Annual Allerton Conference on • A. Kabbani, M. Alizadeh, M. Yasuda, R. Pan, and B. Prabhakar, “AF-QCN: Approximate fairness with quantized congestion notification for multitenanted data centers”, in High Performance Interconnects (HOTI), 2010, IEEE 18th Annual Symposium on • Adrian S.-W. Tam, Kang Xi, H. Jonathan Chao, “Leveraging Performance of Multiroot Data Center Networks by Reactive Reroute”, 2010 18th IEEE Symposium on High Performance Interconnects • Daniel Crisan, Mitch Gusat, Cyriel Minkenberg, “Comparative Evaluation of CEE-based Switch Adaptive Routing”, 2nd Workshop on Data Center - Converged and Virtual Ethernet Switching (DC CAVES), 2010