7. FabricPath Design Goals
FabricPath is positioned between Switching and Routing. The slide contrasts the two:
Switching
• Minimal Configuration
• Plug & Play
• Auto Discovery
• Auto Learning
• Flat Addressing
• Spanning Tree Protocol (STP)
• Slow Convergence
• Single Path
• Edge-to-Root Rigid Design
• Single Multicast Tree
• Constrained Scalability
Routing
• Configuration Intense
• Plan & Play
• Configured Discovery
• Configured Learning
• Fast Convergence
• Multiple Paths
• Load Balancing
• Multiple Multicast Trees
• Hierarchical Forwarding
• Any-to-any Flexible Design
• Highly Scalable
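8. FabricPath Encapsulation Format
The original Classical Ethernet (CE) frame is prefixed with a 16-byte MAC-in-MAC header, and a new CRC is appended.
Classical Ethernet frame: DMAC | SMAC | 802.1Q | Etype | Payload | CRC
Cisco FabricPath frame: Outer DA (48 bits) | Outer SA (48 bits) | FP Tag (32 bits) | DMAC | SMAC | 802.1Q | Etype | Payload | CRC (new)
Outer DA / Outer SA (48 bits each):
  Endnode ID (5:0)    6 bits
  U/L                 1 bit
  I/G                 1 bit
  Endnode ID (7:6)    2 bits
  RSVD                1 bit
  OOO/DL              1 bit
  Switch ID           12 bits
  Sub-Switch ID       8 bits
  Port ID             16 bits
FP Tag (32 bits):
  Etype               16 bits
  Ftag                10 bits
  TTL                 6 bits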
Switch ID – Unique number identifying each FabricPath switch
Sub-Switch ID – Identifies devices/hosts connected via VPC+
Port ID – Identifies the destination or source interface
Ftag (Forwarding tag) – Unique number identifying topology and/or multidestination distribution tree
TTL – Decremented at each switch hop to prevent frames looping infinitely
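The field widths above can be checked by packing them into bytes. The following is a minimal sketch, not a reference encoder: it assumes the fields are laid out most-significant-bit first in the order the slide lists them, and the names `OuterAddress`, `pack_fp_tag` and `switch_id_of` are hypothetical helpers introduced only for illustration. The Etype value is left as a parameter because the slide does not state it.

```python
from dataclasses import dataclass


def _pack_fields(fields, total_bits):
    """Concatenate (value, width) pairs MSB-first into a big-endian byte string."""
    value = 0
    used = 0
    for field_value, width in fields:
        if not 0 <= field_value < (1 << width):
            raise ValueError(f"value {field_value} does not fit in {width} bits")
        value = (value << width) | field_value
        used += width
    if used != total_bits:
        raise ValueError("field widths do not add up to the header size")
    return value.to_bytes(total_bits // 8, "big")


@dataclass
class OuterAddress:
    """48-bit FabricPath outer DA/SA (assumed bit order: as listed on the slide)."""
    switch_id: int            # 12 bits
    sub_switch_id: int = 0    # 8 bits
    port_id: int = 0          # 16 bits
    endnode_id: int = 0       # 8 bits, split into (5:0) and (7:6)
    ul: int = 0               # 1 bit
    ig: int = 0               # 1 bit
    ooo_dl: int = 0           # 1 bit

    def pack(self) -> bytes:
        if not 0 <= self.endnode_id < 256:
            raise ValueError("endnode_id must fit in 8 bits")
        return _pack_fields([
            (self.endnode_id & 0x3F, 6),   # Endnode ID (5:0)
            (self.ul, 1),                  # U/L
            (self.ig, 1),                  # I/G
            (self.endnode_id >> 6, 2),     # Endnode ID (7:6)
            (0, 1),                        # RSVD
            (self.ooo_dl, 1),              # OOO/DL
            (self.switch_id, 12),
            (self.sub_switch_id, 8),
            (self.port_id, 16),
        ], 48)


def pack_fp_tag(etype: int, ftag: int, ttl: int) -> bytes:
    """32-bit FP Tag: Etype (16 bits), Ftag (10 bits), TTL (6 bits)."""
    return _pack_fields([(etype, 16), (ftag, 10), (ttl, 6)], 32)


def switch_id_of(outer_address: bytes) -> int:
    """Extract the 12-bit Switch ID that sits above Sub-Switch ID and Port ID."""
    return (int.from_bytes(outer_address, "big") >> 24) & 0xFFF
```

Two packed addresses plus one FP Tag come to 6 + 6 + 4 = 16 bytes, which matches the 16-byte header size given on the slide.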
12. FabricPath MAC Forwarding Table
Edge switches maintain both MAC address table and Switch ID table
Ingress switch uses MAC table to determine destination Switch ID
Egress switch uses MAC table to determine output switchport
Example topology: spine switches S10, S20, S30, S40; edge switches S100, S101, S200. MAC A and MAC B attach to S100, MAC C to S101, MAC D to S200.
FabricPath MAC Table on S100
MAC   IF/SID
A     e1/1
B     e1/2
C     S101
D     S200
Local MACs (A, B) point to switchports.
Remote MACs (C, D) point to Switch IDs.
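The ingress lookup described above can be sketched in a few lines. This is an illustrative model only: the table contents come from the slide, while the `("local" | "remote", value)` tuple representation and the `lookup` function are assumptions made for the example.

```python
# MAC table on S100 from the slide: local entries hold a switchport,
# remote entries hold the Switch ID of the edge switch behind which the MAC sits.
MAC_TABLE_S100 = {
    "A": ("local", "e1/1"),
    "B": ("local", "e1/2"),
    "C": ("remote", "S101"),
    "D": ("remote", "S200"),
}


def lookup(mac_table, dmac):
    """Return ("local", port), ("remote", switch_id) or ("unknown", None)."""
    return mac_table.get(dmac, ("unknown", None))


def ingress_decision(mac_table, dmac):
    """Describe what an ingress edge switch does with a frame for dmac."""
    kind, value = lookup(mac_table, dmac)
    if kind == "local":
        return f"switch locally out of {value}"
    if kind == "remote":
        return f"encapsulate with destination Switch ID {value}"
    return "flood on a multidestination tree"
```

For example, `ingress_decision(MAC_TABLE_S100, "D")` reports encapsulation toward S200, after which the Switch ID table (not the MAC table) decides which fabric link to use.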
13. FabricPath Routing Table
FabricPath IS-IS manages the Switch ID (routing) table
All FabricPath-enabled switches are automatically assigned a Switch ID (no user configuration required)
The algorithm computes the shortest (best) paths to each Switch ID based on link metrics
Equal-cost paths are supported between FabricPath switches
Example topology: spine switches S10, S20, S30, S40; edge switches S100, S101, S200. S100 connects to S10, S20, S30 and S40 over links L1, L2, L3 and L4.
FabricPath Routing Table on S100
Switch   IF
S10      L1
S20      L2
S30      L3
S40      L4
S101     L1, L2, L3, L4
…        …
S200     L1, L2, L3, L4
One 'best' path to S10 (via L1).
Four equal-cost paths to S101.
14. Building FabricPath Routing Table Entries
Topology: S100 connects to S10, S20, S30, S40 over L1-L4; S101 over L5-L8; S200 over L9-L12. MAC A and MAC B attach to S100, MAC C to S101, MAC D to S200.
Routing Table on S10
Switch   IF
S20      L1,L5,L9
S30      L1,L5,L9
S40      L1,L5,L9
S100     L1
S101     L5
…        …
S200     L9
Routing Table on S40
Switch   IF
S10      L4,L8,L12
S20      L4,L8,L12
S30      L4,L8,L12
S100     L4
S101     L8
…        …
S200     L12
Routing Table on S100
Switch   IF
S10      L1
S20      L2
S30      L3
S40      L4
S101     L1, L2, L3, L4
…        …
S200     L1, L2, L3, L4
Routing Table on S200
Switch   IF
S10      L9
S20      L10
S30      L11
S40      L12
S100     L9, L10, L11, L12
S101     L9, L10, L11, L12
…        …
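The tables above can be reproduced with a small shortest-path computation. This sketch is a simplification: it assumes every link has the same metric and uses breadth-first search, whereas FabricPath IS-IS runs a link-state SPF over configured link metrics. The link numbering (L1-L4 from S100, L5-L8 from S101, L9-L12 from S200) is taken from the slide; the function names are hypothetical.

```python
from collections import deque

EDGES = ["S100", "S101", "S200"]
SPINES = ["S10", "S20", "S30", "S40"]

# L1..L4: S100 to S10..S40, L5..L8: S101 to S10..S40, L9..L12: S200 to S10..S40.
LINKS = {
    f"L{4 * i + j + 1}": (edge, spine)
    for i, edge in enumerate(EDGES)
    for j, spine in enumerate(SPINES)
}


def neighbors(node):
    """Yield (link, peer) for every link attached to node, in link-number order."""
    for link, (a, b) in LINKS.items():
        if a == node:
            yield link, b
        elif b == node:
            yield link, a


def distances(src):
    """Hop count from src to every switch (unit link metric assumed)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for _, v in neighbors(u):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def routing_table(switch):
    """Map each remote Switch ID to the list of equal-cost outgoing links."""
    table = {}
    for dest in SPINES + EDGES:
        if dest == switch:
            continue
        dist_to_dest = distances(dest)
        best = dist_to_dest[switch]
        # A link is on a shortest path if its peer is exactly one hop closer.
        table[dest] = [
            link for link, peer in neighbors(switch)
            if dist_to_dest[peer] + 1 == best
        ]
    return table
```

Calling `routing_table("S100")` gives a single link for each spine and all four uplinks for S101 and S200, and `routing_table("S10")` gives L1, L5 and L9 toward the other spines, in line with the slide.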
15. Putting It All Together – Host A to Host B
(1) Broadcast ARP Request
Topology: spines S10, S20, S30, S40; edges S100, S101, S200. S100 uplinks are L1-L4, S101 uplinks L5-L8, S200 uplinks L9-L12. Host A (MAC A) attaches to S100, Host B (MAC B) to S200. S10 is the root for Tree 1; S40 is the root for Tree 2.
Frame from Host A: DMAC→FF, SMAC→A, Payload
Frame inside the fabric: DSID→FF, Ftag→1, SSID→100, DMAC→FF, SMAC→A, Payload
Frame delivered to Host B: DMAC→FF, SMAC→A, Payload
Multidestination Trees on Switch 100 (Broadcast → Tree 1)
Tree   IF
1      L1,L2,L3,L4
2      L4
Multidestination Trees on Switch 10 (Ftag → 1)
Tree   IF
1      L1,L5,L9
2      L9
Multidestination Trees on Switch 200 (Ftag → 1)
Tree   IF
1      L9
2      L9,L10,L11,L12
FabricPath MAC Table on S100
MAC   IF/SID
A     e1/1 (local)
FabricPath MAC Table on S200
MAC   IF/SID
(empty)
S100: learn MACs of directly-connected devices unconditionally.
S200: don't learn MACs in flood frames.
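The broadcast step can be modelled as forwarding along one multidestination tree. In this sketch the tree membership is derived from the per-switch tree tables on the slide (Tree 1 uses L1, L2, L3, L4, L5, L9; Tree 2 uses L4, L8, L9, L10, L11, L12); the membership of links on switches whose tables the slide does not show is an inference, and `tree_interfaces` and `flood` are hypothetical names.

```python
from collections import deque

EDGES = ["S100", "S101", "S200"]
SPINES = ["S10", "S20", "S30", "S40"]
LINKS = {
    f"L{4 * i + j + 1}": (edge, spine)
    for i, edge in enumerate(EDGES)
    for j, spine in enumerate(SPINES)
}

# Links belonging to each multidestination tree, keyed by Ftag.
TREES = {
    1: ["L1", "L2", "L3", "L4", "L5", "L9"],      # rooted at S10
    2: ["L4", "L8", "L9", "L10", "L11", "L12"],   # rooted at S40
}


def tree_interfaces(switch, ftag):
    """Tree links attached to a switch, i.e. one row of its tree table."""
    links = [l for l in TREES[ftag] if switch in LINKS[l]]
    return sorted(links, key=lambda l: int(l[1:]))


def flood(ingress, ftag):
    """Return the switches a multidestination frame reaches, in arrival order."""
    reached = [ingress]
    queue = deque([(ingress, None)])
    while queue:
        switch, in_link = queue.popleft()
        for link in tree_interfaces(switch, ftag):
            if link == in_link:
                continue  # never send a frame back out the link it came in on
            a, b = LINKS[link]
            peer = b if a == switch else a
            reached.append(peer)
            queue.append((peer, link))
    return reached
```

Because each tree is loop-free, a frame injected at S100 with Ftag 1 reaches every other switch exactly once; S200 receives it over L9 from the Tree 1 root S10.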
16. Putting It All Together – Host A to Host B
(2) Unicast ARP Reply
Frame from Host B: DMAC→A, SMAC→B, Payload
S200 has no MAC table entry for A, so it sends the reply as unknown unicast on Tree 1.
Frame inside the fabric: DSID→MC1, Ftag→1, SSID→200, DMAC→A, SMAC→B, Payload
Frame delivered to Host A: DMAC→A, SMAC→B, Payload
Multidestination Trees on Switch 200 (Unknown → Tree 1)
Tree   IF
1      L9
2      L9,L10,L11,L12
Multidestination Trees on Switch 10 (Ftag → 1)
Tree   IF
1      L1,L5,L9
2      L9
Multidestination Trees on Switch 100 (Ftag → 1)
Tree   IF
1      L1,L2,L3,L4
2      L4
FabricPath MAC Table on S200 (lookup for A misses)
MAC   IF/SID
B     e12/2 (local)
FabricPath MAC Table on S100 (lookup for A hits)
MAC   IF/SID
A     e1/1 (local)
B     S200 (remote)
If DMAC is known, then learn remote MAC: S100 knows A as a local MAC, so it learns B as remote behind S200.
17. Putting It All Together – Host A to Host B
(3) Unicast Data
Frame from Host A: DMAC→B, SMAC→A, Payload
Frame inside the fabric: DSID→200, Ftag→1, SSID→100, DMAC→B, SMAC→A, Payload
Frame delivered to Host B: DMAC→B, SMAC→A, Payload
FabricPath MAC Table on S100 (lookup for B hits)
MAC   IF/SID
A     e1/1 (local)
B     S200 (remote)
FabricPath Routing Table on S100 (a hash selects one of the equal-cost links)
Switch   IF
S10      L1
S20      L2
S30      L3
S40      L4
S101     L1, L2, L3, L4
…        …
S200     L1, L2, L3, L4
FabricPath Routing Table on S30
Switch   IF
…        …
S200     L11
At the egress switch, the routing entry for S200 has no outgoing interface (shown as "–"): the frame has reached its destination Switch ID.
FabricPath MAC Table on S200 (lookup for B hits)
MAC   IF/SID
A     S100 (remote)
B     e12/2 (local)
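The "Hash" step on S100 picks one of the four equal-cost links for a given flow. The sketch below is only a model of that idea: the real hash inputs and algorithm on the switch are not stated in the slides, so CRC32 over the source and destination MAC is an arbitrary stand-in, and `pick_link` and `forward_to_s200` are hypothetical names. The spine-to-S200 links (L9-L12) are taken from the routing tables in the slides.

```python
import zlib

# Equal-cost uplinks from S100 toward S200 and the spine each one reaches.
S100_LINKS_TO_S200 = ["L1", "L2", "L3", "L4"]
UPLINK_TO_SPINE = {"L1": "S10", "L2": "S20", "L3": "S30", "L4": "S40"}

# Each spine has exactly one link toward S200.
SPINE_LINK_TO_S200 = {"S10": "L9", "S20": "L10", "S30": "L11", "S40": "L12"}


def pick_link(links, smac, dmac):
    """Deterministically map a flow to one equal-cost link (stand-in hash)."""
    key = f"{smac}->{dmac}".encode()
    return links[zlib.crc32(key) % len(links)]


def forward_to_s200(smac, dmac):
    """Return the (switch, outgoing link) hops from S100 to S200 for one flow."""
    first_link = pick_link(S100_LINKS_TO_S200, smac, dmac)
    spine = UPLINK_TO_SPINE[first_link]
    return [("S100", first_link), (spine, SPINE_LINK_TO_S200[spine])]
```

All frames of one flow take the same path, which preserves ordering, while different flows spread across the four spines. In the slide's example the chosen path runs through S30 (L3, then L11).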
18. Conversational MAC Learning
Topology: edge switches S100, S200 and S300 connected through the FabricPath core. MAC A attaches to S100, MAC B to S200, MAC C to S300.
FabricPath MAC Table on S100
MAC   IF/SID
A     e1/1 (local)
B     S200 (remote)
FabricPath MAC Table on S200
MAC   IF/SID
A     S100 (remote)
B     e12/1 (local)
C     S300 (remote)
FabricPath MAC Table on S300
MAC   IF/SID
B     S200 (remote)
C     e7/10 (local)
19. Conversational MAC Learning
Optimized resource utilization – learning only the MAC addresses required
STP Domain
• ALL MACs need to be learned on EVERY switch
• Large L2 domains and virtualization present challenges to MAC table scalability
L2 Fabric
• Local MAC: source-MAC learning happens only for traffic received on CE ports
• Remote MAC: the source MAC of traffic received on FabricPath ports is learned only if the destination MAC is already known as local
Figure: hosts A, B and C; switches are labelled with either 500 MACs or 250 MACs. Sample edge MAC tables (MAC / IF): B → 2/1 (with a further entry pointing to S11); C → 3/1 and A → S11.
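The two learning rules can be expressed directly in code. This is a minimal model under stated assumptions: the `EdgeSwitch` class and its method names are hypothetical, and the replay at the end follows the Host A to Host B walkthrough from the earlier slides.

```python
class EdgeSwitch:
    """Toy FabricPath edge switch that applies conversational MAC learning."""

    def __init__(self, switch_id):
        self.switch_id = switch_id
        self.mac_table = {}  # MAC -> ("local", port) or ("remote", switch_id)

    def receive_on_ce_port(self, smac, port):
        # Local MAC: always learn the source of traffic arriving on a CE port.
        self.mac_table[smac] = ("local", port)

    def receive_on_fabricpath_port(self, smac, dmac, source_switch_id):
        # Remote MAC: learn the source only if the destination is a known local MAC.
        entry = self.mac_table.get(dmac)
        if entry is not None and entry[0] == "local":
            self.mac_table[smac] = ("remote", source_switch_id)


s100 = EdgeSwitch("S100")
s200 = EdgeSwitch("S200")

# (1) Broadcast ARP request from A: S100 learns A, S200 learns nothing.
s100.receive_on_ce_port("A", "e1/1")
s200.receive_on_fabricpath_port("A", "FF", "S100")

# (2) Unicast ARP reply from B: S200 learns B locally, S100 learns B as remote.
s200.receive_on_ce_port("B", "e12/2")
s100.receive_on_fabricpath_port("B", "A", "S200")

# (3) Unicast data from A to B: S200 now learns A as remote.
s200.receive_on_fabricpath_port("A", "B", "S100")
```

A third edge switch that receives the flooded frames but hosts neither A nor B would end the exchange with an empty table, which is where the MAC table savings come from.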
20. Architectural Approach for MSDC
CLOS
• Same node type used in all roles (Spine and Edge)
• Fine Grain Redundancy
• Additional density provided through density of node or additional layers
Scale-Up Spine, Scale-Out Leaf
• High density spine node
• Smaller fixed leaf
Lean Core, Smart Edge
• Layer-1.5 Spine (Dumb Core)
• Intelligent Edge
• Fewer control planes than pure Clos
21. Building a General-Purpose Network Switching Platform with FabricPath
POD 1: VLANs 100-199
POD 2: VLANs 200-299
POD 3: VLANs 300-399
PODS 1-3: VLANs 100-399
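22. A General-Purpose Network Switching Platform for Large-Scale Data Centers
-- Network support for flexible service deployment
Modular: easy to expand
Consistent network bandwidth and latency: independent of where the server is located
Rapid service deployment: compute resources can be moved and allocated flexibly
Any service on any server, at any time!!!
Scalability: service/cluster growth is no longer constrained by the network
Server utilization: servers can be reused
Manageability: plug and play, minimal configuration, little manual intervention
Reliability: impact of a single point of failure on the overall service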
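23. From "Routing" Back to "Switching"
-- Switching networks for small and medium-sized data centers
Parent switch (Nexus 7000/5000; Nexus 5000 in the figure) + Nexus 2000 Fabric Extender = Virtualized chassis
• Turn your network into a Switch
• Key technology: remote extension modules, FEX as TOR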
24. FEX Terminology
FEX can be connected to a parent switch in three ways:
• Single attached, without any vPC running on the parent switch
• Single attached, with vPC running on the parent switch
• Dual attached in vPC mode
Figure labels: Parent switch (vPC Primary / vPC Secondary in the two vPC cases), Fabric Links, NIFs (network interfaces, the FEX uplinks), HIFs (host interfaces, the FEX host-facing ports), vPC 1, vPC 2.
25. FEX Inner Functioning
Inband Management Model (software image, configuration)
• The fabric extender is discovered by the switch using an L2 Satellite Discovery Protocol (SDP) that runs on the uplink port of the fabric extender
• The core switch checks the software image
• The core switch pushes programming data to the fabric extender
Figure labels: N5k01 (switch), ports 1,2,3,4, ports 1-48 GigE.