Cisco Userspace NIC (usNIC)

Jeff Squyres
Cisco Systems, Inc.
November 7, 2013

© 2013 Cisco and/or its affiliates. All rights reserved.

Cisco Public

Yes, we sell servers now

Yes, really!
•  Record-setting Intel Ivy Bridge servers: Cisco UCS 1U and 2U servers
•  Ultra low latency Ethernet: Cisco 2 x 10Gb VIC
•  40Gb top-of-rack and core switches: Cisco 10/40Gb Nexus switches

Industry-leading compute without compromise
Cisco UCS: Many Server Form Factors, One System

Rack
•  UCS C220 M3: Ideal for HPC compute-intensive applications (2 socket)
•  UCS C240 M3: Perfect as HPC cluster head nodes or IO nodes (2 socket)
•  UCS C420 M3: 4-socket rack server for large-memory compute workloads

Blade
•  UCS B200 M3: Blade form factor, 2-socket
•  UCS B420 M3: 4-socket blade for large-memory compute workloads

Columns on the slide: HPC performance (2-socket models); 4 socket + giant memory (C420 M3, B420 M3)

Worldwide x86 Server Blade Market Share
Demand for Data Center Innovation Has Vaulted Cisco Unified Computing System (UCS) to the #2 Leader in the Fast-Growing Segment of the x86 Server Market

•  UCS impacting growth of established vendors like HP
•  Legacy offerings flat-lining or in decline
•  Cisco growth out-pacing the market
•  UCS #2 and climbing

Market Appetite for Innovation Fuels UCS Growth
Customers have shifted 19.3% of the global x86 blade server market to Cisco and over 26% in the Americas.

Source: IDC Worldwide Quarterly Server Tracker, Q1 2013 Revenue Share, May 2013

World records
•  Best CPU Performance
•  Best Virtualization & Cloud Performance
•  Best Database Performance
•  Best Enterprise Application Performance
•  Best Enterprise Middleware Performance
•  Best HPC Performance
World-record counts shown on the slide: 16, 8, 9, 18, 14, 15 (one per category; the pairing of count to category is not recoverable from this text)

One wire to rule them all:
•  Commodity traffic (e.g., ssh)
•  Cluster / hardware management
•  File system / IO traffic
•  MPI traffic
10G or 40G with real QoS

Low latency, high density 10 / 40Gb switches
Cisco Nexus: Years of experience rolled into dependable solutions

Low latency: Nexus 3548
•  190ns port-to-port latency (L2 and L3)
•  Created for HPC / HFT
•  48 10Gb / 12 40Gb ports

High density: Nexus 6004
•  1us port-to-port latency
•  384 10Gb / 96 40Gb ports

Spine / Leaf

Characteristics
•  3 Hops
•  Low Oversubscription – Non-Blocking
•  < ~3.5 usecs depending on config and workload
•  10G or 40G Capable
•  Spine: 4 to 16 Wide
•  Leaf: Determined by Spine Density

Fabric         Spine - Leaf   Port Scale          Latency                   Spines   Leafs
10G Fabric     6004 - 6001    18,432 x 10G 3:1    ~ 3 usecs Cut-through     16       384
40G Fabric     6004 - 6004    7,680 x 40G 5:1     ~ 3 usecs Cut-through     16       96
Mixed Fabric   6004 - 6001    4,680 x 10G 3:1     ~ 3 usecs S&F             4        96
10G Fabric     6004 - 3548    12,288 x 10G 3:1    ~ 1.5 usecs Cut-through   16       384
40G Fabric     6004 - 3548    1,152 x 40G 1:1     ~ 1.5 usecs Cut-through   6        96
Mixed Fabric   6004 - 3548    3,072 x 10G 3:1     ~ 1.5 usecs S&F           4        96

…many other configurations are also possible

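The port-scale and oversubscription figures for a spine-leaf fabric follow from simple arithmetic. Here is a minimal sketch in C. The per-leaf port splits are assumptions that are not stated on the slide: 48 host-facing 10G ports plus 4 x 40G uplinks on a Nexus 6001 leaf, and an 80 down / 16 up split of the 96 ports on a Nexus 6004 leaf in the 40G fabric. The function names are illustrative only.

```c
#include <assert.h>

/* Total host-facing ports in a spine-leaf fabric. */
static long fabric_port_scale(long leafs, long host_ports_per_leaf)
{
    assert(leafs > 0 && host_ports_per_leaf > 0);
    return leafs * host_ports_per_leaf;
}

/* Oversubscription ratio: host-facing bandwidth divided by uplink
 * bandwidth on one leaf (both in Gbps). */
static double oversubscription(long host_ports, long host_gbps,
                               long uplink_ports, long uplink_gbps)
{
    assert(uplink_ports > 0 && uplink_gbps > 0);
    return (double)(host_ports * host_gbps) /
           (double)(uplink_ports * uplink_gbps);
}
```

Under those assumptions, 384 leafs x 48 ports reproduces the 18,432 x 10G entry at 3:1, and 96 leafs x 80 ports reproduces the 7,680 x 40G entry at 5:1.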
Spine2 / Spine1 / Leaf

Characteristics
•  3 Hops Pod – 5 hops DC east-west traffic
•  Low Oversubscription – Non-Blocking
•  < ~3.5 usecs depending on config and workload
•  10G or 40G Capable
•  Two spine layers

Fabric         Spine2-Spine1-Leaf    Port Scale          Latency                       Spine2   Spine1   Leafs
10G Fabric     6004 - 6004 - 6001    55,296 x 10G 3:1    ~ 3-5 usecs Cut-through       48       16 x 6   192
40G Fabric     6004 - 6004 - 6004    23,040 x 40G 5:1    ~ 3-5 usecs Cut-through       48       16       48
Mixed Fabric   6004 - 6004 - 6001    18,432 x 10G 3:1    ~ 3-5 usecs S&F               32       4 x 8    48
10G Fabric     6004 - 6004 - 3548    24,576 x 10G 2:1    ~ 1.5-3.5 usecs Cut-through   32       16 x 4   192
40G Fabric     6004 - 6004 - 3548    2,304 x 40G 1:1     ~ 1.5-3.5 usecs Cut-through   24       6 x 8    48
Mixed Fabric   6004 - 6004 - 3548    9,216 x 10G 2:1     ~ 1.5-3.5 usecs S&F           24       6 x 8    48

•  Direct access to NIC hardware from Linux userspace
   •  Operating System bypass
   •  Via the Linux Verbs API (UD)
•  Utilizes Cisco Virtual Interface Card (VIC) for ultra-low Ethernet latency
   •  2nd generation 80Gbps Cisco ASIC
   •  2 x 10Gbps Ethernet ports
   •  2 x 40Gbps coming …soon…
   •  PCI and mezzanine form factors
•  Half-round trip (HRT) ping-pong latencies (Intel E5-2690 v2 servers):
   •  Raw back to back: 1.57μs
   •  MPI back to back: 1.85μs
   •  Through MPI+N3548: 2.05μs
   These numbers keep going down

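A half-round-trip figure is derived from a timed ping-pong loop: the elapsed time is divided by the number of round trips and then halved. A minimal sketch in C follows. The function name and the microsecond unit are choices made for this illustration, and the timing loop itself is not shown.

```c
#include <assert.h>

/* Half-round-trip latency in microseconds for a ping-pong loop that
 * completed `iterations` full round trips in `elapsed_us`. */
static double half_round_trip_us(double elapsed_us, long iterations)
{
    assert(iterations > 0);
    return elapsed_us / (2.0 * (double)iterations);
}
```

For example, 1000 round trips completing in 3700μs correspond to an HRT latency of 1.85μs.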
TCP/IP
•  Userspace: Application -> Userspace sockets library
•  Kernel: TCP stack -> General Ethernet driver -> Cisco VIC driver
•  Cisco VIC hardware

usNIC
•  Userspace: Application -> Userspace verbs library
•  Kernel (bootstrapping and setup): Verbs IB core -> Cisco USNIC driver
•  Send and receive fast path: Userspace verbs library -> Cisco VIC hardware

MPI -> Userspace verbs library -> Cisco VIC hardware
•  MPI directly injects L2 frames to the network
•  MPI receives L2 frames directly from the VIC

Diagram labels: MPI process (x2); x86 Chipset VT-d (I/O MMU); VIC (SR-IOV NIC) with queue pairs (QP) and a classifier; inbound L2 frames; outbound L2 frames.

Diagram labels: one VIC with two Physical Functions (PF).
•  PF with MAC address aa:bb:cc:dd:ee:ff: several VFs, each holding QPs; attached to one physical port
•  PF with MAC address aa:bb:cc:dd:ee:fe: several VFs, each holding QPs; attached to the other physical port

Diagram labels: two MPI processes; Intel IO MMU; VIC with two PFs (each with a MAC address), each PF with several VFs, QPs on a VF, and one physical port.

Intel IO MMU
•  Used for physical <-> virtual memory translation
•  usnic verbs driver programs (and deprograms) the IOMMU
Diagram labels: Userspace process (virtual); VIC (virtual); Intel IO MMU (virtual <-> physical); RAM (physical).

•  For the purposes of this talk, let’s assume that each physical port has one Linux ethX device
•  Each ethX device corresponds to a PF
•  Each usnic_Y device corresponds to an ethX device

VIC
•  Physical port 0 (fiber): eth4 / usnic_0
•  Physical port 1 (fiber): eth5 / usnic_1

Figure (hwloc lstopo output; indexes: physical; Thu Nov 7 10:58:23 2013): Machine (128GB), Intel Xeon E5-2690 (“Sandy Bridge”), 2 sockets, 8 cores, 64GB per socket.
•  NUMANode P#0 (64GB): Socket P#0, L3 (20MB), Cores P#0-7, each with L2 (256KB), L1d (32KB), L1i (32KB); PUs P#0-7 and P#16-23
   •  PCI 8086:1521: eth0, eth1, eth2, eth3
   •  PCI 1137:0043 (VIC ports): eth4 / usnic_0, eth5 / usnic_1
   •  PCI 102b:0522
•  NUMANode P#1 (64GB): Socket P#1, L3 (20MB), Cores P#0-7, each with L2 (256KB), L1d (32KB), L1i (32KB); PUs P#8-15 and P#24-31
   •  PCI 1000:005b: sda
   •  PCI 1137:0043 (VIC ports): eth6 / usnic_2, eth7 / usnic_3

Software stack, top to bottom:
•  Application
•  Open MPI layer (OMPI)
•  Point-to-point messaging layer (PML)
•  Byte Transfer Layer (BTL)
•  Operating System
•  Hardware

MPI_Send / MPI_Recv (etc.) -> OB1 PML -> usnic BTL modules
•  VIC 0: usnic BTL on /dev/usnic_0, usnic BTL on /dev/usnic_1
•  VIC 1: usnic BTL on /dev/usnic_2, usnic BTL on /dev/usnic_3

•  Byte Transfer Layer
   •  Point-to-point transfer plugins in OMPI layer
   •  No protocol is assumed / required
•  “usnic” BTL (diagram example: usnic BTL on /dev/usnic_2)
   •  Uses unreliable datagram (UD) verbs
   •  Handles all fragmentation and re-assembly (vs. PML)
   •  Retransmissions and ACKs handled in software
   •  Sliding window retransmission scheme
   •  Direct inject / direct receive of L2 Ethernet frames

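Because UD verbs give no delivery guarantee, the sender has to track which frames are still unacknowledged. The slides do not give the usnic BTL's actual data structures, so the following is only a generic sketch of sender-side sliding-window bookkeeping with cumulative ACKs. Every name here (`send_window`, `window_send`, `window_ack`) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct send_window {
    uint32_t base;      /* oldest sequence number not yet ACKed */
    uint32_t next_seq;  /* sequence number for the next new frame */
    uint32_t size;      /* maximum frames in flight */
};

/* True while fewer than `size` frames are in flight. */
static bool window_can_send(const struct send_window *w)
{
    return w->next_seq - w->base < w->size;
}

/* Claim a sequence number for a new frame; the caller must check
 * window_can_send() first. */
static uint32_t window_send(struct send_window *w)
{
    assert(window_can_send(w));
    return w->next_seq++;
}

/* Cumulative ACK: everything up to and including `acked` arrived.
 * Stale or out-of-range ACKs are ignored. Returns frames released. */
static uint32_t window_ack(struct send_window *w, uint32_t acked)
{
    uint32_t in_flight = w->next_seq - w->base;
    uint32_t released = acked - w->base + 1;  /* unsigned wrap is intended */
    if (released == 0 || released > in_flight)
        return 0;
    w->base += released;
    return released;
}
```

Frames between `base` and `next_seq` are the candidates for retransmission when a timeout fires. The timeout policy is left out because the slides do not describe it.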
•  One BTL module for each usNIC verbs device
•  Each module has two UD queue pairs
   •  Priority queue for small and control packets
   •  Data queue for up to MTU-sized data packets
   •  Each QP has its own CQ
   •  QPs may or may not be on same VF
•  Overall BTL glue polls CQs for each device
   •  First, priority CQs
   •  Then data CQs
Diagram labels: Priority QP -> CQ; Data QP -> CQ.

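The polling order can be sketched as one progress pass over all modules. This is an illustration, not the Open MPI source. It assumes the pass visits every priority CQ before any data CQ, which is one reading of the bullets above. The `cq` type is a stand-in with a toy poll callback, where a real implementation would call `ibv_poll_cq()`.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a completion queue: poll() returns how many
 * completions it drained. */
struct cq {
    int (*poll)(struct cq *self);
    int pending;              /* used only by toy_poll() below */
};

struct usnic_module {         /* hypothetical: one per usNIC verbs device */
    struct cq priority_cq;
    struct cq data_cq;
};

/* Toy poll function: drain whatever is pending. */
static int toy_poll(struct cq *self)
{
    int n = self->pending;
    self->pending = 0;
    return n;
}

/* One progress pass: all priority CQs first, then all data CQs.
 * Returns the total number of completions handled. */
static int progress(struct usnic_module *mods, size_t nmods)
{
    int done = 0;
    assert(mods != NULL || nmods == 0);
    for (size_t i = 0; i < nmods; i++)
        done += mods[i].priority_cq.poll(&mods[i].priority_cq);
    for (size_t i = 0; i < nmods; i++)
        done += mods[i].data_cq.poll(&mods[i].data_cq);
    return done;
}
```

Draining the priority CQs first keeps small and control packets from waiting behind bulk data completions.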
•  “Raw” latency (no MPI, no verbs) is 1.57μs
•  MPI latency back-to-back on Sandy Bridge is 1.85μs
   •  Verbs is responsible for about 80ns of the difference (not related to MPI API)
   •  All the rest of OMPI is only about 200ns
Chart labels: Raw: 1.57μs; Verbs: 80ns; MPI: 200ns

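The three components add up to the quoted MPI back-to-back figure. A minimal check in C, using nanoseconds to avoid floating-point rounding (the function name is illustrative):

```c
#include <assert.h>

/* Sum latency components, all given in nanoseconds. */
static long total_latency_ns(long raw_ns, long verbs_ns, long mpi_ns)
{
    assert(raw_ns >= 0 && verbs_ns >= 0 && mpi_ns >= 0);
    return raw_ns + verbs_ns + mpi_ns;
}
```

With the slide's numbers, `total_latency_ns(1570, 80, 200)` gives 1850ns, that is, 1.85μs.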
•  Deferred and piggy-backed ACKs
Diagram (Process A sends to Process B; time runs downward):
•  Immediate: Msg, answered by ACK N
•  Deferred: Msg, Msg, Msg, answered by a single ACK N+2
•  Deferred + piggybacked: Msg, Msg, Msg, answered by Msg+ACK N+2

•  Host writes WQ structure
   •  Writes index to VIC via PIO
   •  VIC reads WQ descriptor
   •  VIC reads buffer from RAM
   •  VIC sends buffer from RAM
Diagram (Host / VIC timeline): Host writes WQ index; VIC reads WQ descriptor (VIC now has buffer address); VIC reads packet from the buffer to send; VIC sends on wire.

•  Host writes WQ structure
   •  Writes index + encoded buffer address to VIC via PIO
   •  VIC reads WQ descriptor
   •  VIC reads buffer from RAM
   •  VIC sends buffer from RAM
Diagram (Host / VIC timeline): Host writes WQ index+addr; VIC reads WQ descriptor; VIC reads packet from the buffer to send; VIC sends on wire. Send ~400ns sooner.

•  Minimize length of priority receive queue
   •  Using 2048 different receive buffers is 200ns worse than using 64
   •  Result of IOMMU cache effect
•  We scale length of priority RQ with number of processes in job
Diagram labels: Userspace process (virtual); VIC (virtual); Intel IO MMU (virtual -> physical). Annotations: "Use this much" instead of "this much".

•  Use fastpaths wherever possible
   •  Be friendly to the optimizer and instruction cache
   •  Made a noticeable difference (!)

    if (fastpathable)
        do_it_inline();
    else
        call_slower_path();

Figure (three builds of the same lstopo topology as the earlier figure: Machine (128GB), NUMANode P#0 with eth4 / usnic_0 and eth5 / usnic_1, NUMANode P#1 with eth6 / usnic_2 and eth7 / usnic_3; indexes: physical; Thu Nov 7 10:58:23 2013). Annotations:
•  Build 1: MPI processes running on these cores…
•  Build 2: MPI processes running on these cores… Only use these usNIC devices for short messages
•  Build 3: MPI processes running on these cores… Use ALL usNIC devices for long messages

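The short-versus-long message policy can be sketched as a device filter. This is an illustration under two assumptions that the slide text does not state explicitly: "these usNIC devices" means the devices on the same NUMA node as the sending process, and a single byte threshold (`short_limit`, a made-up parameter) separates short from long messages. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct usnic_dev {
    const char *name;
    int numa_node;
};

/* Fill `out` with the devices eligible for a message of `len` bytes
 * sent by a process bound to NUMA node `proc_numa`. Short messages
 * stay on NUMA-local devices; long messages may use every device.
 * `out` must have room for `ndevs` entries. Returns the count chosen. */
static size_t pick_devices(const struct usnic_dev *devs, size_t ndevs,
                           int proc_numa, size_t len, size_t short_limit,
                           const struct usnic_dev **out)
{
    size_t n = 0;
    assert(devs != NULL && out != NULL);
    for (size_t i = 0; i < ndevs; i++) {
        if (len > short_limit || devs[i].numa_node == proc_numa)
            out[n++] = &devs[i];
    }
    return n;
}
```

With the four devices from the topology figure, a 64-byte message from a process on NUMA node 0 would be limited to usnic_0 and usnic_1, while a 1 MB message could be striped across all four.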
•  Everything above the firmware is open source
•  Open MPI
   •  Distributing Cisco Open MPI 1.6.5
   •  Upstream in Open MPI 1.7.3
•  Libibverbs plugin
•  Verbs kernel module

Hardware
•  Cisco UCS C220 M3 Rack Server
   •  Intel E5-2690 Processor 2.9 GHz (3.3 GHz Turbo), 2 Socket, 8 Cores/Socket
   •  1600 MHz DDR3 Memory, 8 GB x 16, 128 GB installed
   •  Cisco VIC 1225 with Ultra Low Latency Networking usNIC Driver
•  Cisco Nexus 3548
   •  48 Port 10 Gbps Ultra Low Latency Ethernet Networking Switch

Software
•  OS: CentOS 6.4, Kernel: 2.6.32-358.el6.x86_64 (SMP)
•  NetPIPE (ver 3.7.1)
•  Intel MPI Benchmarks (ver 3.2.4)
•  High Performance Linpack (ver 2.1)
•  Other: Intel C Compiler (ver 13.0.1), Open MPI (ver 1.6.5), Cisco usNIC (1.0.0.7x)

Figure: Cisco usNIC Latency and Cisco usNIC Throughput. X-axis: Message Size (bytes), 1 to 8388611. Y-axes: Latency (usecs), Throughput (Mbps).
•  2.05 usecs latency for small messages
•  9.3 Gbps Throughput

PingPing and PingPong Latency track together!
•  2.05 usecs PingPong Latency
•  2.10 usecs PingPing Latency
Figure: X-axis: Message Size (bytes), 4 to 4194304. Y-axes: Latency (usecs), Throughput (MB/s). Series: PingPong Throughput (MB/s), PingPing Throughput (MB/s), PingPong Latency (usecs), PingPing Latency (usecs).

Full Bi-directional Performance for both Exchange and SendRecv
•  2.11 usecs SendRecv Latency
•  2.58 usecs Exchange Latency
Figure: X-axis: Message Size (bytes), 4 to 4194304. Y-axes: Latency (usecs), Throughput (MB/s). Series: SendRecv Throughput (MB/s), Exchange Throughput (MB/s), SendRecv Latency (usecs), Exchange Latency (usecs).

 
GFLOPS = FLOPS/Cycle x Num CPU Cores x Freq (GHz)
E5-2690 Max GFLOPS = 8 x 16 x 3.3 = 422 GFLOPS

Single Node HPL Score (16 cores): 340.51 GFLOPS*
32 Node HPL Score (512 cores): 9,773.45 GFLOPS

Efficiency based on Single Machine Score:
(9,773.45)/(340.51 x 32) x 100 = 89.69%

# of CPU Cores   16       32       64        128       256       512
GFlops           340.51   673.68   1271.14   2647.09   5258.27   9773.45

* Score may improve with additional compiler settings or newer compiler versions

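The peak and efficiency figures can be reproduced with a few lines of C. The function names are illustrative; the inputs are the numbers quoted on the slide.

```c
#include <assert.h>

/* Theoretical peak in GFLOPS: FLOPS/cycle x cores x frequency (GHz). */
static double peak_gflops(double flops_per_cycle, int cores, double ghz)
{
    assert(cores > 0);
    return flops_per_cycle * (double)cores * ghz;
}

/* Scaling efficiency in percent, relative to the single-node score
 * multiplied by the number of nodes. */
static double scaling_efficiency_pct(double cluster_gflops,
                                     double single_node_gflops, int nodes)
{
    assert(nodes > 0 && single_node_gflops > 0.0);
    return cluster_gflops / (single_node_gflops * (double)nodes) * 100.0;
}
```

`peak_gflops(8.0, 16, 3.3)` evaluates to about 422.4, and `scaling_efficiency_pct(9773.45, 340.51, 32)` evaluates to just under 89.695, consistent with the 89.69% on the slide.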
Thank you.

Más contenido relacionado

La actualidad más candente

DPDKによる高速コンテナネットワーキング
DPDKによる高速コンテナネットワーキングDPDKによる高速コンテナネットワーキング
DPDKによる高速コンテナネットワーキングTomoya Hibi
 
IPv4/IPv6 移行・共存技術の動向
IPv4/IPv6 移行・共存技術の動向IPv4/IPv6 移行・共存技術の動向
IPv4/IPv6 移行・共存技術の動向Yuya Rin
 
淺談 Live patching technology
淺談 Live patching technology淺談 Live patching technology
淺談 Live patching technologySZ Lin
 
545人のインフラを支えたNOCチーム!
545人のインフラを支えたNOCチーム!545人のインフラを支えたNOCチーム!
545人のインフラを支えたNOCチーム!Masayuki Kobayashi
 
VPP事始め
VPP事始めVPP事始め
VPP事始めnpsg
 
Open vSwitchソースコードの全体像
Open vSwitchソースコードの全体像 Open vSwitchソースコードの全体像
Open vSwitchソースコードの全体像 Sho Shimizu
 
FPGA+SoC+Linux実践勉強会資料
FPGA+SoC+Linux実践勉強会資料FPGA+SoC+Linux実践勉強会資料
FPGA+SoC+Linux実践勉強会資料一路 川染
 
2015年度GPGPU実践基礎工学 第13回 GPUのメモリ階層
2015年度GPGPU実践基礎工学 第13回 GPUのメモリ階層2015年度GPGPU実践基礎工学 第13回 GPUのメモリ階層
2015年度GPGPU実践基礎工学 第13回 GPUのメモリ階層智啓 出川
 
Cisco Modeling Labs (CML)を使ってネットワークを学ぼう!(DevNet編)
Cisco Modeling Labs (CML)を使ってネットワークを学ぼう!(DevNet編)Cisco Modeling Labs (CML)を使ってネットワークを学ぼう!(DevNet編)
Cisco Modeling Labs (CML)を使ってネットワークを学ぼう!(DevNet編)シスコシステムズ合同会社
 
Project calico introduction - OpenStack最新情報セミナー 2017年7月
Project calico introduction - OpenStack最新情報セミナー 2017年7月Project calico introduction - OpenStack最新情報セミナー 2017年7月
Project calico introduction - OpenStack最新情報セミナー 2017年7月VirtualTech Japan Inc.
 
不揮発メモリ(NVDIMM)とLinuxの対応動向について(for comsys 2019 ver.)
不揮発メモリ(NVDIMM)とLinuxの対応動向について(for comsys 2019 ver.)不揮発メモリ(NVDIMM)とLinuxの対応動向について(for comsys 2019 ver.)
不揮発メモリ(NVDIMM)とLinuxの対応動向について(for comsys 2019 ver.)Yasunori Goto
 
クラウド構築 勉強会やったのでまとめました
クラウド構築 勉強会やったのでまとめましたクラウド構築 勉強会やったのでまとめました
クラウド構築 勉強会やったのでまとめましたHiro Mura
 
大規模DCのネットワークデザイン
大規模DCのネットワークデザイン大規模DCのネットワークデザイン
大規模DCのネットワークデザインMasayuki Kobayashi
 
Apache Kafka & Kafka Connectを に使ったデータ連携パターン(改めETLの実装)
Apache Kafka & Kafka Connectを に使ったデータ連携パターン(改めETLの実装)Apache Kafka & Kafka Connectを に使ったデータ連携パターン(改めETLの実装)
Apache Kafka & Kafka Connectを に使ったデータ連携パターン(改めETLの実装)Keigo Suda
 
Ethernetの受信処理
Ethernetの受信処理Ethernetの受信処理
Ethernetの受信処理Takuya ASADA
 
FPGAスタートアップ資料
FPGAスタートアップ資料FPGAスタートアップ資料
FPGAスタートアップ資料marsee101
 

La actualidad más candente (20)

DPDKによる高速コンテナネットワーキング
DPDKによる高速コンテナネットワーキングDPDKによる高速コンテナネットワーキング
DPDKによる高速コンテナネットワーキング
 
IPv4/IPv6 移行・共存技術の動向
IPv4/IPv6 移行・共存技術の動向IPv4/IPv6 移行・共存技術の動向
IPv4/IPv6 移行・共存技術の動向
 
淺談 Live patching technology
淺談 Live patching technology淺談 Live patching technology
淺談 Live patching technology
 
545人のインフラを支えたNOCチーム!
545人のインフラを支えたNOCチーム!545人のインフラを支えたNOCチーム!
545人のインフラを支えたNOCチーム!
 
VPP事始め
VPP事始めVPP事始め
VPP事始め
 
Open vSwitchソースコードの全体像
Open vSwitchソースコードの全体像 Open vSwitchソースコードの全体像
Open vSwitchソースコードの全体像
 
FPGA+SoC+Linux実践勉強会資料
FPGA+SoC+Linux実践勉強会資料FPGA+SoC+Linux実践勉強会資料
FPGA+SoC+Linux実践勉強会資料
 
2015年度GPGPU実践基礎工学 第13回 GPUのメモリ階層
2015年度GPGPU実践基礎工学 第13回 GPUのメモリ階層2015年度GPGPU実践基礎工学 第13回 GPUのメモリ階層
2015年度GPGPU実践基礎工学 第13回 GPUのメモリ階層
 
Dpdk applications
Dpdk applicationsDpdk applications
Dpdk applications
 
Cisco Modeling Labs (CML)を使ってネットワークを学ぼう!(DevNet編)
Cisco Modeling Labs (CML)を使ってネットワークを学ぼう!(DevNet編)Cisco Modeling Labs (CML)を使ってネットワークを学ぼう!(DevNet編)
Cisco Modeling Labs (CML)を使ってネットワークを学ぼう!(DevNet編)
 
Mmap failure analysis
Mmap failure analysisMmap failure analysis
Mmap failure analysis
 
Project calico introduction - OpenStack最新情報セミナー 2017年7月
Project calico introduction - OpenStack最新情報セミナー 2017年7月Project calico introduction - OpenStack最新情報セミナー 2017年7月
Project calico introduction - OpenStack最新情報セミナー 2017年7月
 
不揮発メモリ(NVDIMM)とLinuxの対応動向について(for comsys 2019 ver.)
不揮発メモリ(NVDIMM)とLinuxの対応動向について(for comsys 2019 ver.)不揮発メモリ(NVDIMM)とLinuxの対応動向について(for comsys 2019 ver.)
不揮発メモリ(NVDIMM)とLinuxの対応動向について(for comsys 2019 ver.)
 
クラウド構築 勉強会やったのでまとめました
クラウド構築 勉強会やったのでまとめましたクラウド構築 勉強会やったのでまとめました
クラウド構築 勉強会やったのでまとめました
 
大規模DCのネットワークデザイン
大規模DCのネットワークデザイン大規模DCのネットワークデザイン
大規模DCのネットワークデザイン
 
SRv6 study
SRv6 studySRv6 study
SRv6 study
 
Apache Kafka & Kafka Connectを に使ったデータ連携パターン(改めETLの実装)
Apache Kafka & Kafka Connectを に使ったデータ連携パターン(改めETLの実装)Apache Kafka & Kafka Connectを に使ったデータ連携パターン(改めETLの実装)
Apache Kafka & Kafka Connectを に使ったデータ連携パターン(改めETLの実装)
 
Ethernetの受信処理
Ethernetの受信処理Ethernetの受信処理
Ethernetの受信処理
 
FPGAスタートアップ資料
FPGAスタートアップ資料FPGAスタートアップ資料
FPGAスタートアップ資料
 
DPDK In Depth
DPDK In DepthDPDK In Depth
DPDK In Depth
 

Cisco Userspace NIC (usNIC) Overview

  • 1. Cisco Userspace NIC (usNIC). Jeff Squyres, Cisco Systems, Inc. November 7, 2013. © 2013 Cisco and/or its affiliates. All rights reserved. Cisco Public
  • 2. Yes, we sell servers now
  • 3. Yes, really!
      • Cisco UCS servers: record-setting Intel Ivy Bridge 1U and 2U servers
      • Cisco 2 x 10Gb VIC: ultra low latency Ethernet
      • Cisco 10/40Gb Nexus switches: 40Gb top-of-rack and core switches
  • 4. Industry-leading compute without compromise (HPC performance; 4 socket + giant memory)
      • Rack
          • UCS C220 M3: ideal for HPC compute-intensive applications (2 socket)
          • UCS C240 M3: perfect as HPC cluster head nodes or IO nodes (2 socket)
          • UCS C420 M3: 4-socket rack server for large-memory compute workloads
      • Blade
          • UCS B200 M3: blade form factor, 2-socket
          • UCS B420 M3: 4-socket blade for large-memory compute workloads
      Cisco UCS: Many Server Form Factors, One System
  • 5. Worldwide x86 Server Blade Market Share
      • UCS impacting growth of established vendors like HP
      • Legacy offerings flat-lining or in decline
      • Cisco growth out-pacing the market
      • UCS #2 and climbing
      • Market appetite for innovation fuels UCS growth: customers have shifted 19.3% of the global x86 blade server market to Cisco and over 26% in the Americas (Source: IDC Worldwide Quarterly Server Tracker, Q1 2013 Revenue Share, May 2013)
      Demand for data center innovation has vaulted Cisco Unified Computing System (UCS) to the #2 leader in the fast-growing segment of the x86 server market
  • 6. World records by benchmark category
      • Best CPU Performance
      • Best Virtualization & Cloud Performance
      • Best Database Performance
      • Best Enterprise Application Performance
      • Best Enterprise Middleware Performance
      • Best HPC Performance
      Record counts shown on the slide: 16, 8, 9, 18, 14, and 15 world records
  • 7. One wire to rule them all (10G or 40G with real QoS):
      • Commodity traffic (e.g., ssh)
      • Cluster / hardware management
      • File system / IO traffic
      • MPI traffic
  • 8. Low latency, high density 10 / 40Gb switches
      • Low latency: Nexus 3548
          • 190ns port-to-port latency (L2 and L3)
          • Created for HPC / HFT
          • 48 10Gb / 12 40Gb ports
      • High density: Nexus 6004
          • 1us port-to-port latency
          • 384 10Gb / 96 40Gb ports
      Cisco Nexus: Years of experience rolled into dependable solutions
  • 9. Spine / Leaf
      Characteristics
      • 3 hops
      • Low oversubscription, non-blocking
      • < ~3.5 usecs depending on config and workload
      • 10G or 40G capable
      • Spine: 4 to 16 wide
      • Leaf: determined by spine density

      Fabric         Spine - Leaf   Port scale           Latency                    Spines   Leafs
      10G Fabric     6004 - 6001    18,432 x 10G, 3:1    ~3 usecs, cut-through      16       384
      40G Fabric     6004 - 6004    7,680 x 40G, 5:1     ~3 usecs, cut-through      16       96
      Mixed Fabric   6004 - 6001    4,680 x 10G, 3:1     ~3 usecs, S&F              4        96
      10G Fabric     6004 - 3548    12,288 x 10G, 3:1    ~1.5 usecs, cut-through    16       384
      40G Fabric     6004 - 3548    1,152 x 40G, 1:1     ~1.5 usecs, cut-through    6        96
      Mixed Fabric   6004 - 3548    3,072 x 10G, 3:1     ~1.5 usecs, S&F            4        96

      …many other configurations are also possible
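The port-scale and oversubscription columns in this table follow from simple arithmetic, which the sketch below works through for the first row. It assumes a leaf with 48 x 10G host-facing ports and 4 x 40G uplinks; those per-leaf port counts are an assumption for illustration and do not appear on the slide, and the function names are hypothetical.

```python
from fractions import Fraction


def port_scale(leafs: int, host_ports_per_leaf: int) -> int:
    """Total host-facing ports in the fabric."""
    return leafs * host_ports_per_leaf


def oversubscription(host_ports: int, host_gbps: int,
                     uplinks: int, uplink_gbps: int) -> Fraction:
    """Ratio of host-facing bandwidth to uplink bandwidth on one leaf."""
    return Fraction(host_ports * host_gbps, uplinks * uplink_gbps)


# Assumed leaf layout (not stated on the slide): 48 x 10G down, 4 x 40G up.
print(port_scale(384, 48))                       # 18432
ratio = oversubscription(48, 10, 4, 40)
print(f"{ratio.numerator}:{ratio.denominator}")  # 3:1
```

Under that assumed layout, 384 leafs reproduce the 18,432 x 10G, 3:1 entry. Other rows would need their own per-leaf port counts, which the slide does not give.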
  • 10. Spine2 / Spine1 / Leaf
      Characteristics
      • 3 hops within a pod, 5 hops for DC east-west traffic
      • Low oversubscription, non-blocking
      • < ~3.5 usecs depending on config and workload
      • 10G or 40G capable
      • Two spine layers

      Fabric         Spine2 - Spine1 - Leaf   Port scale           Latency                        Spine2   Spine1   Leafs
      10G Fabric     6004 - 6004 - 6001       55,296 x 10G, 3:1    ~3-5 usecs, cut-through        48       16 x 6   192
      40G Fabric     6004 - 6004 - 6004       23,040 x 40G, 5:1    ~3-5 usecs, cut-through        48       16       48
      Mixed Fabric   6004 - 6004 - 6001       18,432 x 10G, 3:1    ~3-5 usecs, S&F                32       4 x 8    48
      10G Fabric     6004 - 6004 - 3548       24,576 x 10G, 2:1    ~1.5-3.5 usecs, cut-through    32       16 x 4   192
      40G Fabric     6004 - 6004 - 3548       2,304 x 40G, 1:1     ~1.5-3.5 usecs, cut-through    24       6 x 8    48
      Mixed Fabric   6004 - 6004 - 3548       9,216 x 10G, 2:1     ~1.5-3.5 usecs, S&F            24       6 x 8    48
  • 12. • Direct access to NIC hardware from Linux userspace
          • Operating system bypass
          • Via the Linux Verbs API (UD)
      • Utilizes the Cisco Virtual Interface Card (VIC) for ultra-low Ethernet latency
          • 2nd generation 80Gbps Cisco ASIC
          • 2 x 10Gbps Ethernet ports
          • 2 x 40Gbps coming …soon…
          • PCI and mezzanine form factors
      • Half-round trip (HRT) ping-pong latencies (Intel E5-2690 v2 servers); these numbers keep going down:
          • Raw back to back: 1.57μs
          • MPI back to back: 1.85μs
          • MPI through a Nexus 3548: 2.05μs
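A half-round trip (HRT) figure is the time for one ping-pong exchange divided by two, averaged over many iterations. The sketch below is a minimal illustration of that measurement over a local socket pair. It goes through the kernel, so its absolute numbers say nothing about usNIC, and all function names are hypothetical. The last lines take the differences between the slide's three figures in nanoseconds.

```python
import socket
import threading
import time


def recv_exact(sock: socket.socket, size: int) -> bytes:
    """Read exactly `size` bytes from a stream socket."""
    buf = bytearray()
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def echo_peer(sock: socket.socket, iterations: int, size: int) -> None:
    """The 'pong' side: send back every message it receives."""
    for _ in range(iterations):
        sock.sendall(recv_exact(sock, size))


def half_round_trip_us(iterations: int = 1000, size: int = 1) -> float:
    """Average half-round-trip time in microseconds over a local socket pair."""
    ping, pong = socket.socketpair()
    peer = threading.Thread(target=echo_peer, args=(pong, iterations, size))
    peer.start()
    payload = b"x" * size
    start = time.perf_counter()
    for _ in range(iterations):
        ping.sendall(payload)
        recv_exact(ping, size)
    elapsed = time.perf_counter() - start
    peer.join()
    ping.close()
    pong.close()
    return elapsed / iterations / 2 * 1e6


# The slide's figures, in nanoseconds to avoid floating-point noise.
RAW_NS, MPI_NS, MPI_SWITCH_NS = 1570, 1850, 2050
print(MPI_NS - RAW_NS)         # 280: added by the MPI layer
print(MPI_SWITCH_NS - MPI_NS)  # 200: added by the switch hop
```

The 200ns difference for the switch hop is in line with the 190ns port-to-port latency quoted for the Nexus 3548 earlier in the deck.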
  • 13. [Diagram: TCP/IP and usNIC software stacks side by side.] TCP/IP: application → userspace sockets library → kernel TCP stack → general Ethernet driver → Cisco VIC driver → Cisco VIC hardware. usNIC: application → userspace verbs library → Cisco VIC hardware (send and receive fast path); bootstrapping and setup go through the kernel Verbs IB core and the Cisco USNIC driver.
  • 14. [Diagram: MPI → userspace verbs library → Cisco VIC hardware.] MPI directly injects L2 frames to the network. MPI receives L2 frames directly from the VIC.
  • 15. [Diagram: two MPI processes; x86 chipset VT-d I/O MMU; VIC (SR-IOV NIC) with queue pairs (QP) and a classifier; inbound L2 frames and outbound L2 frames.]
  • 16. [Diagram: one VIC with two physical ports.] Each physical port has a Physical Function (PF) with its own MAC address (aa:bb:cc:dd:ee:ff and aa:bb:cc:dd:ee:fe). Each PF has several VFs; VFs are shown with queue pairs (QPs).
  • 17. [Diagram: two MPI processes; Intel IO MMU; VIC with two physical ports, each with a PF (MAC), several VFs, and queue pairs (QPs).]
  • 18. • Used for physical ↔ virtual memory translation • usnic verbs driver programs (and deprograms) the IOMMU. [Diagram: the userspace process and the VIC use virtual addresses; the Intel IO MMU maps them to physical addresses in RAM.]
  • 19. • For the purposes of this talk, let’s assume that each physical port has one Linux ethX device • Each ethX device corresponds to a PF • Each usnic_Y device corresponds to an ethX device. [Diagram: VIC with physical port 0 (eth4 / usnic_0) and physical port 1 (eth5 / usnic_1); physical ports (fiber).]
  • 20. [Figure: hardware topology diagram, physical indexes, dated Thu Nov 7 10:58:23 2013.] Machine (128GB): Intel Xeon E5-2690 (“Sandy Bridge”), 2 sockets, 8 cores, 64GB per socket. NUMANode P#0 (Socket P#0, cores P#0 to P#7): eth0 to eth3 (PCI 8086:1521) and VIC ports eth4 / usnic_0 and eth5 / usnic_1 (PCI 1137:0043). NUMANode P#1 (Socket P#1, cores P#0 to P#7): sda (PCI 1000:005b) and VIC ports eth6 / usnic_2 and eth7 / usnic_3 (PCI 1137:0043).
  • 21. [Figure: the same topology diagram with the VIC devices magnified: eth4 / usnic_0 and eth5 / usnic_1 on NUMANode P#0; eth6 / usnic_2 and eth7 / usnic_3 on NUMANode P#1 (all PCI 1137:0043).]
  • 22. [Diagram: software layers, top to bottom.] Application; Open MPI layer (OMPI); Point-to-point messaging layer (PML); Byte Transfer Layer (BTL); Operating System; Hardware.
  • 23. [Diagram: MPI_Send / MPI_Recv (etc.) → OB1 PML → four usnic BTL modules.] usnic BTL /dev/usnic_0 and usnic BTL /dev/usnic_1 on VIC 0; usnic BTL /dev/usnic_2 and usnic BTL /dev/usnic_3 on VIC 1.
  • 24. • Byte Transfer Layer: point-to-point transfer plugins in the OMPI layer; no protocol is assumed / required • “usnic” BTL (one module per device, e.g. /dev/usnic_2): uses unreliable datagram (UD) verbs; handles all fragmentation and re-assembly (vs. PML); retransmissions and ACKs handled in software; sliding window retransmission scheme; direct inject / direct receive of L2 Ethernet frames
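The slide names a sliding window retransmission scheme on top of unreliable datagrams but gives no details. A minimal Python sketch of the general technique, not the usnic BTL's actual code; the window size, timeout, tick-based clock, and cumulative-ACK convention are all assumptions made for illustration:

```python
from collections import OrderedDict


class SlidingWindowSender:
    """Toy sender: at most `window` unacknowledged datagrams in flight."""

    def __init__(self, window=4, timeout=3):
        self.window = window          # hypothetical window size
        self.timeout = timeout        # hypothetical retransmit timeout, in ticks
        self.next_seq = 0
        self.unacked = OrderedDict()  # seq -> (payload, tick of last transmission)

    def send(self, payload, now, wire):
        """Transmit one datagram if the window has room; return its seq or None."""
        if len(self.unacked) >= self.window:
            return None
        seq = self.next_seq
        self.next_seq += 1
        self.unacked[seq] = (payload, now)
        wire.append((seq, payload))
        return seq

    def on_ack(self, ack_seq):
        """Cumulative ACK (assumed): everything up to ack_seq has arrived."""
        for seq in [s for s in self.unacked if s <= ack_seq]:
            del self.unacked[seq]

    def on_tick(self, now, wire):
        """Retransmit every datagram whose timeout has expired."""
        resent = []
        for seq, (payload, sent_at) in list(self.unacked.items()):
            if now - sent_at >= self.timeout:
                self.unacked[seq] = (payload, now)
                wire.append((seq, payload))
                resent.append(seq)
        return resent


wire = []                                     # stands in for the network
tx = SlidingWindowSender(window=2, timeout=3)
tx.send(b"a", now=0, wire=wire)               # seq 0
tx.send(b"b", now=0, wire=wire)               # seq 1
blocked = tx.send(b"c", now=0, wire=wire)     # window is full
tx.on_ack(0)                                  # seq 0 arrived; pretend seq 1 was lost
tx.send(b"c", now=1, wire=wire)               # seq 2 now fits in the window
resent = tx.on_tick(now=3, wire=wire)         # only seq 1 has timed out
```

The window bounds how much data software must keep around for possible retransmission, which is the usual reason to pair UD transport with this kind of scheme.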
  • 25. • One BTL module for each usNIC verbs device • Each module has two UD queue pairs: a priority queue for small and control packets, and a data queue for up to MTU-sized data packets • Each QP has its own CQ • QPs may or may not be on the same VF • Overall BTL glue polls CQs for each device: first priority CQs, then data CQs
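The polling order on this slide can be sketched in a few lines of Python. This is an illustration only: the `Module` class and `progress` function are hypothetical names, the queues are plain deques rather than verbs CQs, and draining every priority CQ across all devices before any data CQ is one reading of the slide, assumed here:

```python
from collections import deque


class Module:
    """Stand-in for one BTL module: one priority CQ and one data CQ."""

    def __init__(self, name):
        self.name = name
        self.priority_cq = deque()
        self.data_cq = deque()


def progress(modules):
    """Handle all priority completions first, then all data completions."""
    handled = []
    for m in modules:
        while m.priority_cq:
            handled.append((m.name, "priority", m.priority_cq.popleft()))
    for m in modules:
        while m.data_cq:
            handled.append((m.name, "data", m.data_cq.popleft()))
    return handled


m0, m1 = Module("usnic_0"), Module("usnic_1")
m0.data_cq.append("frag0")        # a large data fragment arrived first
m1.priority_cq.append("ack")      # then a small control packet
m0.priority_cq.append("ctl")
events = progress([m0, m1])
```

Small and control packets are handled ahead of bulk data regardless of arrival order, which keeps short-message latency from queuing behind large transfers.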
  • 26. • “Raw” latency (no MPI, no verbs) is 1.57μs • MPI latency back-to-back on Sandy Bridge is 1.85μs • Verbs is responsible for about 80ns of the difference (not related to the MPI API) • All the rest of OMPI is only about 200ns. [Bar: raw 1.57μs; verbs 80ns; MPI 200ns.]
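The three components on this slide add up to the MPI figure. A quick check in Python, using only the slide's numbers (the variable names are ours):

```python
RAW_NS = 1570     # raw back-to-back latency, 1.57 us
VERBS_NS = 80     # overhead attributed to verbs
OMPI_NS = 200     # all the rest of Open MPI

mpi_ns = RAW_NS + VERBS_NS + OMPI_NS
software_share = (VERBS_NS + OMPI_NS) / mpi_ns   # fraction added above raw

print(f"MPI latency: {mpi_ns / 1000:.2f} us, software share: {software_share:.1%}")
```

The sum is 1850 ns, matching the 1.85μs MPI latency, so the software stack accounts for roughly 15% of the total.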
  • 27. • Deferred and piggy-backed ACKs. [Timeline diagram between Process A and Process B: an immediate ACK N follows a single Msg; a deferred ACK N+2 covers several Msgs; a deferred + piggybacked ACK rides on a reply as Msg+ACK N+2.]
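A minimal Python sketch of the deferred + piggybacked ACK idea from the diagram. It is not the usnic BTL's code: the `Endpoint` class, the tick-based deferral, and the dictionary frame format are assumptions for illustration:

```python
class Endpoint:
    """Toy receiver that defers ACKs and piggybacks them on outgoing data."""

    def __init__(self, ack_delay=2):
        self.ack_delay = ack_delay   # hypothetical deferral, in ticks
        self.pending_ack = None      # highest sequence number still to acknowledge
        self.ack_deadline = None

    def on_receive(self, seq, now):
        """Note that seq must be ACKed, but send nothing yet."""
        if self.pending_ack is None:
            self.ack_deadline = now + self.ack_delay
            self.pending_ack = seq
        else:
            self.pending_ack = max(self.pending_ack, seq)

    def send_data(self, payload):
        """Outgoing data carries any pending ACK along with it."""
        frame = {"data": payload, "ack": self.pending_ack}
        self.pending_ack = self.ack_deadline = None
        return frame

    def on_tick(self, now):
        """Send a standalone ACK only if the deferral expired with no data to ride on."""
        if self.pending_ack is not None and now >= self.ack_deadline:
            frame = {"data": None, "ack": self.pending_ack}
            self.pending_ack = self.ack_deadline = None
            return frame
        return None


b = Endpoint(ack_delay=2)
b.on_receive(0, now=0)
b.on_receive(1, now=0)
b.on_receive(2, now=1)
early = b.on_tick(now=1)          # still inside the deferral window
deferred = b.on_tick(now=2)       # one ACK covers messages 0..2
b.on_receive(3, now=3)
piggy = b.send_data(b"reply")     # ACK 3 rides on the reply
idle = b.on_tick(now=10)          # nothing left to acknowledge
```

Deferring lets one ACK cover several messages, and piggybacking removes the standalone ACK frame whenever traffic is already flowing in the reverse direction.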
  • 28. • Host writes WQ structure • Host writes index to VIC via PIO • VIC reads WQ descriptor (VIC now has buffer address) • VIC reads buffer from RAM • VIC sends buffer on the wire. [Sequence diagram between host (WQ descriptor, buffer to send) and VIC: write WQ index; read WQ; read packet; send on wire.]
  • 29. • Host writes WQ structure • Host writes index + encoded buffer address to VIC via PIO • VIC reads WQ descriptor • VIC reads buffer from RAM • VIC sends buffer on the wire. [Sequence diagram between host (WQ descriptor, buffer to send) and VIC: write WQ index+addr; read WQ; read packet; send on wire.] Send ~400ns sooner.
  • 30. • Minimize length of priority receive queue • Using 2048 different receive buffers is 200ns worse than using 64 • Result of IOMMU cache effect • We scale length of priority RQ with number of processes in job. [Diagram: userspace process (virtual) → VIC (virtual) → Intel IO MMU (physical); use this much of the buffer space instead of this much.]
  • 31. • Use fastpaths wherever possible: be friendly to the optimizer and instruction cache. Made a noticeable difference (!)
        if (fastpathable)
            do_it_inline();
        else
            call_slower_path();
  • 32. [Figure: the hardware topology diagram from slide 20 (Machine 128GB, two NUMA nodes with 8 cores each; usnic_0 and usnic_1 on NUMANode P#0; usnic_2 and usnic_3 on NUMANode P#1).] MPI processes running on these cores… (highlighted: the cores of Socket P#0).
  • 33. [Figure: the same topology diagram.] MPI processes running on these cores… (Socket P#0). Only use these usNIC devices for short messages (highlighted: usnic_0 and usnic_1, on the same NUMA node).
  • 34. [Figure: the same topology diagram.] MPI processes running on these cores… (Socket P#0). Use ALL usNIC devices for long messages (usnic_0 through usnic_3).
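The rule from slides 33 and 34 (NUMA-local devices for short messages, all devices for long ones) can be sketched as below. This is an illustration, not Open MPI's selection logic: the 4096-byte threshold, the function names, and the even striping of long messages are assumptions; only the device-to-NUMA-node mapping comes from the topology figure:

```python
# Device -> NUMA node, as drawn in the topology figure.
DEVICES = {"usnic_0": 0, "usnic_1": 0, "usnic_2": 1, "usnic_3": 1}
SHORT_LIMIT = 4096  # hypothetical short/long threshold, in bytes


def devices_for(msg_len, process_numa_node, devices=DEVICES, short_limit=SHORT_LIMIT):
    """Short messages stay NUMA-local for latency; long ones use every device."""
    if msg_len <= short_limit:
        local = sorted(d for d, node in devices.items() if node == process_numa_node)
        if local:
            return local
    return sorted(devices)


def stripe(msg_len, device_names):
    """Split msg_len bytes as evenly as possible across the chosen devices."""
    base, extra = divmod(msg_len, len(device_names))
    return {d: base + (1 if i < extra else 0) for i, d in enumerate(device_names)}


short_devs = devices_for(64, process_numa_node=0)
long_devs = devices_for(1 << 20, process_numa_node=0)
plan = stripe(1 << 20, long_devs)
```

Short messages are latency-bound, so crossing to the remote NUMA node costs more than it gains; long messages are bandwidth-bound, so the extra links outweigh that penalty.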
  • 35. • Everything above the firmware is open source • Open MPI: distributing Cisco Open MPI 1.6.5; upstream in Open MPI 1.7.3 • Libibverbs plugin • Verbs kernel module
  • 36. Hardware: • Cisco UCS C220 M3 Rack Server: Intel E5-2690 Processor 2.9 GHz (3.3 GHz Turbo), 2 sockets, 8 cores/socket; 1600 MHz DDR3 memory, 8 GB x 16, 128 GB installed; Cisco VIC 1225 with Ultra Low Latency Networking usNIC driver • Cisco Nexus 3548: 48-port 10 Gbps Ultra Low Latency Ethernet networking switch. Software: • OS: CentOS 6.4, kernel 2.6.32-358.el6.x86_64 (SMP) • NetPIPE (ver 3.7.1) • Intel MPI Benchmarks (ver 3.2.4) • High Performance Linpack (ver 2.1) • Other: Intel C Compiler (ver 13.0.1), Open MPI (ver 1.6.5), Cisco usNIC (1.0.0.7x)
  • 37. [Figure: Cisco usNIC latency (usecs, log scale) and Cisco usNIC throughput (Mbps) vs. message size (bytes), for messages from 1 byte to about 8 MB.] 2.05 usecs latency for small messages. 9.3 Gbps throughput.
  • 38. PingPing and PingPong latency track together! [Figure: PingPong and PingPing latency (usecs, log scale) and throughput (MB/s) vs. message size (bytes), up to 4,194,304 bytes.] 2.05 usecs PingPong latency. 2.10 usecs PingPing latency.
  • 39. Full bi-directional performance for both Exchange and SendRecv. [Figure: SendRecv and Exchange latency (usecs, log scale) and throughput (MB/s) vs. message size (bytes), up to 4,194,304 bytes.] 2.11 usecs SendRecv latency. 2.58 usecs Exchange latency.
  • 40. GFLOPS = FLOPS/Cycle x Num CPU Cores x Freq (GHz). E5-2690 max GFLOPS = 8 x 16 x 3.3 ≈ 422 GFLOPS. Single node HPL score (16 cores): 340.51 GFLOPS*. 32 node HPL score (512 cores): 9,773.45 GFLOPS. Efficiency based on single machine score: (9,773.45)/(340.51 x 32) x 100 = 89.69%
    # of CPU cores | GFlops
    16 | 340.51
    32 | 673.68
    64 | 1271.14
    128 | 2647.09
    256 | 5258.27
    512 | 9773.45
    * Score may improve with additional compiler settings or newer compiler versions
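The arithmetic on this slide can be reproduced directly. A short Python check using only the slide's numbers; the helper name `scaling_efficiency` is ours:

```python
FLOPS_PER_CYCLE = 8
CORES_PER_NODE = 16
FREQ_GHZ = 3.3          # turbo frequency used on the slide

peak_gflops = FLOPS_PER_CYCLE * CORES_PER_NODE * FREQ_GHZ

# HPL scores from the chart: number of cores -> GFLOPS
scores = {16: 340.51, 32: 673.68, 64: 1271.14, 128: 2647.09, 256: 5258.27, 512: 9773.45}
single_node = scores[CORES_PER_NODE]


def scaling_efficiency(cores):
    """Measured score as a percentage of (single-node score x node count)."""
    nodes = cores / CORES_PER_NODE
    return scores[cores] / (single_node * nodes) * 100


single_node_vs_peak = single_node / peak_gflops * 100

for cores in scores:
    print(f"{cores:4d} cores: {scaling_efficiency(cores):6.2f}% of linear scaling")
```

Efficiency here is measured against the single-node HPL score, not against theoretical peak; a single node itself reaches about 81% of the 422 GFLOPS peak.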