Copyright © NTT Communications Corporation.
Transform your business, transcend expectations with our technologically advanced solutions.
100Gbps OpenStack
For Providing High-Performance NFV
Takeaki Matsumoto
Agenda
● Background
● Goal / Actions
● Kamuee (Software router)
● DPDK application on OpenStack
● Benchmark
● Conclusion
Self-Introduction
Takeaki Matsumoto
takeaki.matsumoto@ntt.com
NTT Communications
Technology Development
R&D for OpenStack
Ops for Private Cloud
Background
● NTT Communications
○ A Global Tier-1 ISP in 196 countries/regions
○ Over 150 datacenters in the world
● Problems
○ Costs
■ spending 1M+ USD for each core router
○ Flexibility
■ long time to add router, orchestration, rollback...
Goal / Actions
● Goal
○ Cheaper and more flexible router with 100Gbps performance
● Actions
○ Research & verify software router requirements
○ Check the OpenStack functions for NFV
○ Benchmark the software router performance with OpenStack
Kamuee
● Software router with 100Gbps+ (on Baremetal)
○ Developed by NTT Communications
○ 146Gbps with 610K+ IPv4 Routes and 128Byte packets
○ Using technologies
■ DPDK
■ Poptrie
■ RCU
○ Achieving 100Gbps Performance at Core with Poptrie and Kamuee Zero
https://www.youtube.com/watch?v=OhHv3O1H8-w
Requirements
● High-performance NFV requirements
○ High-bandwidth network port
○ Low-latency communication NIC-to-CPU
○ Dedicated CPU cores
○ Hugepages
○ CPU features
Agenda
● Background
● Goal / Actions
● Kamuee (Software router)
● DPDK application on OpenStack
○ SR-IOV
○ NUMA
○ vCPU pinning
○ Hugepages
○ CPU feature
● Benchmark
● Conclusion
DPDK application on OpenStack
[Figure: a compute host with NUMA nodes, each holding CPUs, a NIC with VFs, and hugepage-backed memory; a VM with four vCPUs, a VF, and hugepage memory]
SR-IOV
● What is SR-IOV?
○ Hardware-level virtualization on supported NIC
○ SR-IOV device has
■ Physical Function (PF)
● Normal NIC device (1 device/physical port)
■ Virtual Function (VF)
● Virtual NIC device from PF
● can be created up to NIC's limit
[Figure: one NIC exposing a PF and several VFs]
SR-IOV
● Why is SR-IOV needed?
○ A software vSwitch can become a bottleneck on a high-performance network
○ SR-IOV traffic bypasses the vSwitch, so it puts no load on the host CPU
[Figure: typical implementation (VM vNIC attached to a software vSwitch in front of the NIC) compared with SR-IOV (VM attached to a NIC VF via PCI passthrough)]
SR-IOV
● OpenStack supports SR-IOV
○ VF can be used as Neutron port
○ The instance gets the VF directly via PCI passthrough
$ neutron port-create $net_id --name sriov_port --binding:vnic_type direct
$ openstack server create --flavor m1.large --image ubuntu_14.04 --nic port-id=$port_id sriov-server
ubuntu@sriov-server $ lspci | grep Ethernet
00:05.0 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
NUMA
● What is NUMA?
○ Non-Uniform Memory Access
○ A multi-socket server usually has multiple NUMA nodes, typically one per CPU socket
○ CPU cores, memory, and PCI devices each belong to a NUMA node
○ For low latency, the NUMA topology has to be taken into account
[Figure: two sockets, each a NUMA node with its own CPUs, memory, and NIC; the interconnect between them has overhead]
NUMA
● OpenStack has NUMATopologyFilter
○ It schedules VMs with the NUMA topology in mind
○ When hugepages or CPU pinning are used, the VM is automatically launched on a single NUMA node
○ A VM can also be spread over 2 NUMA nodes
$ openstack flavor set m1.large --property hw:numa_nodes=1
$ openstack flavor set m1.large --property hw:numa_nodes=2
vCPU pinning
● What is vCPU pinning?
○ vCPU:pCPU=1:1 dedicated allocation
■ Reduces context-switching
[Figure: four pCPUs, each dedicated to one vCPU; a separate pCPU runs nova-compute and other Linux processes]
vCPU pinning
● OpenStack flavor has extra spec "hw:cpu_policy"
○ enables vCPU pinning
$ openstack flavor set m1.large --property hw:cpu_policy=dedicated
Default allocation:
$ virsh vcpupin instance-00000001
VCPU: CPU Affinity
----------------------------------
0: 0-31
1: 0-31
2: 0-31
3: 0-31
4: 0-31
5: 0-31
6: 0-31
7: 0-31
8: 0-31
vCPU pinning:
$ virsh vcpupin instance-00000002
VCPU: CPU Affinity
----------------------------------
0: 1
1: 2
2: 3
3: 4
4: 5
5: 6
6: 7
7: 8
8: 9
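The two outputs above can be told apart mechanically. The following is a minimal sketch, assuming the colon-separated `virsh vcpupin` layout shown on the slide (other libvirt versions may print a different layout); the function names are hypothetical helpers, not part of libvirt or Nova.

```python
def parse_vcpupin(output: str) -> dict[int, str]:
    """Map vCPU index to its affinity string, given `virsh vcpupin` text."""
    affinity = {}
    for line in output.splitlines():
        left, sep, right = line.partition(":")
        # Skip the header and separator rows; keep rows like "0: 1" or "0: 0-31".
        if sep and left.strip().isdigit():
            affinity[int(left)] = right.strip()
    return affinity


def looks_pinned(affinity: dict[int, str]) -> bool:
    """Heuristic: pinned means every vCPU is bound to exactly one host CPU."""
    return bool(affinity) and all(cpus.isdigit() for cpus in affinity.values())


# Shortened copies of the outputs shown on the slide.
PINNED = """VCPU: CPU Affinity
----------------------------------
0: 1
1: 2
2: 3
"""
DEFAULT = """VCPU: CPU Affinity
----------------------------------
0: 0-31
1: 0-31
"""

print(looks_pinned(parse_vcpupin(PINNED)))
print(looks_pinned(parse_vcpupin(DEFAULT)))
```

A check like this could run after instance creation to confirm that the flavor's hw:cpu_policy actually took effect.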
Hugepages
● What are hugepages?
○ Memory pages larger than the default 4KB page size
■ Reduces TLB misses
■ DPDK applications usually use hugepages
[Figure: virtual-to-physical address translation through the TLB; on a TLB miss the page table is consulted]
Hugepages
● OpenStack flavor has extra spec "hw:mem_page_size"
○ Enables hugepages and assigns them to the guest
$ openstack flavor set m1.large --property hw:mem_page_size=1048576
$ cat /etc/libvirt/qemu/instance-00000002.xml | grep hugepages -1
<memoryBacking>
  <hugepages>
    <page size='1048576' unit='KiB' />
  </hugepages>
</memoryBacking>
$ cat /proc/meminfo | grep Hugepagesize
Hugepagesize: 1048576 kB
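The value 1048576 above is a page size in KiB, that is, 1 GiB per page. As a rough sketch of the arithmetic, the helper below (a hypothetical name, not a Nova function) counts how many pages of that size a guest of a given memory size would need; the 32768 MB and 16384 MB figures are taken from the benchmark flavor later in the deck, assuming 32GB means 32768 MB.

```python
def pages_needed(ram_mb: int, page_size_kib: int) -> int:
    """Number of hugepages of page_size_kib needed to back ram_mb of guest RAM."""
    ram_kib = ram_mb * 1024
    pages, remainder = divmod(ram_kib, page_size_kib)
    if remainder:
        # Guest memory has to be a whole multiple of the page size.
        raise ValueError("guest RAM is not a multiple of the page size")
    return pages


PAGE_1G_KIB = 1048576  # hw:mem_page_size value from the slide (1 GiB in KiB)

print(pages_needed(32768, PAGE_1G_KIB))  # whole guest
print(pages_needed(16384, PAGE_1G_KIB))  # one guest NUMA node
```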
Other CPU features
● CPU optimization features used by DPDK
○ SSSE3, SSE4,...
● "[libvirt] cpu_mode" option in nova.conf
○ In some distributions, the default is none
○ host-model, host-passthrough, or custom is required
cpu_mode=none:
$ cat /proc/cpuinfo | grep -e "model name" -e flags
model name : QEMU Virtual CPU version 2.0.0
flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni vmx cx16 x2apic popcnt hypervisor lahf_lm vnmi ept
cpu_mode=host-model:
$ cat /proc/cpuinfo | grep -e "model name" -e flags
model name : Intel Core Processor (Broadwell)
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology eagerfpu pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt arat
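Whether a guest exposes the needed CPU features can be checked from the flags line of /proc/cpuinfo. The sketch below is an illustration only: the required set (ssse3, sse4_1, sse4_2) is an assumption based on the "SSSE3, SSE4" bullet, the function names are hypothetical, and the sample strings are shortened from the two outputs on the slide.

```python
def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the flag names from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(cpuinfo_text: str,
                     required=("ssse3", "sse4_1", "sse4_2")) -> list[str]:
    """List the required flags that the CPU does not advertise."""
    have = cpu_flags(cpuinfo_text)
    return [name for name in required if name not in have]


# Shortened samples: a guest without a CPU model, and a guest with host-model.
GUEST_NONE = "model name : QEMU Virtual CPU version 2.0.0\nflags : fpu sse sse2 pni cx16 popcnt hypervisor\n"
GUEST_HOST_MODEL = "model name : Intel Core Processor (Broadwell)\nflags : fpu sse sse2 pni ssse3 sse4_1 sse4_2 avx2 hypervisor\n"

print(missing_features(GUEST_NONE))
print(missing_features(GUEST_HOST_MODEL))
```

On a real guest the input would be the contents of /proc/cpuinfo rather than these sample strings.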
Agenda
● Background
● Goal / Actions
● Kamuee (Software router)
● DPDK application on OpenStack
● Benchmark
○ Environment
○ Baremetal performance
○ VM + VF performance
○ VM + PF performance
○ Baremetal (VF exists) performance
Environment: Hardware
● Server
○ Dell PowerEdge R630
■ Intel® Xeon® CPU E5-2690 v4 @ 2.60GHz (14 cores) * 2
■ DDR-4 256GB (32GB * 8)
■ Ubuntu 16.04
● NIC
○ Mellanox ConnectX-4 100Gb/s Dual-Port Adapter
■ 1 PCIe Card, 100G Ports * 2
● Switch
○ Dell Networking Z9100
■ Cumulus Linux 3.2.0
■ 100Gbps Port * 32
Environment: Architecture
[Figure: pktgen-dpdk 0, pktgen-dpdk 1, nexthop 0, and nexthop 1 connect to Kamuee through the switch, using VLAN100 and VLAN200. Each pktgen server has one ConnectX-4 100G dual-port NIC (both ports used); Kamuee has two ConnectX-4 100G dual-port NICs (one port used on each). Each line is a 100G link; the flow segments are labeled ① to ④.]
Environment: pktgen-dpdk
● Open source packet generator
○ Output: about 100Mpps≒67.2Gbps/server (64Byte packet)
■ 50Mpps/port
○ dst mac
■ kamuee NIC0 port0 (port0-1 on pktgen-dpdk 0)
■ kamuee NIC1 port0 (port0-1 on pktgen-dpdk 1)
○ dst ip (range)
■ 1.0.0.1-254 (port0 on each server)
■ 1.0.4.1-254 (port1 on each server)
○ dst TCP port (range)
■ 1-254 (port0 on each server)
■ 256-510 (port1 on each server)
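The "100Mpps≒67.2Gbps" figure above is consistent with counting 20 bytes of per-frame Ethernet overhead (preamble, start-of-frame delimiter, and inter-frame gap) on top of the 64-byte frame. A minimal sketch of that conversion follows; the function names are illustrative and not part of pktgen-dpdk.

```python
# Per-frame overhead on the wire: 7 B preamble + 1 B SFD + 12 B inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def wire_gbps(mpps: float, frame_bytes: int) -> float:
    """Convert a packet rate in Mpps to on-the-wire bandwidth in Gbps."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return mpps * 1e6 * bits_per_frame / 1e9


def line_rate_mpps(link_gbps: float, frame_bytes: int) -> float:
    """Maximum packet rate in Mpps that a link of link_gbps can carry."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_per_frame / 1e6


print(round(wire_gbps(100, 64), 1))       # per pktgen server
print(round(wire_gbps(50, 64), 1))        # per pktgen port
print(round(line_rate_mpps(100, 64), 1))  # theoretical 100G limit at 64 bytes
```

Under this assumption, 50Mpps per port corresponds to 33.6Gbps, which matches the per-port rates in the ideal-flow diagrams.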
Environment: Kamuee
● DPDK software router
● Spec configuration
○ 2 NUMA nodes
○ Using 26 cores
■ Forwarding: 12 cores/port * 2 (each NUMA)
■ Other functions: 2 cores
○ Using 16GB memory
■ 1GB Hugepages * 8 * 2 (each NUMA)
○ 2 NICs
■ only port 0 is used * 2 (each NUMA)
Environment: Kamuee
● Routing configuration
○ 518K routes (comparable to a full route table) loaded
■ Forwarding to nexthop server
● DPDK EAL options
○ ./kamuee -n 4 --socket-mem 8192,8192 -w 0000:00:05.0,txq_inline=128 -w 0000:00:06.0,txq_inline=128
kamuee-console> show ipv4 route
1.0.0.0/24 nexthop: 172.21.4.105
1.0.4.0/24 nexthop: 172.21.3.104
...
kamuee-console> show ipv4 route 172.21.4.105
172.21.4.105/32 ether: 24:8a:07:4c:2f:64 port 1
kamuee-console> show ipv4 route 172.21.3.104
172.21.3.104/32 ether: 24:8a:07:4c:2f:6c port 0
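The console output above implies a two-step resolution: a destination address is matched to the longest covering prefix, and the resulting nexthop is resolved to a MAC address and output port. The sketch below illustrates that logic with a plain linear scan over the routes shown on the slide. It is not Kamuee's implementation, which uses Poptrie; the table and function names are hypothetical.

```python
import ipaddress

# Route and neighbor entries copied from the console output on the slide.
ROUTES = {
    "1.0.0.0/24": "172.21.4.105",
    "1.0.4.0/24": "172.21.3.104",
}
NEIGHBORS = {
    "172.21.4.105": ("24:8a:07:4c:2f:64", 1),
    "172.21.3.104": ("24:8a:07:4c:2f:6c", 0),
}


def lookup(dst: str):
    """Return (nexthop, mac, port) for dst using longest-prefix match, or None."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, nexthop in ROUTES.items():
        net = ipaddress.ip_network(prefix)
        # Keep the matching route with the longest prefix length.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, nexthop)
    if best is None:
        return None
    mac, port = NEIGHBORS[best[1]]
    return best[1], mac, port


print(lookup("1.0.0.7"))
print(lookup("1.0.4.200"))
print(lookup("8.8.8.8"))
```

A linear scan is only workable for a handful of routes; with hundreds of thousands of routes a dedicated lookup structure is needed, which is the role Poptrie plays in Kamuee.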
Environment: nexthop
● Measuring RX packets
○ Using eth_stat.sh
■ https://community.mellanox.com/docs/DOC-2506#jive_content_id_How_to_Measure_Ingress_Rate
■ using "rx_packets_phy" on ethtool
● hardware-level packet counter
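A receive rate can be derived from a counter such as "rx_packets_phy" by sampling it twice and dividing the difference by the interval. The sketch below is not the eth_stat.sh script referenced above; it only shows the idea, assuming `ethtool -S` style "name: value" lines as input. The sample text and function names are made up for illustration.

```python
import re


def parse_counter(ethtool_output: str, name: str = "rx_packets_phy") -> int:
    """Extract one counter value from `ethtool -S <iface>` style text."""
    pattern = rf"^\s*{re.escape(name)}:\s*(\d+)\s*$"
    match = re.search(pattern, ethtool_output, re.MULTILINE)
    if match is None:
        raise KeyError(name)
    return int(match.group(1))


def rate_mpps(before: int, after: int, interval_s: float) -> float:
    """Packet rate in Mpps between two counter samples taken interval_s apart."""
    return (after - before) / interval_s / 1e6


# Hypothetical samples taken one second apart.
SAMPLE_T0 = "NIC statistics:\n     rx_packets_phy: 1000\n     rx_bytes_phy: 64000\n"
SAMPLE_T1 = "NIC statistics:\n     rx_packets_phy: 50001000\n     rx_bytes_phy: 3200064000\n"

print(rate_mpps(parse_counter(SAMPLE_T0), parse_counter(SAMPLE_T1), 1.0))
```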
Environment: Ideal flow on each pktgen server (64Byte)
[Figure: the architecture diagram annotated with the ideal rates generated by one pktgen server: ① 33.6Gbps, ② 67.2Gbps, ③ 33.6Gbps]
Environment: Ideal flow (64Byte)
[Figure: the architecture diagram annotated with the ideal rates when both pktgen servers send: ① 33.6Gbps, ② 67.2Gbps, ③ 67.2Gbps]
Baremetal performance: Configuration
● BIOS
○ Hyper-Threading: OFF
● Boot parameters
○ intel_idle.max_cstate=0 processor.max_cstate=0 intel_pstate=disable nohz_full=1-27 rcu_nocbs=1-27 rcu_nocb_poll audit=0 nosoftlockup default_hugepagesz=1G hugepagesz=1G hugepages=32 isolcpus=1-27
● Mellanox
○ CQE_COMPRESSION: AGGRESSIVE(1)
○ SRIOV_EN: False(0)
● Ports
○ 2 PFs (only port0 on each NIC)
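Boot parameters such as isolcpus and nohz_full take CPU lists like "1-27". As a small sketch of how such a command line could be inspected, the helpers below split a command line into key/value pairs and expand a CPU list. The names are hypothetical, and the sample string is a subset of the parameters on the slide.

```python
def parse_cmdline(cmdline: str) -> dict[str, str]:
    """Split a kernel command line into {key: value}; bare flags map to ''."""
    params = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        params[key] = value
    return params


def expand_cpu_list(spec: str) -> list[int]:
    """Expand a CPU list such as '1-3,8' into [1, 2, 3, 8]."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


CMDLINE = ("nohz_full=1-27 rcu_nocbs=1-27 audit=0 nosoftlockup "
           "default_hugepagesz=1G hugepagesz=1G hugepages=32 isolcpus=1-27")

params = parse_cmdline(CMDLINE)
isolated = expand_cpu_list(params["isolcpus"])
print(len(isolated), isolated[0], isolated[-1])
print(params["hugepages"], params["hugepagesz"])
```

With these values, CPU 0 stays available to the kernel and housekeeping tasks while CPUs 1 to 27 are isolated.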
Baremetal performance: Result
VM + VF performance: Host Configuration
● BIOS
○ Hyper-Threading: OFF
○ VT-d: ON
● Host boot parameters
○ intel_idle.max_cstate=0 processor.max_cstate=0 intel_pstate=disable nohz_full=1-27 rcu_nocbs=1-27 rcu_nocb_poll audit=0 nosoftlockup default_hugepagesz=1G hugepagesz=1G hugepages=32 isolcpus=1-27 intel_iommu=on
● Mellanox
○ CQE_COMPRESSION: AGGRESSIVE(1)
○ SRIOV_EN: True(1)
○ NUM_OF_VFS: 1
VM + VF performance: Guest Configuration
● Flavor
○ vCPUs: 27
○ Memory: 32GB
○ extra_specs:
■ hw:cpu_policy: dedicated
■ hw:mem_page_size: 1048576
■ hw:numa_mem.0: 16384
■ hw:numa_mem.1: 16384
■ hw:numa_cpus.0: 0-13
■ hw:numa_cpus.1: 14-26
■ hw:numa_nodes: 2
● Ports
○ 2 VFs (vf 0 on each NIC port0)
● Guest boot parameters
○ intel_idle.max_cstate=0 processor.max_cstate=0 intel_pstate=disable nohz_full=1-26 rcu_nocbs=1-26 rcu_nocb_poll audit=0 nosoftlockup default_hugepagesz=1G hugepagesz=1G hugepages=16 isolcpus=1-26
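The flavor above splits 27 vCPUs and 32GB of memory across two guest NUMA nodes. The sketch below checks that such a layout is self-consistent: every vCPU is assigned to exactly one node, and the per-node memory adds up to the flavor total. It assumes 32GB means 32768 MB and that hw:numa_mem values are in MB; the checker is a hypothetical helper, not a Nova API.

```python
def expand(spec: str) -> list[int]:
    """Expand a vCPU list such as '0-13' or '0,2-3' into a list of ints."""
    cpus = []
    for part in spec.split(","):
        low, _, high = part.partition("-")
        cpus.extend(range(int(low), int(high or low) + 1))
    return cpus


def check_numa_layout(vcpus: int, ram_mb: int, specs: dict[str, str]) -> list[str]:
    """Return a list of problems found in the guest NUMA extra specs."""
    problems = []
    assigned, total_mem = [], 0
    for node in range(int(specs["hw:numa_nodes"])):
        assigned += expand(specs[f"hw:numa_cpus.{node}"])
        total_mem += int(specs[f"hw:numa_mem.{node}"])
    if sorted(assigned) != list(range(vcpus)):
        problems.append("vCPUs are not each assigned to exactly one NUMA node")
    if total_mem != ram_mb:
        problems.append("per-node memory does not add up to flavor memory")
    return problems


# Extra specs copied from the slide.
EXTRA_SPECS = {
    "hw:cpu_policy": "dedicated",
    "hw:mem_page_size": "1048576",
    "hw:numa_nodes": "2",
    "hw:numa_cpus.0": "0-13",
    "hw:numa_cpus.1": "14-26",
    "hw:numa_mem.0": "16384",
    "hw:numa_mem.1": "16384",
}

print(check_numa_layout(27, 32768, EXTRA_SPECS))
```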
VM + VF performance: Result
VM + PF performance: Host Configuration
● BIOS
○ Hyper-Threading: OFF
○ VT-d: ON
● Host boot parameters
○ intel_idle.max_cstate=0 processor.max_cstate=0 intel_pstate=disable nohz_full=1-27 rcu_nocbs=1-27 rcu_nocb_poll audit=0 nosoftlockup default_hugepagesz=1G hugepagesz=1G hugepages=32 isolcpus=1-27 intel_iommu=on
● Mellanox
○ CQE_COMPRESSION: AGGRESSIVE(1)
○ SRIOV_EN: False(0)
VM + PF performance: Guest Configuration
● Flavor
○ vCPUs: 27
○ Memory: 32GB
○ extra_specs:
■ hw:cpu_policy: dedicated
■ hw:mem_page_size: 1048576
■ hw:numa_mem.0: 16384
■ hw:numa_mem.1: 16384
■ hw:numa_cpus.0: 0-13
■ hw:numa_cpus.1: 14-26
■ hw:numa_nodes: 2
● Ports
○ 2 PFs (only port0 on each NIC with PCI-Passthrough)
● Guest boot parameters
○ intel_idle.max_cstate=0 processor.max_cstate=0 intel_pstate=disable nohz_full=1-26 rcu_nocbs=1-26 rcu_nocb_poll audit=0 nosoftlockup default_hugepagesz=1G hugepagesz=1G hugepages=16 isolcpus=1-26
VM + PF performance: Result
Baremetal (VF exists) performance: Configuration
● BIOS
○ Hyper-Threading: OFF
● Boot parameters
○ intel_idle.max_cstate=0 processor.max_cstate=0 intel_pstate=disable nohz_full=1-27 rcu_nocbs=1-27 rcu_nocb_poll audit=0 nosoftlockup default_hugepagesz=1G hugepagesz=1G hugepages=32 isolcpus=1-27
● Mellanox
○ CQE_COMPRESSION: AGGRESSIVE(1)
○ SRIOV_EN: True(1)
○ NUM_OF_VFS: 1
● Ports
○ 2 PFs (only port0 on each NIC)
○ 2 VFs (vf 0 on each NIC port0) exist [not used]
Baremetal (VF exists) performance: Result
All Results
Conclusion
● OpenStack functions for NFV work fine
○ SR-IOV port assignment
○ NUMA awareness
○ vCPU pinning
○ Hugepages
○ CPU Feature
● KVM + Intel VT achieve close to baremetal performance
● SR-IOV performance evaluation is required
○ SR-IOV device implementation depends on its vendor
Conclusion
● Our decision
○ VM + PF is a powerful option
■ SR-IOV advantage
● Multiple VF can be created
○ Router
○ Firewall
○ Load balancer
○ ...
■ A 100G router consumes almost all host resources
● "1 Host: 1 VM" is a realistic option
○ so many ports are not needed
Thank you!
References
● SR-IOV
○ https://docs.openstack.org/ocata/networking-guide/config-sriov.html
● How to enable SR-IOV with Mellanox NIC
○ https://community.mellanox.com/docs/DOC-2386
● Hugepages
○ https://www.mirantis.com/blog/mirantis-openstack-7-0-nfvi-deployment-guide-huge-pages/
● isolcpu & cpupinning
○ https://docs.mirantis.com/mcp/1.0/mcp-deployment-guide/enable-numa-and-cpu-pinning/enable-numa-and-cpu-pinning-procedure.html
● NUMA
○ https://docs.openstack.org/nova/pike/admin/cpu-topologies.html

Más contenido relacionado

La actualidad más candente

How VXLAN works on Linux
How VXLAN works on LinuxHow VXLAN works on Linux
How VXLAN works on Linux
Etsuji Nakai
 
OpenStack networking
OpenStack networkingOpenStack networking
OpenStack networking
Sim Janghoon
 
MP BGP-EVPN 실전기술-1편(개념잡기)
MP BGP-EVPN 실전기술-1편(개념잡기)MP BGP-EVPN 실전기술-1편(개념잡기)
MP BGP-EVPN 실전기술-1편(개념잡기)
JuHwan Lee
 

La actualidad más candente (20)

ML2/OVN アーキテクチャ概観
ML2/OVN アーキテクチャ概観ML2/OVN アーキテクチャ概観
ML2/OVN アーキテクチャ概観
 
Topology Managerについて / Kubernetes Meetup Tokyo 50
Topology Managerについて / Kubernetes Meetup Tokyo 50Topology Managerについて / Kubernetes Meetup Tokyo 50
Topology Managerについて / Kubernetes Meetup Tokyo 50
 
EBPF and Linux Networking
EBPF and Linux NetworkingEBPF and Linux Networking
EBPF and Linux Networking
 
Ifupdown2: Network Interface Manager
Ifupdown2: Network Interface ManagerIfupdown2: Network Interface Manager
Ifupdown2: Network Interface Manager
 
How VXLAN works on Linux
How VXLAN works on LinuxHow VXLAN works on Linux
How VXLAN works on Linux
 
L2/L3 für Fortgeschrittene - Helle und dunkle Magie im Linux-Netzwerkstack
L2/L3 für Fortgeschrittene - Helle und dunkle Magie im Linux-NetzwerkstackL2/L3 für Fortgeschrittene - Helle und dunkle Magie im Linux-Netzwerkstack
L2/L3 für Fortgeschrittene - Helle und dunkle Magie im Linux-Netzwerkstack
 
SRv6 study
SRv6 studySRv6 study
SRv6 study
 
Deploying IPv6 in OpenStack Environments
Deploying IPv6 in OpenStack EnvironmentsDeploying IPv6 in OpenStack Environments
Deploying IPv6 in OpenStack Environments
 
Ovs dpdk hwoffload way to full offload
Ovs dpdk hwoffload way to full offloadOvs dpdk hwoffload way to full offload
Ovs dpdk hwoffload way to full offload
 
DNSキャッシュサーバ チューニングの勘所
DNSキャッシュサーバ チューニングの勘所DNSキャッシュサーバ チューニングの勘所
DNSキャッシュサーバ チューニングの勘所
 
Open vSwitch 패킷 처리 구조
Open vSwitch 패킷 처리 구조Open vSwitch 패킷 처리 구조
Open vSwitch 패킷 처리 구조
 
OpenStack networking
OpenStack networkingOpenStack networking
OpenStack networking
 
フロー技術によるネットワーク管理
フロー技術によるネットワーク管理フロー技術によるネットワーク管理
フロー技術によるネットワーク管理
 
DPDK in Containers Hands-on Lab
DPDK in Containers Hands-on LabDPDK in Containers Hands-on Lab
DPDK in Containers Hands-on Lab
 
OpenStack超入門シリーズ いまさら聞けないNeutronの使い方
OpenStack超入門シリーズ いまさら聞けないNeutronの使い方OpenStack超入門シリーズ いまさら聞けないNeutronの使い方
OpenStack超入門シリーズ いまさら聞けないNeutronの使い方
 
OpenStackで始めるクラウド環境構築入門(Horizon 基礎編)
OpenStackで始めるクラウド環境構築入門(Horizon 基礎編)OpenStackで始めるクラウド環境構築入門(Horizon 基礎編)
OpenStackで始めるクラウド環境構築入門(Horizon 基礎編)
 
DRBD/Heartbeat/Pacemakerで作るKVM仮想化クラスタ
DRBD/Heartbeat/Pacemakerで作るKVM仮想化クラスタDRBD/Heartbeat/Pacemakerで作るKVM仮想化クラスタ
DRBD/Heartbeat/Pacemakerで作るKVM仮想化クラスタ
 
Under the Hood: Open vSwitch & OpenFlow in XCP & XenServer
Under the Hood: Open vSwitch & OpenFlow in XCP & XenServerUnder the Hood: Open vSwitch & OpenFlow in XCP & XenServer
Under the Hood: Open vSwitch & OpenFlow in XCP & XenServer
 
AS45679 on FreeBSD
AS45679 on FreeBSDAS45679 on FreeBSD
AS45679 on FreeBSD
 
MP BGP-EVPN 실전기술-1편(개념잡기)
MP BGP-EVPN 실전기술-1편(개념잡기)MP BGP-EVPN 실전기술-1편(개념잡기)
MP BGP-EVPN 실전기술-1편(개념잡기)
 

Similar a 100Gbps OpenStack For Providing High-Performance NFV

OSS-10mins-7th2.pptx
OSS-10mins-7th2.pptxOSS-10mins-7th2.pptx
OSS-10mins-7th2.pptx
jagmohan33
 
Storage-Performance-Tuning-for-FAST-Virtual-Machines_Fam-Zheng.pdf
Storage-Performance-Tuning-for-FAST-Virtual-Machines_Fam-Zheng.pdfStorage-Performance-Tuning-for-FAST-Virtual-Machines_Fam-Zheng.pdf
Storage-Performance-Tuning-for-FAST-Virtual-Machines_Fam-Zheng.pdf
aaajjj4
 

Similar a 100Gbps OpenStack For Providing High-Performance NFV (20)

Known basic of NFV Features
Known basic of NFV FeaturesKnown basic of NFV Features
Known basic of NFV Features
 
Measuring a 25 and 40Gb/s Data Plane
Measuring a 25 and 40Gb/s Data PlaneMeasuring a 25 and 40Gb/s Data Plane
Measuring a 25 and 40Gb/s Data Plane
 
Achieving the Ultimate Performance with KVM
Achieving the Ultimate Performance with KVMAchieving the Ultimate Performance with KVM
Achieving the Ultimate Performance with KVM
 
Achieving the Ultimate Performance with KVM
Achieving the Ultimate Performance with KVMAchieving the Ultimate Performance with KVM
Achieving the Ultimate Performance with KVM
 
Libvirt/KVM Driver Update (Kilo)
Libvirt/KVM Driver Update (Kilo)Libvirt/KVM Driver Update (Kilo)
Libvirt/KVM Driver Update (Kilo)
 
[OpenStack Day in Korea 2015] Track 1-6 - 갈라파고스의 이구아나, 인프라에 오픈소스를 올리다. 그래서 보이...
[OpenStack Day in Korea 2015] Track 1-6 - 갈라파고스의 이구아나, 인프라에 오픈소스를 올리다. 그래서 보이...[OpenStack Day in Korea 2015] Track 1-6 - 갈라파고스의 이구아나, 인프라에 오픈소스를 올리다. 그래서 보이...
[OpenStack Day in Korea 2015] Track 1-6 - 갈라파고스의 이구아나, 인프라에 오픈소스를 올리다. 그래서 보이...
 
Intel's Out of the Box Network Developers Ireland Meetup on March 29 2017 - ...
Intel's Out of the Box Network Developers Ireland Meetup on March 29 2017  - ...Intel's Out of the Box Network Developers Ireland Meetup on March 29 2017  - ...
Intel's Out of the Box Network Developers Ireland Meetup on March 29 2017 - ...
 
XS Boston 2008 Network Topology
XS Boston 2008 Network TopologyXS Boston 2008 Network Topology
XS Boston 2008 Network Topology
 
OSS-10mins-7th2.pptx
OSS-10mins-7th2.pptxOSS-10mins-7th2.pptx
OSS-10mins-7th2.pptx
 
Approaching hyperconvergedopenstack
Approaching hyperconvergedopenstackApproaching hyperconvergedopenstack
Approaching hyperconvergedopenstack
 
SoC Idling for unconf COSCUP 2016
SoC Idling for unconf COSCUP 2016SoC Idling for unconf COSCUP 2016
SoC Idling for unconf COSCUP 2016
 
Experiences with Power 9 at A*STAR CRC
Experiences with Power 9 at A*STAR CRCExperiences with Power 9 at A*STAR CRC
Experiences with Power 9 at A*STAR CRC
 
Building your own CGN boxes with Linux
Building your own CGN boxes with LinuxBuilding your own CGN boxes with Linux
Building your own CGN boxes with Linux
 
DPDK Summit - 08 Sept 2014 - NTT - High Performance vSwitch
DPDK Summit - 08 Sept 2014 - NTT - High Performance vSwitchDPDK Summit - 08 Sept 2014 - NTT - High Performance vSwitch
DPDK Summit - 08 Sept 2014 - NTT - High Performance vSwitch
 
Introduction to DPDK
Introduction to DPDKIntroduction to DPDK
Introduction to DPDK
 
Benefits of Multi-rail Cluster Architectures for GPU-based Nodes
Benefits of Multi-rail Cluster Architectures for GPU-based NodesBenefits of Multi-rail Cluster Architectures for GPU-based Nodes
Benefits of Multi-rail Cluster Architectures for GPU-based Nodes
 
Storage-Performance-Tuning-for-FAST-Virtual-Machines_Fam-Zheng.pdf
Storage-Performance-Tuning-for-FAST-Virtual-Machines_Fam-Zheng.pdfStorage-Performance-Tuning-for-FAST-Virtual-Machines_Fam-Zheng.pdf
Storage-Performance-Tuning-for-FAST-Virtual-Machines_Fam-Zheng.pdf
 
DockerCon EU '17 - Dockerizing Aurea
DockerCon EU '17 - Dockerizing AureaDockerCon EU '17 - Dockerizing Aurea
DockerCon EU '17 - Dockerizing Aurea
 
Achieving the ultimate performance with KVM
Achieving the ultimate performance with KVMAchieving the ultimate performance with KVM
Achieving the ultimate performance with KVM
 
Can we boost more HPC performance? Integrate IBM POWER servers with GPUs to O...
Can we boost more HPC performance? Integrate IBM POWER servers with GPUs to O...Can we boost more HPC performance? Integrate IBM POWER servers with GPUs to O...
Can we boost more HPC performance? Integrate IBM POWER servers with GPUs to O...
 

Más de NTT Communications Technology Development

Más de NTT Communications Technology Development (20)

クラウドを最大限活用するinfrastructure as codeを考えよう
クラウドを最大限活用するinfrastructure as codeを考えようクラウドを最大限活用するinfrastructure as codeを考えよう
クラウドを最大限活用するinfrastructure as codeを考えよう
 
【たぶん日本初導入!】Azure Stack Hub with GPUの性能と機能紹介
【たぶん日本初導入!】Azure Stack Hub with GPUの性能と機能紹介【たぶん日本初導入!】Azure Stack Hub with GPUの性能と機能紹介
【たぶん日本初導入!】Azure Stack Hub with GPUの性能と機能紹介
 
macOSの仮想化技術について ~Virtualization-rs Rust bindings for virtualization.framework ~
macOSの仮想化技術について ~Virtualization-rs Rust bindings for virtualization.framework ~macOSの仮想化技術について ~Virtualization-rs Rust bindings for virtualization.framework ~
macOSの仮想化技術について ~Virtualization-rs Rust bindings for virtualization.framework ~
 
マルチクラウドでContinuous Deliveryを実現するSpinnakerについて
マルチクラウドでContinuous Deliveryを実現するSpinnakerについて マルチクラウドでContinuous Deliveryを実現するSpinnakerについて
マルチクラウドでContinuous Deliveryを実現するSpinnakerについて
 
Argo CDについて
Argo CDについてArgo CDについて
Argo CDについて
 
SpinnakerとKayentaで 高速・安全なデプロイ!
SpinnakerとKayentaで 高速・安全なデプロイ!SpinnakerとKayentaで 高速・安全なデプロイ!
SpinnakerとKayentaで 高速・安全なデプロイ!
 
AWS re:Invent2017で見た AWSの強さとは
AWS re:Invent2017で見た AWSの強さとは AWS re:Invent2017で見た AWSの強さとは
AWS re:Invent2017で見た AWSの強さとは
 
分散トレーシング技術について(Open tracingやjaeger)
分散トレーシング技術について(Open tracingやjaeger)分散トレーシング技術について(Open tracingやjaeger)
分散トレーシング技術について(Open tracingやjaeger)
 
Mexico ops meetup発表資料 20170905
Mexico ops meetup発表資料 20170905Mexico ops meetup発表資料 20170905
Mexico ops meetup発表資料 20170905
 
NTT Tech Conference #2 - closing -
NTT Tech Conference #2 - closing -NTT Tech Conference #2 - closing -
NTT Tech Conference #2 - closing -
 
イケてない開発チームがイケてる開発を始めようとする軌跡
イケてない開発チームがイケてる開発を始めようとする軌跡イケてない開発チームがイケてる開発を始めようとする軌跡
イケてない開発チームがイケてる開発を始めようとする軌跡
 
GPU Container as a Service を実現するための最新OSS徹底比較
GPU Container as a Service を実現するための最新OSS徹底比較GPU Container as a Service を実現するための最新OSS徹底比較
GPU Container as a Service を実現するための最新OSS徹底比較
 
SpinnakerとOpenStackの構築
SpinnakerとOpenStackの構築SpinnakerとOpenStackの構築
SpinnakerとOpenStackの構築
 
Troveコミュニティ動向
Troveコミュニティ動向Troveコミュニティ動向
Troveコミュニティ動向
 
Web rtc for iot, edge computing use cases
Web rtc for iot, edge computing use casesWeb rtc for iot, edge computing use cases
Web rtc for iot, edge computing use cases
 
OpenStack Ops Mid-Cycle Meetup & Project Team Gathering出張報告
OpenStack Ops Mid-Cycle Meetup & Project Team Gathering出張報告OpenStack Ops Mid-Cycle Meetup & Project Team Gathering出張報告
OpenStack Ops Mid-Cycle Meetup & Project Team Gathering出張報告
 
NTT Tech Conference #1 Opening Keynote
NTT Tech Conference #1 Opening KeynoteNTT Tech Conference #1 Opening Keynote
NTT Tech Conference #1 Opening Keynote
 
NTT Tech Conference #1 Closing Keynote
NTT Tech Conference #1 Closing KeynoteNTT Tech Conference #1 Closing Keynote
NTT Tech Conference #1 Closing Keynote
 
OpsからみたOpenStack Summit
OpsからみたOpenStack SummitOpsからみたOpenStack Summit
OpsからみたOpenStack Summit
 
RabbitMQ can scale out!!(jp ops-workshop-3)
RabbitMQ can scale out!!(jp ops-workshop-3)RabbitMQ can scale out!!(jp ops-workshop-3)
RabbitMQ can scale out!!(jp ops-workshop-3)
 

Último

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Último (20)

Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
100Gbps OpenStack For Providing High-Performance NFV

  • 1. Copyright © NTT Communications Corporation. Transform your business, transcend expectations with our technologically advanced solutions. 100Gbps OpenStack For Providing High-Performance NFV Takeaki Matsumoto
  • 2. Agenda
    ● Background
    ● Goal / Actions
    ● Kamuee (Software router)
    ● DPDK application on OpenStack
    ● Benchmark
    ● Conclusion
  • 3. Self-Introduction
    Takeaki Matsumoto
    takeaki.matsumoto@ntt.com
    NTT Communications
    Technology Development
    R&D for OpenStack
    Ops for Private Cloud
  • 4. Background
    ● NTT Communications
      ○ A global Tier-1 ISP in 196 countries/regions
      ○ Over 150 datacenters worldwide
    ● Problems
      ○ Costs
        ■ Spending 1M+ USD on each core router
      ○ Flexibility
        ■ Takes a long time to add a router, orchestrate, roll back...
  • 5. Goal / Actions
    ● Goal
      ○ Cheaper and more flexible router with 100Gbps performance
    ● Actions
      ○ Research & verify software router requirements
      ○ Check the OpenStack functions for NFV
      ○ Benchmark the software router performance with OpenStack
  • 6. Kamuee
    ● Software router with 100Gbps+ throughput (on baremetal)
      ○ Developed by NTT Communications
      ○ 146Gbps with 610K+ IPv4 routes and 128Byte packets
      ○ Technologies used
        ■ DPDK
        ■ Poptrie
        ■ RCU
      ○ Achieving 100Gbps Performance at Core with Poptrie and Kamuee Zero
        https://www.youtube.com/watch?v=OhHv3O1H8-w
  • 7. Requirements
    ● High-performance NFV requirements
      ○ High-bandwidth network port
      ○ Low-latency communication NIC-to-CPU
      ○ Dedicated CPU cores
      ○ Hugepages
      ○ CPU features
  • 8. Agenda
    ● Background
    ● Goal / Actions
    ● Kamuee (Software router)
    ● DPDK application on OpenStack
      ○ SR-IOV
      ○ NUMA
      ○ vCPU pinning
      ○ Hugepages
      ○ CPU feature
    ● Benchmark
    ● Conclusion
  • 9. DPDK application on OpenStack
    [Figure: compute host with two NUMA nodes, each with CPUs, hugepage-backed memory, and a NIC exposing VFs; a VM with vCPUs, a VF, and hugepage-backed memory]
  • 10. SR-IOV
    [Figure: compute host with two NUMA nodes, each with CPUs, hugepage-backed memory, and a NIC exposing VFs; a VM with vCPUs, a VF, and hugepage-backed memory]
  • 11. SR-IOV
    ● What is SR-IOV?
      ○ Hardware-level virtualization on supported NICs
      ○ An SR-IOV device has
        ■ Physical Function (PF)
          ● Normal NIC device (1 device/physical port)
        ■ Virtual Function (VF)
          ● Virtual NIC device derived from the PF
          ● Can be created up to the NIC's limit
    [Figure: one NIC exposing a PF and multiple VFs]
  • 12. SR-IOV
    ● Why is SR-IOV needed?
      ○ The vSwitch can become a bottleneck on a high-performance network
      ○ SR-IOV puts no load on the host CPU
    [Figure: typical implementation (VM vNIC attached to a software vSwitch on top of the NIC) compared with SR-IOV (VM attached directly to a VF of the NIC via PCI passthrough)]
  • 13. SR-IOV
    ● OpenStack supports SR-IOV
      ○ A VF can be used as a Neutron port
      ○ The instance gets the VF directly via PCI passthrough
    $ neutron port-create $net_id --name sriov_port --binding:vnic_type direct
    $ openstack server create --flavor m1.large --image ubuntu_14.04 --nic port-id=$port_id sriov-server
    ubuntu@sriov-server $ lspci | grep Ethernet
    00:05.0 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
  • 14. NUMA
    [Figure: compute host with two NUMA nodes, each with CPUs, hugepage-backed memory, and a NIC exposing VFs; a VM with vCPUs, a VF, and hugepage-backed memory]
  • 15. NUMA
    ● What is NUMA?
      ○ Non-Uniform Memory Access
      ○ A server usually has multiple NUMA nodes, typically one per CPU socket
      ○ CPU cores, memory, and PCI devices each belong to a NUMA node
      ○ For low latency, the NUMA topology has to be taken into account
    [Figure: two NUMA nodes (sockets), each with its own NIC, memory, and CPUs; the interconnect between them has overhead]
  • 16. NUMA
    ● OpenStack has NUMATopologyFilter
      ○ Schedules VMs with the NUMA topology in mind
      ○ When hugepages or CPU pinning are used, the VM is automatically launched on a single NUMA node
      ○ A VM can also use 2 NUMA nodes
    $ openstack flavor set m1.large --property hw:numa_nodes=1
    $ openstack flavor set m1.large --property hw:numa_nodes=2
  • 17. vCPU pinning
    [Figure: compute host with two NUMA nodes, each with CPUs, hugepage-backed memory, and a NIC exposing VFs; a VM with vCPUs, a VF, and hugepage-backed memory]
  • 18. vCPU pinning
    ● What is vCPU pinning?
      ○ vCPU:pCPU=1:1 dedicated allocation
        ■ Reduces context switching
    [Figure: each vCPU mapped to its own pCPU (dedicated for vCPUs); a separate pCPU runs nova-compute and other Linux processes]
  • 19. vCPU pinning
    ● OpenStack flavor has extra spec "hw:cpu_policy"
      ○ Enables vCPU pinning
    $ openstack flavor set m1.large --property hw:cpu_policy=dedicated
    Default allocation:
    $ virsh vcpupin instance-00000001
    VCPU: CPU Affinity
    ----------------------------------
       0: 0-31
       1: 0-31
       2: 0-31
       3: 0-31
       4: 0-31
       5: 0-31
       6: 0-31
       7: 0-31
       8: 0-31
    vCPU pinning:
    $ virsh vcpupin instance-00000002
    VCPU: CPU Affinity
    ----------------------------------
       0: 1
       1: 2
       2: 3
       3: 4
       4: 5
       5: 6
       6: 7
       7: 8
       8: 9
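
The difference between the two `virsh vcpupin` listings can also be checked programmatically. The following is a minimal sketch, not part of the original deck: the helper names `parse_vcpupin` and `is_pinned` are hypothetical, and the sample strings are shortened copies of the slide output.

```python
# Sketch: decide from `virsh vcpupin` text whether vCPUs are pinned 1:1.
# Function names are illustrative; samples are abridged from the slide.

PINNED_SAMPLE = """VCPU: CPU Affinity
----------------------------------
   0: 1
   1: 2
   2: 3
"""

DEFAULT_SAMPLE = """VCPU: CPU Affinity
----------------------------------
   0: 0-31
   1: 0-31
   2: 0-31
"""


def parse_vcpupin(output):
    """Map each vCPU index to its affinity string."""
    affinity = {}
    for line in output.splitlines():
        left, sep, right = line.partition(":")
        # Skip the header ("VCPU: ...") and the dashed separator line.
        if sep and left.strip().isdigit():
            affinity[int(left.strip())] = right.strip()
    return affinity


def is_pinned(affinity):
    """True when every vCPU maps to exactly one pCPU and no pCPU is shared."""
    cpus = list(affinity.values())
    all_single = all(cpu.isdigit() for cpu in cpus)
    return bool(cpus) and all_single and len(set(cpus)) == len(cpus)


if __name__ == "__main__":
    print("pinned sample: ", is_pinned(parse_vcpupin(PINNED_SAMPLE)))
    print("default sample:", is_pinned(parse_vcpupin(DEFAULT_SAMPLE)))
```

A range such as `0-31` means the vCPU thread may float across all host CPUs, which is what `hw:cpu_policy=dedicated` is meant to prevent.
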
  • 20. Hugepages
    [Figure: compute host with two NUMA nodes, each with CPUs, hugepage-backed memory, and a NIC exposing VFs; a VM with vCPUs, a VF, and hugepage-backed memory]
  • 21. Hugepages
    ● What are hugepages?
      ○ Memory pages larger than the default 4KB size
        ■ Reduces TLB misses
        ■ DPDK applications usually use hugepages
    [Figure: virtual pages mapped to physical pages; a virtual address is translated through the TLB, and a TLB miss falls back to the page table]
  • 22. Hugepages
    ● OpenStack flavor has extra spec "hw:mem_page_size"
      ○ Enables hugepages and assigns them to the guest
    $ openstack flavor set m1.large --property hw:mem_page_size=1048576
    $ cat /etc/libvirt/qemu/instance-00000002.xml | grep hugepages -1
    <memoryBacking>
      <hugepages>
        <page size='1048576' unit='KiB' />
      </hugepages>
    </memoryBacking>
    $ cat /proc/meminfo | grep Hugepagesize
    Hugepagesize: 1048576 kB
  • 23. Other CPU features
    ● Optimization features for DPDK
      ○ SSSE3, SSE4, ...
    ● "[libvirt] cpu_mode" option in nova.conf
      ○ In some distributions, the default is none
      ○ host-model, host-passthrough, or custom is required
    cpu_mode=none:
    $ cat /proc/cpuinfo | grep -e "model name" -e flags
    model name : QEMU Virtual CPU version 2.0.0
    flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni vmx cx16 x2apic popcnt hypervisor lahf_lm vnmi ept
    cpu_mode=host-model:
    $ cat /proc/cpuinfo | grep -e "model name" -e flags
    model name : Intel Core Processor (Broadwell)
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology eagerfpu pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt arat
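
To make the comparison of the two flag lists concrete, a guest could be checked for the SIMD flags before starting a DPDK application. This is a minimal sketch, not part of the original deck: the helper `missing_flags` is hypothetical, the required set `ssse3, sse4_1, sse4_2` is only an assumed reading of the slide's "SSSE3, SSE4, ...", and the host-model sample is abridged.

```python
# Sketch: report which assumed-required CPU flags a guest is missing.
# REQUIRED_FLAGS is an illustrative assumption, not a DPDK-defined list.
REQUIRED_FLAGS = ("ssse3", "sse4_1", "sse4_2")

# Flags line as shown on the slide for cpu_mode=none.
CPUINFO_NONE = (
    "model name : QEMU Virtual CPU version 2.0.0\n"
    "flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 "
    "clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni vmx cx16 "
    "x2apic popcnt hypervisor lahf_lm vnmi ept\n"
)

# Abridged flags line for cpu_mode=host-model (subset of the slide output).
CPUINFO_HOST_MODEL = (
    "model name : Intel Core Processor (Broadwell)\n"
    "flags : fpu vme de pse sse sse2 pni ssse3 fma cx16 sse4_1 sse4_2 avx avx2\n"
)


def missing_flags(cpuinfo_text, required=REQUIRED_FLAGS):
    """Return the required flags absent from the first 'flags' line, sorted."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return sorted(set(required) - set(value.split()))
    # No flags line found: treat every required flag as missing.
    return sorted(required)


if __name__ == "__main__":
    print("cpu_mode=none missing:      ", missing_flags(CPUINFO_NONE))
    print("cpu_mode=host-model missing:", missing_flags(CPUINFO_HOST_MODEL))
```

On a real guest the text would come from reading `/proc/cpuinfo` instead of the embedded samples.
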
  • 24. Agenda
    ● Background
    ● Goal / Actions
    ● Kamuee (Software router)
    ● DPDK application on OpenStack
    ● Benchmark
      ○ Environment
      ○ Baremetal performance
      ○ VM + VF performance
      ○ VM + PF performance
      ○ Baremetal (VF exists) performance
  • 25. Environment: Hardware
    ● Server
      ○ Dell PowerEdge R630
        ■ Intel® Xeon® CPU E5-2690 v4 @ 2.60GHz (14 cores) * 2
        ■ DDR4 256GB (32GB * 8)
        ■ Ubuntu 16.04
    ● NIC
      ○ Mellanox ConnectX-4 100Gb/s Dual-Port Adapter
        ■ 1 PCIe card, 100G ports * 2
    ● Switch
      ○ Dell Networking Z9100
        ■ Cumulus Linux 3.2.0
        ■ 100Gbps port * 32
  • 26. Environment: Architecture
    [Figure: test topology. Nodes: pktgen-dpdk 0, pktgen-dpdk 1, Switch (VLAN100, VLAN200), Kamuee (NIC 0, NIC 1), nexthop 0, nexthop 1. Labels: "ConnectX-4 100G 2 port (using 2 ports)" and "ConnectX-4 100G 2 port * 2 (using only 1 port each)". Traffic paths are numbered ① to ④. Each line is a 100G link.]
  • 27. Environment: pktgen-dpdk
    ● Open source packet generator
      ○ Output: about 100Mpps ≈ 67.2Gbps/server (64Byte packets)
        ■ 50Mpps/port
      ○ dst mac
        ■ kamuee NIC0 port0 (port0-1 on pktgen-dpdk 0)
        ■ kamuee NIC1 port0 (port0-1 on pktgen-dpdk 1)
      ○ dst ip (range)
        ■ 1.0.0.1-254 (port0 on each server)
        ■ 1.0.4.1-254 (port1 on each server)
      ○ dst TCP port (range)
        ■ 1-254 (port0 on each server)
        ■ 256-510 (port1 on each server)
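
The slide does not say how 100Mpps maps to 67.2Gbps. The figures are consistent with counting the standard 20 bytes of per-frame Ethernet wire overhead (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap) on top of each 64-byte frame. The sketch below works through that arithmetic; the function names are illustrative only.

```python
# Sketch: convert packet rate to on-the-wire bandwidth for Ethernet.
# Assumption: 20 bytes of wire overhead per frame
# (7 preamble + 1 start-of-frame delimiter + 12 inter-frame gap).
WIRE_OVERHEAD_BYTES = 20


def wire_gbps(mpps, frame_bytes):
    """Wire bandwidth in Gbps for a packet rate in Mpps."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return mpps * bits_per_frame / 1000.0


def line_rate_mpps(link_gbps, frame_bytes):
    """Maximum packet rate in Mpps a link can carry at a given frame size."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_gbps * 1000.0 / bits_per_frame


if __name__ == "__main__":
    print(f"{wire_gbps(100, 64):.1f} Gbps per server")          # 67.2
    print(f"{wire_gbps(50, 64):.1f} Gbps per port")             # 33.6
    print(f"{line_rate_mpps(100, 64):.1f} Mpps at 100G, 64B")   # 148.8
```

Under this assumption, 50Mpps per port corresponds to 33.6Gbps, which matches the per-port figure on the ideal-flow slides, and a 100G link tops out at about 148.8Mpps with 64-byte frames.
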
  • 28. Environment: Kamuee
    ● DPDK software router
    ● Spec configuration
      ○ 2 NUMA nodes
      ○ Using 26 cores
        ■ Forwarding: 12 cores/port * 2 (each NUMA)
        ■ Other functions: 2 cores
      ○ Using 16GB memory
        ■ 1GB hugepages * 8 * 2 (each NUMA)
      ○ 2 NICs
        ■ Only port 0 is used * 2 (each NUMA)
  • 29. Environment: Kamuee
    ● Routing configuration
      ○ 518K routes (similar to a full route table) loaded
        ■ Forwarding to nexthop server
    ● DPDK EAL options
      ○ ./kamuee -n 4 --socket-mem 8192,8192 -w 0000:00:05.0,txq_inline=128 -w 0000:00:06.0,txq_inline=128
    kamuee-console> show ipv4 route
    1.0.0.0/24 nexthop: 172.21.4.105
    1.0.4.0/24 nexthop: 172.21.3.104
    ...
    kamuee-console> show ipv4 route 172.21.4.105
    172.21.4.105/32 ether: 24:8a:07:4c:2f:64 port 1
    kamuee-console> show ipv4 route 172.21.3.104
    172.21.3.104/32 ether: 24:8a:07:4c:2f:6c port 0
  • 30. Environment: nexthop
    ● Measuring RX packets
      ○ Using eth_stat.sh
        ■ https://community.mellanox.com/docs/DOC-2506#jive_content_id_How_to_Measure_Ingress_Rate
        ■ Uses "rx_packets_phy" from ethtool
          ● Hardware-level packet counter
  • 31. Environment: Ideal flow on each pktgen server (64Byte)
    [Figure: test topology (pktgen-dpdk 0/1, Switch with VLAN100 and VLAN200, Kamuee NIC 0/1, nexthop 0/1) annotated with per-path rates: ① 33.6Gbps, ② 67.2Gbps, ③ 33.6Gbps]
  • 32. Environment: Ideal flow on each pktgen server (64Byte)
    [Figure: same topology and per-path rates as slide 31: ① 33.6Gbps, ② 67.2Gbps, ③ 33.6Gbps]
  • 33. Environment: Ideal flow (64Byte)
    [Figure: test topology (pktgen-dpdk 0/1, Switch with VLAN100 and VLAN200, Kamuee NIC 0/1, nexthop 0/1) annotated with per-path rates for both pktgen servers combined: ① 33.6Gbps, ② 67.2Gbps, ③ 67.2Gbps]
  • 34. Baremetal performance: Configuration
    ● BIOS
      ○ Hyper-Threading: OFF
    ● Boot parameters
      ○ intel_idle.max_cstate=0 processor.max_cstate=0 intel_pstate=disable nohz_full=1-27 rcu_nocbs=1-27 rcu_nocb_poll audit=0 nosoftlockup default_hugepagesz=1G hugepagesz=1G hugepages=32 isolcpus=1-27
    ● Mellanox
      ○ CQE_COMPRESSION: AGGRESSIVE(1)
      ○ SRIOV_EN: False(0)
    ● Ports
      ○ 2 PFs (only port0 on each NIC)
  • 35. Baremetal performance: Result
  • 36. VM + VF performance: Host Configuration
    ● BIOS
      ○ Hyper-Threading: OFF
      ○ VT-d: ON
    ● Host boot parameters
      ○ intel_idle.max_cstate=0 processor.max_cstate=0 intel_pstate=disable nohz_full=1-27 rcu_nocbs=1-27 rcu_nocb_poll audit=0 nosoftlockup default_hugepagesz=1G hugepagesz=1G hugepages=32 isolcpus=1-27 intel_iommu=on
    ● Mellanox
      ○ CQE_COMPRESSION: AGGRESSIVE(1)
      ○ SRIOV_EN: True(1)
      ○ NUM_OF_VFS: 1
  • 37. VM + VF performance: Guest Configuration
    ● Flavor
      ○ vCPUs: 27
      ○ Memory: 32GB
      ○ extra_specs:
        ■ hw:cpu_policy: dedicated
        ■ hw:mem_page_size: 1048576
        ■ hw:numa_mem.0: 16384
        ■ hw:numa_mem.1: 16384
        ■ hw:numa_cpus.0: 0-13
        ■ hw:numa_cpus.1: 14-26
        ■ hw:numa_nodes: 2
    ● Ports
      ○ 2 VFs (vf 0 on each NIC port0)
    ● Guest boot parameters
      ○ intel_idle.max_cstate=0 processor.max_cstate=0 intel_pstate=disable nohz_full=1-26 rcu_nocbs=1-26 rcu_nocb_poll audit=0 nosoftlockup default_hugepagesz=1G hugepagesz=1G hugepages=16 isolcpus=1-26
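
The NUMA-related extra specs above have to agree with the flavor's vCPU and memory totals. The following sketch, which is not part of the original deck, checks that consistency for the values on the slide; the helper names are hypothetical, and the flavor memory is assumed to be 32768 MB (the slide says 32GB).

```python
# Sketch: sanity-check per-node NUMA extra specs against flavor totals.
# Helper names are illustrative; memory of 32768 MB is an assumed reading
# of the slide's "32GB".

FLAVOR_VCPUS = 27
FLAVOR_MEMORY_MB = 32768
EXTRA_SPECS = {
    "hw:numa_nodes": "2",
    "hw:numa_cpus.0": "0-13",
    "hw:numa_cpus.1": "14-26",
    "hw:numa_mem.0": "16384",
    "hw:numa_mem.1": "16384",
}


def expand_cpu_list(spec):
    """Expand a CPU list such as '0-13' or '0-3,8' into a set of ints."""
    cpus = set()
    for part in spec.split(","):
        low, sep, high = part.partition("-")
        cpus.update(range(int(low), int(high if sep else low) + 1))
    return cpus


def check_numa_layout(vcpus, memory_mb, extra_specs):
    """Return a list of human-readable problems (empty if consistent)."""
    problems = []
    seen = set()
    total_mem = 0
    for node in range(int(extra_specs["hw:numa_nodes"])):
        cpus = expand_cpu_list(extra_specs[f"hw:numa_cpus.{node}"])
        overlap = seen & cpus
        if overlap:
            problems.append(f"node {node} reuses vCPUs {sorted(overlap)}")
        seen |= cpus
        total_mem += int(extra_specs[f"hw:numa_mem.{node}"])
    if seen != set(range(vcpus)):
        problems.append(f"numa_cpus do not cover vCPUs 0-{vcpus - 1} exactly")
    if total_mem != memory_mb:
        problems.append(f"numa_mem sums to {total_mem} MB, flavor has {memory_mb} MB")
    return problems


if __name__ == "__main__":
    print(check_numa_layout(FLAVOR_VCPUS, FLAVOR_MEMORY_MB, EXTRA_SPECS) or "consistent")
```

With the slide's values, node 0 holds 14 vCPUs and node 1 holds 13, for 27 in total, and the two 16384 MB halves add up to the 32GB flavor memory.
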
  • 38. VM + VF performance: Result
  • 39. VM + PF performance: Host Configuration
    ● BIOS
      ○ Hyper-Threading: OFF
      ○ VT-d: ON
    ● Host boot parameters
      ○ intel_idle.max_cstate=0 processor.max_cstate=0 intel_pstate=disable nohz_full=1-27 rcu_nocbs=1-27 rcu_nocb_poll audit=0 nosoftlockup default_hugepagesz=1G hugepagesz=1G hugepages=32 isolcpus=1-27 intel_iommu=on
    ● Mellanox
      ○ CQE_COMPRESSION: AGGRESSIVE(1)
      ○ SRIOV_EN: False(0)
  • 40. VM + PF performance: Guest Configuration
    ● Flavor
      ○ vCPUs: 27
      ○ Memory: 32GB
      ○ extra_specs:
        ■ hw:cpu_policy: dedicated
        ■ hw:mem_page_size: 1048576
        ■ hw:numa_mem.0: 16384
        ■ hw:numa_mem.1: 16384
        ■ hw:numa_cpus.0: 0-13
        ■ hw:numa_cpus.1: 14-26
        ■ hw:numa_nodes: 2
    ● Ports
      ○ 2 PFs (only port0 on each NIC, with PCI passthrough)
    ● Guest boot parameters
      ○ intel_idle.max_cstate=0 processor.max_cstate=0 intel_pstate=disable nohz_full=1-26 rcu_nocbs=1-26 rcu_nocb_poll audit=0 nosoftlockup default_hugepagesz=1G hugepagesz=1G hugepages=16 isolcpus=1-26
  • 41. VM + PF performance: Result
  • 42. Baremetal (VF exists) performance: Configuration
    ● BIOS
      ○ Hyper-Threading: OFF
    ● Boot parameters
      ○ intel_idle.max_cstate=0 processor.max_cstate=0 intel_pstate=disable nohz_full=1-27 rcu_nocbs=1-27 rcu_nocb_poll audit=0 nosoftlockup default_hugepagesz=1G hugepagesz=1G hugepages=32 isolcpus=1-27
    ● Mellanox
      ○ CQE_COMPRESSION: AGGRESSIVE(1)
      ○ SRIOV_EN: True(1)
      ○ NUM_OF_VFS: 1
    ● Ports
      ○ 2 PFs (only port0 on each NIC)
      ○ 2 VFs (vf 0 on each NIC port0) exist but are not used
  • 43. Baremetal (VF exists) performance: Result
  • 44. All Results
  • 45. Conclusion
    ● OpenStack functions for NFV work fine
      ○ SR-IOV port assignment
      ○ NUMA awareness
      ○ vCPU pinning
      ○ Hugepages
      ○ CPU feature
    ● KVM + Intel VT achieve close to baremetal performance
    ● SR-IOV performance evaluation is required
      ○ SR-IOV device implementation depends on the vendor
  • 46. Conclusion
    ● Our decision
      ○ VM + PF is a powerful option
        ■ SR-IOV advantage
          ● Multiple VFs can be created
            ○ Router
            ○ Firewall
            ○ Load balancer
            ○ ...
        ■ A 100G router consumes almost all host resources
          ● "1 Host: 1 VM" is a realistic option
            ○ Not many ports are needed
  • 47. Thank you!
  • 48. References
    ● SR-IOV
      ○ https://docs.openstack.org/ocata/networking-guide/config-sriov.html
    ● How to enable SR-IOV with Mellanox NIC
      ○ https://community.mellanox.com/docs/DOC-2386
    ● Hugepages
      ○ https://www.mirantis.com/blog/mirantis-openstack-7-0-nfvi-deployment-guide-huge-pages/
    ● isolcpus & CPU pinning
      ○ https://docs.mirantis.com/mcp/1.0/mcp-deployment-guide/enable-numa-and-cpu-pinning/enable-numa-and-cpu-pinning-procedure.html
    ● NUMA
      ○ https://docs.openstack.org/nova/pike/admin/cpu-topologies.html