Google and Intel speak on NFV and SFC service delivery
These slides are as presented at the "Out of the Box Network Developers" meetup, sponsored by the Intel Networking Developer Zone.
Here is the Agenda of the slides:
How DPDK, RDT and gRPC fit into SDI/SDN, NFV and OpenStack
Key Platform Requirements for SDI
SDI Platform Ingredients: DPDK, Intel® RDT
gRPC Service Framework
Intel® RDT and gRPC service framework
1. Out of the Box Network Developers
SDN and Switching
SF Bay OpenStack
05/17/2016
Sujata Tibrewala
Network Developer Evangelist
Intel Developer Zone, Networking
Software.intel.com/networking
@intelsoftware
#sdnnfv
2. Upcoming events
OPNFV Summit, June 21st–23rd, Berlin, Germany
Red Hat Summit, June 27th–30th, San Francisco, CA
DPDK Summit, August 10–11, www.dpdksummit.com, San Jose, CA
Intel Developer Forum, August
DPDK Deep Dive, July 2016
3. Intel Team
Edwin Verplanke: Principal Engineer
Rashmin Patel: Software Architect
Priya V Autee: Software Engineer
Google Team
Jayant Kolhe: Director of Engineering
Abhishek Kumar: Engineering Lead & Manager at Google
Introductions
4. How RDT and gRPC fit into SDI/SDN, NFV and OpenStack
Key Platform Requirements for SDI
SDI Platform Ingredients: DPDK, Intel® RDT
gRPC Service Framework
Intel® RDT and gRPC service framework
SDI - Software Defined Infrastructure, NFV - Network Function Virtualization
Agenda
5. How DPDK, RDT and gRPC fit into SDI/SDN, NFV and OpenStack
Key Platform Requirements for SDI
SDI Platform Ingredients: DPDK, Intel® RDT
gRPC Service Framework
Intel® RDT and gRPC service framework
SDI - Software Defined Infrastructure, NFV - Network Function Virtualization
Agenda
7. SDN
OpenFlow, ODP and ForCES (Forwarding and Control Element Separation) all perform similar functions at a high level:
● Separation of control and data plane
● Centralized management
● Programmable network behavior via well-defined interfaces
(Diagram labels: gRPC; DPDK, RDT, Quick Assist, etc.)
9. How DPDK, RDT and gRPC fit into SDI/SDN, NFV and OpenStack
Key Platform Requirements for SDI
SDI Platform Ingredients: DPDK, Intel® RDT
gRPC Service Framework
Intel® RDT and gRPC service framework
SDI - Software Defined Infrastructure, NFV - Network Function Virtualization
Agenda
10. Software Defined Infrastructure
(10,000-foot-view diagram: Enterprise Cloud Service Providers and the Communication Infrastructure Cloud / Comms Service Providers, connected via interconnect/switch to platform building blocks – processor, crypto/compression, DRAM, Last Level Cache – running a soft switch, packet processing and SW optimizations.)
▪ Optimized I/O access (Data Plane Development Kit)
▪ Intel® QuickAssist Technology for crypto and compression acceleration
▪ Virtualization Technology enhancements (Posted Interrupts, Page-Modification Logging)
▪ Intel® Resource Director Technology (CMT, CAT, CDP, MBM)
11. Services Deployment on SDI
(5,000-foot-view diagram: services 1, 2, 3, x and y* deployed across Intel® Xeon® Processor E5 v4 servers, making service calls to one another.)
*can be a Process/Container/Pod/VM using a CPU core
Flexibility, Scalability, Service Agility, Resource Utilization
13. How DPDK, RDT and gRPC fit into SDI/SDN, NFV and OpenStack
Key Platform Requirements for SDI
SDI Platform Ingredients: DPDK, Intel® RDT
gRPC Service Framework
Intel® RDT and gRPC service framework
SDI - Software Defined Infrastructure, NFV - Network Function Virtualization
Agenda
14. SDI Platform Ingredients
Key requirements: Orchestration Support, Service/API Support, Security Policy, Scheduler Policy, SW/FW Compatibility, Threading Model, Quality of Service, Shared Memory Access, Optimized I/O Access
(Diagram: an Orchestrator sitting on standard service semantics – openssl libcrypto, OVS, Hyperscan – built on platform SW/FW ingredients – DPDK, QAT, OS kernel optimizations – on an Intel® Xeon® Processor E5 v4 with VT, RDT, memory controller, cores, NIC, crypto and HT.)
DPDK – Data Plane Development Kit, QAT – Quick Assist Technology, RDT – Resource Director Technology, VT – Virtualization Technology, HT – Hyper-Threading Technology, OVS – Open vSwitch, NFV – Network Function Virtualization, SFC – Service Function Chaining
15. Data Plane Development Kit (DPDK): Optimized Packet I/O API
A software solution for accelerating packet processing workloads on Intel® Architecture:
• Delivers a 25x performance jump over Linux*
• Comprehensive virtualization support
• Enjoys vibrant community support
• Free, open source, BSD license
(Chart: packet processing performance.)
Disclaimer: Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
16. DPDK Overview
• Process a batch of packets during each software iteration and amortize the access cost over multiple packets
• For memory access, use HW- or SW-controlled prefetching; for PCIe access, use Data Direct I/O to write data directly into cache
• Use access schemes that reduce the amount of sharing (e.g. lockless queues for message passing)
• Page tables are constantly evicted (DTLB thrashing) – allow Linux to use huge pages (2 MB, 1 GB)
DPDK – Data Plane Development Kit
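A minimal sketch of what the batching advice above looks like in a DPDK poll-mode receive loop, written against the public DPDK API (rte_eal_init, rte_eth_rx_burst, rte_pktmbuf_free); it assumes a port and RX queue have already been configured and started with the usual rte_eth_dev_* calls, and the port number and burst size are illustrative:

    // Minimal DPDK RX loop: pull packets in bursts to amortize access cost.
    #include <rte_eal.h>
    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    static const uint16_t BURST_SIZE = 32;   // packets handled per iteration

    static void rx_loop(uint16_t port_id) {
        struct rte_mbuf *bufs[BURST_SIZE];
        for (;;) {
            // One call retrieves up to BURST_SIZE packets from the RX queue.
            const uint16_t nb_rx = rte_eth_rx_burst(port_id, 0, bufs, BURST_SIZE);
            for (uint16_t i = 0; i < nb_rx; i++) {
                // Prefetch/process the packet here; this sketch just drops it.
                rte_pktmbuf_free(bufs[i]);
            }
        }
    }

    int main(int argc, char **argv) {
        if (rte_eal_init(argc, argv) < 0)    // maps hugepages, probes devices
            return -1;
        rx_loop(0);                          // assumes port 0 is configured and started
        return 0;
    }

The EAL arguments passed on the command line (core mask, hugepage memory) are what give the loop the huge-page backing described in the last bullet.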
19. Current Infrastructure Support
Intel® Ethernet Network Adapter; * driver patch available
(Diagram: DPDK support across hypervisors.)
• Xen – DPDK in the virtual machine with an e1000 device model and E1000_eth_pmd, grant-table enqueue/dequeue over shared memory, Qemu device model, and an SR-IOV VF_pmd path (vmexit)
• KVM – DPDK in the virtual machine with ivshmem/vhost shared memory, an e1000 device model with E1000_eth_pmd and a Virtio_pmd, Qemu device model, and an SR-IOV VF_pmd path (vmexit)
• VMware ESXi – DPDK in the virtual machine with a VMXNET3_pmd paravirtual interface and an e1000 device model via the ESXi device model and VMware vSwitch, plus an SR-IOV VF_pmd path (vmexit)
• Microsoft Hyper-V – Linux drivers with a synthetic NIC and DEC 21140 device model/driver via the Hyper-V device model and extensible vSwitch; DPDK* with a synthetic NIC driver and an SR-IOV VF driver (vmexit)
20. Shared Resource Contention
• The Last Level Cache is shared to make best use of the resources in the platform
• However, certain types of applications can cause noise and slow down others
• Applications that are streaming in nature can cause excessive LLC evictions and lead to up to 51% throughput degradation for network workloads
(Diagram: Intel® Xeon® Processor E5 v4 – Virtual Machine Monitor, Last Level Cache, memory, network I/O, crypto I/O.)
21. Solution: Intel® Resource Director Technology
Building on a rich and growing portfolio of technologies embedded in Intel silicon
22. Intel® Resource Director Technology (Intel® RDT)
Cache Monitoring Technology (CMT)
• Identify misbehaving applications and reschedule according to priority
• Cache occupancy reported on a per Resource Monitoring ID (RMID) basis – advanced telemetry
Cache Allocation Technology (CAT)
• Last Level Cache partitioning mechanism enabling separation and prioritization of apps or VMs
• Misbehaving threads can be isolated to increase determinism
Memory Bandwidth Monitoring (MBM)
• Monitors memory bandwidth consumption on a per thread/core/app basis
• Shares the common RMID architecture – telemetry
• Provides insight into the second order of shared resource contention
(Diagrams: cores running apps that share the Last Level Cache and DRAM.)
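To make the CMT/MBM monitoring path concrete, here is a hedged sketch using the standalone PQoS library (intel-cmt-cat, covered on the software-enabling slides below). It polls the LLC occupancy of one core via an RMID-backed monitoring group; the exact pqos_config fields and struct layouts shift a little between library releases, so treat the details as illustrative rather than definitive:

    // Monitor LLC occupancy (CMT) for one core with libpqos.
    #include <cstdio>
    #include <cstring>
    #include <unistd.h>
    #include <pqos.h>

    int main() {
        struct pqos_config cfg;
        std::memset(&cfg, 0, sizeof(cfg));
        cfg.fd_log = STDOUT_FILENO;                 // log to stdout
        if (pqos_init(&cfg) != PQOS_RETVAL_OK)
            return 1;

        unsigned core = 2;                          // core running the workload of interest
        struct pqos_mon_data group;
        std::memset(&group, 0, sizeof(group));
        if (pqos_mon_start(1, &core, PQOS_MON_EVENT_L3_OCCUP, nullptr, &group) != PQOS_RETVAL_OK) {
            pqos_fini();
            return 1;
        }

        for (int i = 0; i < 10; i++) {              // sample once per second
            struct pqos_mon_data *groups[] = { &group };
            pqos_mon_poll(groups, 1);
            std::printf("core %u LLC occupancy: %llu bytes\n",
                        core, (unsigned long long)group.values.llc);
            sleep(1);
        }

        pqos_mon_stop(&group);
        pqos_fini();
        return 0;
    }

MBM works the same way with the PQOS_MON_EVENT_LMEM_BW / PQOS_MON_EVENT_TMEM_BW events, reusing the same RMID plumbing the slide describes.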
23. Intel® RDT – University of California, Berkeley (http://span.cs.berkeley.edu)
• UCB has been researching the applicability of Intel® Resource Director Technology in edge devices.
• The research focuses on maintaining quality of service while consolidating a variety of network-centric workloads.
(Test setup: a load generator driving an Intel® Xeon® processor E5-2695 v4 over Ethernet; a Qemu virtual machine monitor hosting virtual machines running EndRE, IPSec, MazuNAT and SNORT, all sharing the LLC.)
(Architecture diagram: an SDN controller with support for 3rd-party services, partially at the edge and partially in the cloud; an ASIC-based, MPLS-like core handling scalable basic connectivity (resilience, load balancing, anycast, mcast, …); and x86/hybrid edge devices handling all complex processing (NFV, NetVirt, …) with Intel® Resource Director Technology.)
24. Intel® RDT – University of California, Berkeley (http://span.cs.berkeley.edu)
(Test setup as above: load generator, Intel® Xeon® processor E5-2695 v4, Qemu virtual machine monitor hosting EndRE, IPSec, MazuNAT and SNORT virtual machines sharing the LLC.)
• Network functions execute simultaneously on isolated cores; the throughput of each virtual machine is measured
• Minimum packet size (64 bytes), 100K flows, uniformly distributed
• LLC contention causes up to 51% degradation in throughput
(Chart: max. % throughput degradation, normalized.)
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests are measured using specific computer systems, components,
software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your
contemplated purchases, including the performance of that product when combined with other products. Configurations: see slide 28. For more complete information, visit
http://www.intel.com/performance/datacenter.
25. Intel® RDT – University of California, Berkeley (http://span.cs.berkeley.edu)
(Same test setup as above.)
• Network functions execute simultaneously on isolated cores; the throughput of each virtual machine is measured
• Minimum packet size (64 bytes), 100K flows, uniformly distributed
• The VM under test is isolated using CAT, with 2 ways of the LLC associated with the network function; isolation causes only ~2% variation
(Chart: max. % throughput degradation, normalized.)
26. Intel® RDT – University of California, Berkeley (http://span.cs.berkeley.edu)
(Same test setup as above.)
• Network functions execute simultaneously on isolated cores; the throughput of each virtual machine is measured
• Minimum packet size (64 bytes), 100K flows, uniformly distributed
(Chart: latency in microseconds, log scale.)
27. Intel® RDT Software Enabling Approaches
(Diagram: user-space interfaces – the cgroup fs at /sys/fs/cgroup/intel_rdt for cache allocation, perf / syscall(perf_event_open) for cache/memory monitoring, and the standalone PQoS library. In kernel space, the Intel® RDT support configures the bitmask per CLOS, sets the CLOS/RMID for a thread during context switch, and reads the event counters / monitored data. The hardware – an Intel® Xeon® Processor E5 v4 with Intel® RDT – is programmed via MSR/CPUID by the driver.)
28. Broad Platform Awareness Enabling
• Linux cgroup/perf/libvirt enabling
cgroup: https://github.com/fyu1/linux/tree/cat16.1/
Perf: CMT mainstream (v4.1) and MBM mainstream (v4.6-rc1)
Libvirt patches: https://www.redhat.com/archives/libvir-list/2016-January/msg01264.html
• Standalone Intel® RDT API available (01.org): https://github.com/01org/intel-cmt-cat
• DPDK API (dpdk.org) Intel® RDT enabling: examples/l2fwd-cat – RDT CAT and CDP, example of libpqos usage
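As a small companion to the standalone-API bullet, here is a hedged sketch of programming CAT with libpqos, in the spirit of the DPDK examples/l2fwd-cat sample: carve two LLC ways into class of service 1 and associate a core with that class. Function and struct-field names have changed slightly across intel-cmt-cat releases (older ones use a flat ways_mask field and pqos_l3ca_assoc_set()), so check against the header you build with:

    // Partition the LLC with CAT: COS 1 gets two ways, core 2 is pinned to COS 1.
    #include <cstring>
    #include <unistd.h>
    #include <pqos.h>

    int main() {
        struct pqos_config cfg;
        std::memset(&cfg, 0, sizeof(cfg));
        cfg.fd_log = STDOUT_FILENO;
        if (pqos_init(&cfg) != PQOS_RETVAL_OK)
            return 1;

        struct pqos_l3ca ca;
        std::memset(&ca, 0, sizeof(ca));
        ca.class_id = 1;
        ca.u.ways_mask = 0x3;            // two lowest cache ways (older releases: ca.ways_mask)
        if (pqos_l3ca_set(0 /* socket */, 1, &ca) != PQOS_RETVAL_OK) {
            pqos_fini();
            return 1;
        }

        // Associate logical core 2 (e.g. the isolated VNF core) with COS 1.
        pqos_alloc_assoc_set(2, 1);      // older releases: pqos_l3ca_assoc_set(2, 1)

        pqos_fini();
        return 0;
    }

This is the same isolation knob the Berkeley experiment above used when it dedicated 2 ways of the LLC to the VM under test.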
29. How DPDK, RDT and gRPC fit into SDI/SDN, NFV and OpenStack
Key Platform Requirements for SDI
SDI Platform Ingredients DPDK/RDT
gRPC Service Framework
Intel® RDT and gRPC service framework
SDI - Software Defined Infrastructure, NFV - Network Function Virtualization
Agenda
30. gRPC: A multi-platform RPC system
Abhishek Kumar
31. @grpcio
Mobile first
Software Defined Everything
Microservices Architecture
Everything as a service
Public Cloud
Internet of Things
gRPC touches and influences each of these areas.
High level trends
32. Microservices at Google: O(10^10) RPCs per second.
Images by Connie Zhou
33. Open source on GitHub for C, C++, Java, Node.js, Python, Ruby, Go, C#, PHP, Objective-C
44. (Diagram: three gRPC use cases.)
• Client-server communication – a cross-language API framework, with clients and servers across mobile, web and cloud; also embedded systems and IoT
• Access Google Cloud Services – from GCP, from Android and iOS devices, from everywhere else
• Build distributed applications – in data-centers, in public/private cloud; high performance; streaming; millions of outstanding RPCs
Images by Connie Zhou
45. Some of the adopters
Microservices: in data centres
Streaming telemetry from network devices
Client Server communication
Client Server communication
46. @grpcio
MicroServices using gRPC
Multi-language – 10 languages, Android and iOS platforms. Idiomatic, language-specific APIs.
Ease of use and Scalability – Simple programming model. Protocol buffers for interface definition, data model and wire encoding.
Streaming and High Performance – HTTP/2 framing and multiplexing with flow control. QUIC support.
Layered and Pluggable Architecture – Integrated load balancing, health checking, tracing across services. Support for different transports (HTTP/2-over-TCP, QUIC, etc.). Plugin APIs for naming, stats, auth, etc.
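To ground the "protocol buffers + idiomatic APIs" point, here is a hedged sketch of the programming model with a made-up PacketStats service: the .proto interface definition (shown as a comment) is compiled by protoc with the gRPC plugin, and the call is made through the generated C++ stub. Service, method and field names are invented for illustration; the surrounding API shapes follow standard gRPC C++ generated code:

    // Hypothetical packetstats.proto:
    //   syntax = "proto3";
    //   service PacketStats {
    //     rpc GetThroughput (StatsRequest) returns (StatsReply) {}
    //   }
    //   message StatsRequest { string vnf_name = 1; }
    //   message StatsReply   { double mpps = 1; }
    #include <iostream>
    #include <memory>
    #include <grpc++/grpc++.h>
    #include "packetstats.grpc.pb.h"   // generated by protoc + gRPC C++ plugin

    int main() {
        // Insecure channel for brevity; TLS credentials would be used in production.
        auto channel = grpc::CreateChannel("localhost:50051",
                                           grpc::InsecureChannelCredentials());
        std::unique_ptr<PacketStats::Stub> stub = PacketStats::NewStub(channel);

        StatsRequest request;
        request.set_vnf_name("ipsec-gw");
        StatsReply reply;
        grpc::ClientContext context;

        grpc::Status status = stub->GetThroughput(&context, request, &reply);
        if (status.ok())
            std::cout << "throughput: " << reply.mpps() << " Mpps\n";
        else
            std::cout << "RPC failed: " << status.error_message() << "\n";
        return 0;
    }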
48. Implementation across languages
Three complete stacks: C/C++, Java and Go.
Other language implementations wrap the C-runtime libraries.
The library API surface is defined in a language-idiomatic way and hand-implemented on top of the wrapped C-runtime libraries.
The initial choice of wrapping the C runtime gives us scale, performance in different languages, and ease of maintenance.
49. Architecture: Native Implementation in Language (planned in C/C++, Java, Go)
(Stack diagram: Application Layer – Code Generated API; Framework Layer – gRPC Core; Transport Adapter Layer – HTTP/2, SSL; Transport Layer – TCP (sockets).)
50. Architecture: Derived Stack
(Stack diagram: Application Layer – code-generated, language-idiomatic APIs for Python, Ruby, PHP, Obj-C, C#, C++, …; language bindings on a generic low-level API in C; Framework Layer – gRPC Core in C; Transport Layer – HTTP/2, SSL.)
51. Wire Implementation across languages
(Diagram: gRPC Core over HTTP/2 and TLS/SSL with a Code Generated API; Auth architecture and API – Credentials API, Auth-Credentials implementation, Auth plugin API.)
52. Metadata and Auth
Generic mechanism for attaching metadata to requests and responses
Built into the gRPC protocol – always available
Plugin API to attach "bearer tokens" to requests for auth: OAuth2 access tokens, OIDC ID tokens
Session state for specific auth mechanisms is encapsulated in an Auth-credentials object
The metadata mechanism can be used for signaling up and down the stack
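A short hedged sketch of the two mechanisms above using the gRPC C++ API: arbitrary per-call metadata via ClientContext::AddMetadata(), and an OAuth2 bearer token attached as call credentials via AccessTokenCredentials(). The metadata key and token value are placeholders:

    #include <string>
    #include <grpc++/grpc++.h>
    #include <grpc++/security/credentials.h>

    void attach_metadata_and_auth(grpc::ClientContext* context,
                                  const std::string& access_token) {
        // Custom metadata travels as HTTP/2 headers on the call.
        context->AddMetadata("x-trace-id", "abc123");

        // OAuth2 access token attached as a "bearer token" through call credentials.
        context->set_credentials(grpc::AccessTokenCredentials(access_token));
    }

The same AddMetadata() hook is what the Intel RDT integration later in the deck uses to carry platform hints alongside a call.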
53. How DPDK, RDT and gRPC fit into SDI/SDN, NFV and OpenStack
Key Platform Requirements for SDI
SDI Platform Ingredients DPDK/RDT
gRPC Service Framework
Intel® RDT and gRPC service framework
SDI - Software Defined Infrastructure, NFV - Network Function Virtualization
Agenda
54. Platform Exposure to gRPC Endpoints
(Diagram: Java, Golang and C++ services, each with a gRPC server or gRPC stub on top of the gRPC Core (Code Generated API, HTTP/2, SSL, TCP sockets), deployed across Intel® Xeon® Processor E5 v4 servers.)
55. gRPC stack supporting Intel® Resource Director Technology
(Diagram: RDT options are set in the call metadata; the framework layer extracts the RDT options from the metadata and an RDT Manager (cgroup/perf) sets the RDT options on the socket; the transport layers – Code Generated API, HTTP/2, SSL, TCP sockets – sit on DPDK + Intel® RDT + packet* I/O manager.)
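A hedged sketch of the flow in this diagram: the client tags a call with an RDT hint in metadata, and the server-side handler extracts it and hands it to a platform hook (here a placeholder function that could, for instance, call libpqos or write into the intel_rdt cgroup). The "rdt-cos" key and apply_rdt_class_of_service() are invented for illustration and are not part of gRPC or this deck:

    #include <string>
    #include <grpc++/grpc++.h>

    // Client side: ask for a class of service for this call.
    void tag_call_with_rdt(grpc::ClientContext* context, int cos) {
        context->AddMetadata("rdt-cos", std::to_string(cos));
    }

    // Placeholder for the RDT Manager hook (e.g. associate the handling core
    // with a CAT class of service via libpqos, or move the task into a cgroup).
    void apply_rdt_class_of_service(int /*cos*/) {}

    // Server side: inside a service handler, read the hint and apply it.
    void handle_rdt_metadata(grpc::ServerContext* context) {
        const auto& md = context->client_metadata();
        auto it = md.find("rdt-cos");
        if (it != md.end()) {
            int cos = std::stoi(std::string(it->second.data(), it->second.length()));
            apply_rdt_class_of_service(cos);
        }
    }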
56. gRPC Enhanced stack using DPDK
(Stack diagram: Application Layer – Code Generated API; Framework Layer – gRPC Core; Transport Adapter Layer – HTTP/2, SSL, plus a DPDK-Crypto/QAT session; Transport Layer – DPDK sockets and a DPDK packet I/O manager in place of TCP sockets.)
58. Summary
• DPDK provides high-performance I/O for SDN/NFV-based workloads, has a vibrant developer community, and yields 25x performance over the standard Linux network stack
• Intel® Resource Director Technology enables developers and system admins to monitor and control shared resources
• gRPC is a multi-platform RPC system with multi-language support and a high-performance pluggable architecture for services
62. Intel® Resource Director Technology (Intel® RDT)
(Test setup: a Qemu virtual machine monitor on an Intel® Xeon® processor E5-2695 v4 hosting 4 VNFs (VMs) with a simple packet pipeline over Ethernet, two of them acting as "noisy neighbor" virtual machines. VNF – Virtual Network Function)
• Prioritizing important apps: without Cache Allocation Technology, LLC contention causes 38% performance degradation; performance is restored using CAT
• Another benefit: average latency is reduced from 36 µs to 7 µs after isolation of the noisy neighbors
63. CAT Application on Containers
Container workload: a security sandbox performing DPI on a suspected packet stream
CAT – Cache Allocation Technology (Intel® RDT feature)
• Number of active containers at time t: 50/100/150
• Each container processes a stream of packets/messages from a suspected-packet dump store
• Containers' cache pollution – avg: 35–40 MB, max: 44 MB