SlideShare una empresa de Scribd logo
1 de 20
Descargar para leer sin conexión
Kernel Networking Walkthrough

Thomas Graf – Principal Software Engineer
Networking Services
Red Hat
Feb 7, 2014

1

Kernel Networking Walkthrough
Agenda
●

How does a packet get in and out of the net stack?
●

●

How does a packet get through the net stack?
●

●

2

RX Handler, IP Processing, TCP Processing, TCP
Fast Open

How to account for memory and do flow control?
●

●

NAPI, Busy Polling, RSS, RPS, XPS, GRO, TSO

Socket Buffers, Flow Control, TCP Small Queues

Q&A

Kernel Networking Walkthrough
Touring the Network Stack
Expectation

3

Reality

Kernel Networking Walkthrough
How does a packet get in and out of the Network
Stack?

4

Kernel Networking Walkthrough
Receive & Transmit Process

NIC

Network Stack
(Kernel Space)

Ring Buffer

Parse
IP

Parse
TCP/UDP

Socket Buffer read()

Forward

DMA

Device?
Ring Buffer

5

Local?

Process
(User Space)

Task
Construct
IP

Construct
TCP/UDP

Kernel Networking Walkthrough

write()
Socket Buffer
The 3 ways into the Network Stack
Interrupt Driven
Network
Stack

Ring Buffer

NAPI based Polling

poll()

Network
Stack

Ring Buffer

Busy Polling

busy_poll()

Task

Network
Stack

Ring Buffer

6

Kernel Networking Walkthrough
RSS – Receive Side Scaling
●

●

NIC distributes packets across multiple RX queues
allowing for parallel processing.
Separate IRQ per RX queue, thus selects CPU to
run hardware interrupt handler on.
RX-queue-1
CPU 1
RX-queue-2
CPU 3
filter

RX-queue-3
CPU 1
RX-queue-4
CPU 5

7

Kernel Networking Walkthrough
RPS – Receive Packet Steering
●

Software filter to select CPU # for processing

●

Use it to ...
... distribute single queue to
multiple CPUs

... redo queue - CPU mapping
RX-queue-1

RX-queue-2

RX-queue-3

RX-queue-4

8

CPU 1

CPU 1

CPU 2

CPU 2

CPU 3

CPU 3

Kernel Networking Walkthrough
Hardware Offload
●

RX/TX Checksumming
●

●

Virtual LAN filtering and tag stripping
●

●

9

Perform CPU intensive
checksumming in hardware.

Strip 802.1Q header and store VLAN
ID in network packet meta data.
Filter out unsubscribed VLANs.

Kernel Networking Walkthrough
Generic Receive Offload

NAPI based GRO
poll()

Network
Stack

Ring Buffer

GRO
MTU

10

Kernel Networking Walkthrough

Up to 64K
Segmentation Offload
Up to 64K

Network
Stack

Generic Segmentation Offload (GSO)

MTU

Ring Buffer

TCP Segmentation Offload (TSO)

MTU

11

Kernel Networking Walkthrough
How does a packet get through the Network
Stack?

(c) Karen Sagovac

12

Kernel Networking Walkthrough
Packet Processing
Link Layer

Packet Socket
ETH_P_ALL
Ingress QoS

tcpdump

Bridge
Open vSwitch

RX Handler

Team
Bonding
macvlan
macvtap

IPv4
Proto Handler

IPv6
ARP

Feast of the hungry chicks

IPX
Drop

13

Kernel Networking Walkthrough

...
IP Processing

PREROUTING
IP
Handler

INPUT

Route Lookup

Local Delivery

Forwarding

L4
(TCP, ...)

FORWARD
Route Lookup
Link Layer

IPv4
Construction

POSTROUTING

OUTPUT

14

Kernel Networking Walkthrough

Local Output

User
Space
TCP Processing
IP

Parse TCP
Lookup Socket

Socket Filter
socket locked
task exists

Receive TCP

Prequeue

process context ← softirq

Receive Socket Buffer
read()

poll()

Task

15

Backlog

Kernel Networking Walkthrough
TCP Fast Open

(net.ipv4.tcp_fastopen)
Regular

Fast Open

Client
1st Req

Server

Client
1st Req

SYN

ACK
SYN+

2x RTT

ACK+
HTTP
GE

Server

2x RTT
T

SYN

ookie
CK+C
A
SYN+
ACK+
HTTP
GET

Data

2nd Req

Data

2nd Req

SYN

1x RTT

ACK
SYN+

2x RTT

ACK+
HTTP
GE

T

Data

16

Kernel Networking Walkthrough

SYN+

Cook
ie+

HTTP

GET

+Data
+ACK
SYN
Memory Accounting & Flow Control

17

Kernel Networking Walkthrough
Socket Buffers & Flow Control
(net.ipv4.tcp_{r|w}mem)

ssh

ssh
Block or EWOULDBLOCK

write()

rmem -= packet-size

wmem
overlimit?

Socket Buffer

rmem += packet-size

wmem += packet-size

rmem
overlimit?

Socket Buffer
Reduce TCP Window

TCP/IP

TCP/IP

TX Ring Buffer
wmem -= packet-size

18

Kernel Networking Walkthrough

RX Ring Buffer
TCP Small Queues

(net.ipv4.tcp_limit_output_bytes)
ssh

torrent
write()

write()

Socket Buffer

Socket Buffer

TSQ: max 128Kb in flight per socket
TCP/IP

Queuing Discipline

Driver

TX Ring Buffer

19

Kernel Networking Walkthrough
Q&A

Feedback Page
●

http://devconf.cz/f/1

Coming Up Next:
NetworkManager for Enterprise
Dan Williams

20

Kernel Networking Walkthrough

Más contenido relacionado

La actualidad más candente

The TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux KernelThe TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux KernelDivye Kapoor
 
introduction to linux kernel tcp/ip ptocotol stack
introduction to linux kernel tcp/ip ptocotol stack introduction to linux kernel tcp/ip ptocotol stack
introduction to linux kernel tcp/ip ptocotol stack monad bobo
 
Cilium - Container Networking with BPF & XDP
Cilium - Container Networking with BPF & XDPCilium - Container Networking with BPF & XDP
Cilium - Container Networking with BPF & XDPThomas Graf
 
How VXLAN works on Linux
How VXLAN works on LinuxHow VXLAN works on Linux
How VXLAN works on LinuxEtsuji Nakai
 
VLANs in the Linux Kernel
VLANs in the Linux KernelVLANs in the Linux Kernel
VLANs in the Linux KernelKernel TLV
 
eBPF Trace from Kernel to Userspace
eBPF Trace from Kernel to UserspaceeBPF Trace from Kernel to Userspace
eBPF Trace from Kernel to UserspaceSUSE Labs Taipei
 
Intel DPDK Step by Step instructions
Intel DPDK Step by Step instructionsIntel DPDK Step by Step instructions
Intel DPDK Step by Step instructionsHisaki Ohara
 
Virtualized network with openvswitch
Virtualized network with openvswitchVirtualized network with openvswitch
Virtualized network with openvswitchSim Janghoon
 
The linux networking architecture
The linux networking architectureThe linux networking architecture
The linux networking architecturehugo lu
 
BPF - in-kernel virtual machine
BPF - in-kernel virtual machineBPF - in-kernel virtual machine
BPF - in-kernel virtual machineAlexei Starovoitov
 
DPDK & Layer 4 Packet Processing
DPDK & Layer 4 Packet ProcessingDPDK & Layer 4 Packet Processing
DPDK & Layer 4 Packet ProcessingMichelle Holley
 
CETH for XDP [Linux Meetup Santa Clara | July 2016]
CETH for XDP [Linux Meetup Santa Clara | July 2016] CETH for XDP [Linux Meetup Santa Clara | July 2016]
CETH for XDP [Linux Meetup Santa Clara | July 2016] IO Visor Project
 
EBPF and Linux Networking
EBPF and Linux NetworkingEBPF and Linux Networking
EBPF and Linux NetworkingPLUMgrid
 
Container Performance Analysis
Container Performance AnalysisContainer Performance Analysis
Container Performance AnalysisBrendan Gregg
 
DPDK in Containers Hands-on Lab
DPDK in Containers Hands-on LabDPDK in Containers Hands-on Lab
DPDK in Containers Hands-on LabMichelle Holley
 

La actualidad más candente (20)

Intel dpdk Tutorial
Intel dpdk TutorialIntel dpdk Tutorial
Intel dpdk Tutorial
 
DPDK In Depth
DPDK In DepthDPDK In Depth
DPDK In Depth
 
The TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux KernelThe TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux Kernel
 
introduction to linux kernel tcp/ip ptocotol stack
introduction to linux kernel tcp/ip ptocotol stack introduction to linux kernel tcp/ip ptocotol stack
introduction to linux kernel tcp/ip ptocotol stack
 
Cilium - Container Networking with BPF & XDP
Cilium - Container Networking with BPF & XDPCilium - Container Networking with BPF & XDP
Cilium - Container Networking with BPF & XDP
 
How VXLAN works on Linux
How VXLAN works on LinuxHow VXLAN works on Linux
How VXLAN works on Linux
 
VLANs in the Linux Kernel
VLANs in the Linux KernelVLANs in the Linux Kernel
VLANs in the Linux Kernel
 
Dpdk applications
Dpdk applicationsDpdk applications
Dpdk applications
 
eBPF Trace from Kernel to Userspace
eBPF Trace from Kernel to UserspaceeBPF Trace from Kernel to Userspace
eBPF Trace from Kernel to Userspace
 
Intel DPDK Step by Step instructions
Intel DPDK Step by Step instructionsIntel DPDK Step by Step instructions
Intel DPDK Step by Step instructions
 
Virtualized network with openvswitch
Virtualized network with openvswitchVirtualized network with openvswitch
Virtualized network with openvswitch
 
The linux networking architecture
The linux networking architectureThe linux networking architecture
The linux networking architecture
 
BPF - in-kernel virtual machine
BPF - in-kernel virtual machineBPF - in-kernel virtual machine
BPF - in-kernel virtual machine
 
DPDK KNI interface
DPDK KNI interfaceDPDK KNI interface
DPDK KNI interface
 
DPDK & Layer 4 Packet Processing
DPDK & Layer 4 Packet ProcessingDPDK & Layer 4 Packet Processing
DPDK & Layer 4 Packet Processing
 
Deploying IPv6 on OpenStack
Deploying IPv6 on OpenStackDeploying IPv6 on OpenStack
Deploying IPv6 on OpenStack
 
CETH for XDP [Linux Meetup Santa Clara | July 2016]
CETH for XDP [Linux Meetup Santa Clara | July 2016] CETH for XDP [Linux Meetup Santa Clara | July 2016]
CETH for XDP [Linux Meetup Santa Clara | July 2016]
 
EBPF and Linux Networking
EBPF and Linux NetworkingEBPF and Linux Networking
EBPF and Linux Networking
 
Container Performance Analysis
Container Performance AnalysisContainer Performance Analysis
Container Performance Analysis
 
DPDK in Containers Hands-on Lab
DPDK in Containers Hands-on LabDPDK in Containers Hands-on Lab
DPDK in Containers Hands-on Lab
 

Similar a DevConf 2014 Kernel Networking Walkthrough

Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)Andriy Berestovskyy
 
NUSE (Network Stack in Userspace) at #osio
NUSE (Network Stack in Userspace) at #osioNUSE (Network Stack in Userspace) at #osio
NUSE (Network Stack in Userspace) at #osioHajime Tazaki
 
PLNOG16: Obsługa 100M pps na platformie PC , Przemysław Frasunek, Paweł Mała...
PLNOG16: Obsługa 100M pps na platformie PC, Przemysław Frasunek, Paweł Mała...PLNOG16: Obsługa 100M pps na platformie PC, Przemysław Frasunek, Paweł Mała...
PLNOG16: Obsługa 100M pps na platformie PC , Przemysław Frasunek, Paweł Mała...PROIDEA
 
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running LinuxLinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linuxbrouer
 
Collaborate nfs kyle_final
Collaborate nfs kyle_finalCollaborate nfs kyle_final
Collaborate nfs kyle_finalKyle Hailey
 
OSN days 2019 - Open Networking and Programmable Switch
OSN days 2019 - Open Networking and Programmable SwitchOSN days 2019 - Open Networking and Programmable Switch
OSN days 2019 - Open Networking and Programmable SwitchChun Ming Ou
 
PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...
PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...
PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...PROIDEA
 
Steen_Dissertation_March5
Steen_Dissertation_March5Steen_Dissertation_March5
Steen_Dissertation_March5Steen Larsen
 
Analise NetFlow in Real Time
Analise NetFlow in Real TimeAnalise NetFlow in Real Time
Analise NetFlow in Real TimePiotr Perzyna
 
Cilium - Fast IPv6 Container Networking with BPF and XDP
Cilium - Fast IPv6 Container Networking with BPF and XDPCilium - Fast IPv6 Container Networking with BPF and XDP
Cilium - Fast IPv6 Container Networking with BPF and XDPThomas Graf
 
Implementing an IPv6 Enabled Environment for a Public Cloud Tenant
Implementing an IPv6 Enabled Environment for a Public Cloud TenantImplementing an IPv6 Enabled Environment for a Public Cloud Tenant
Implementing an IPv6 Enabled Environment for a Public Cloud TenantShixiong Shang
 
SDN/OpenFlow #lspe
SDN/OpenFlow #lspeSDN/OpenFlow #lspe
SDN/OpenFlow #lspeChris Westin
 

Similar a DevConf 2014 Kernel Networking Walkthrough (20)

Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)
 
NUSE (Network Stack in Userspace) at #osio
NUSE (Network Stack in Userspace) at #osioNUSE (Network Stack in Userspace) at #osio
NUSE (Network Stack in Userspace) at #osio
 
PLNOG16: Obsługa 100M pps na platformie PC , Przemysław Frasunek, Paweł Mała...
PLNOG16: Obsługa 100M pps na platformie PC, Przemysław Frasunek, Paweł Mała...PLNOG16: Obsługa 100M pps na platformie PC, Przemysław Frasunek, Paweł Mała...
PLNOG16: Obsługa 100M pps na platformie PC , Przemysław Frasunek, Paweł Mała...
 
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running LinuxLinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
 
Collaborate nfs kyle_final
Collaborate nfs kyle_finalCollaborate nfs kyle_final
Collaborate nfs kyle_final
 
Network
NetworkNetwork
Network
 
FD.io - The Universal Dataplane
FD.io - The Universal DataplaneFD.io - The Universal Dataplane
FD.io - The Universal Dataplane
 
OSN days 2019 - Open Networking and Programmable Switch
OSN days 2019 - Open Networking and Programmable SwitchOSN days 2019 - Open Networking and Programmable Switch
OSN days 2019 - Open Networking and Programmable Switch
 
Stress your DUT
Stress your DUTStress your DUT
Stress your DUT
 
PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...
PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...
PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...
 
Steen_Dissertation_March5
Steen_Dissertation_March5Steen_Dissertation_March5
Steen_Dissertation_March5
 
Analise NetFlow in Real Time
Analise NetFlow in Real TimeAnalise NetFlow in Real Time
Analise NetFlow in Real Time
 
Wireshark Basics
Wireshark BasicsWireshark Basics
Wireshark Basics
 
6lowpan
6lowpan6lowpan
6lowpan
 
Userspace networking
Userspace networkingUserspace networking
Userspace networking
 
Cilium - Fast IPv6 Container Networking with BPF and XDP
Cilium - Fast IPv6 Container Networking with BPF and XDPCilium - Fast IPv6 Container Networking with BPF and XDP
Cilium - Fast IPv6 Container Networking with BPF and XDP
 
Implementing an IPv6 Enabled Environment for a Public Cloud Tenant
Implementing an IPv6 Enabled Environment for a Public Cloud TenantImplementing an IPv6 Enabled Environment for a Public Cloud Tenant
Implementing an IPv6 Enabled Environment for a Public Cloud Tenant
 
Network Layer
Network LayerNetwork Layer
Network Layer
 
SDN/OpenFlow #lspe
SDN/OpenFlow #lspeSDN/OpenFlow #lspe
SDN/OpenFlow #lspe
 
100 M pps on PC.
100 M pps on PC.100 M pps on PC.
100 M pps on PC.
 

Más de Thomas Graf

BPF & Cilium - Turning Linux into a Microservices-aware Operating System
BPF  & Cilium - Turning Linux into a Microservices-aware Operating SystemBPF  & Cilium - Turning Linux into a Microservices-aware Operating System
BPF & Cilium - Turning Linux into a Microservices-aware Operating SystemThomas Graf
 
Cilium - Bringing the BPF Revolution to Kubernetes Networking and Security
Cilium - Bringing the BPF Revolution to Kubernetes Networking and SecurityCilium - Bringing the BPF Revolution to Kubernetes Networking and Security
Cilium - Bringing the BPF Revolution to Kubernetes Networking and SecurityThomas Graf
 
Accelerating Envoy and Istio with Cilium and the Linux Kernel
Accelerating Envoy and Istio with Cilium and the Linux KernelAccelerating Envoy and Istio with Cilium and the Linux Kernel
Accelerating Envoy and Istio with Cilium and the Linux KernelThomas Graf
 
Cilium - API-aware Networking and Security for Containers based on BPF
Cilium - API-aware Networking and Security for Containers based on BPFCilium - API-aware Networking and Security for Containers based on BPF
Cilium - API-aware Networking and Security for Containers based on BPFThomas Graf
 
Cilium - Network security for microservices
Cilium - Network security for microservicesCilium - Network security for microservices
Cilium - Network security for microservicesThomas Graf
 
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDPDockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDPThomas Graf
 
Linux Native, HTTP Aware Network Security
Linux Native, HTTP Aware Network SecurityLinux Native, HTTP Aware Network Security
Linux Native, HTTP Aware Network SecurityThomas Graf
 
BPF: Next Generation of Programmable Datapath
BPF: Next Generation of Programmable DatapathBPF: Next Generation of Programmable Datapath
BPF: Next Generation of Programmable DatapathThomas Graf
 
Cilium - BPF & XDP for containers
Cilium - BPF & XDP for containersCilium - BPF & XDP for containers
Cilium - BPF & XDP for containersThomas Graf
 
LinuxCon 2015 Stateful NAT with OVS
LinuxCon 2015 Stateful NAT with OVSLinuxCon 2015 Stateful NAT with OVS
LinuxCon 2015 Stateful NAT with OVSThomas Graf
 
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)Thomas Graf
 
2015 FOSDEM - OVS Stateful Services
2015 FOSDEM - OVS Stateful Services2015 FOSDEM - OVS Stateful Services
2015 FOSDEM - OVS Stateful ServicesThomas Graf
 
Open vSwitch - Stateful Connection Tracking & Stateful NAT
Open vSwitch - Stateful Connection Tracking & Stateful NATOpen vSwitch - Stateful Connection Tracking & Stateful NAT
Open vSwitch - Stateful Connection Tracking & Stateful NATThomas Graf
 
The Next Generation Firewall for Red Hat Enterprise Linux 7 RC
The Next Generation Firewall for Red Hat Enterprise Linux 7 RCThe Next Generation Firewall for Red Hat Enterprise Linux 7 RC
The Next Generation Firewall for Red Hat Enterprise Linux 7 RCThomas Graf
 
SDN & NFV Introduction - Open Source Data Center Networking
SDN & NFV Introduction - Open Source Data Center NetworkingSDN & NFV Introduction - Open Source Data Center Networking
SDN & NFV Introduction - Open Source Data Center NetworkingThomas Graf
 

Más de Thomas Graf (15)

BPF & Cilium - Turning Linux into a Microservices-aware Operating System
BPF  & Cilium - Turning Linux into a Microservices-aware Operating SystemBPF  & Cilium - Turning Linux into a Microservices-aware Operating System
BPF & Cilium - Turning Linux into a Microservices-aware Operating System
 
Cilium - Bringing the BPF Revolution to Kubernetes Networking and Security
Cilium - Bringing the BPF Revolution to Kubernetes Networking and SecurityCilium - Bringing the BPF Revolution to Kubernetes Networking and Security
Cilium - Bringing the BPF Revolution to Kubernetes Networking and Security
 
Accelerating Envoy and Istio with Cilium and the Linux Kernel
Accelerating Envoy and Istio with Cilium and the Linux KernelAccelerating Envoy and Istio with Cilium and the Linux Kernel
Accelerating Envoy and Istio with Cilium and the Linux Kernel
 
Cilium - API-aware Networking and Security for Containers based on BPF
Cilium - API-aware Networking and Security for Containers based on BPFCilium - API-aware Networking and Security for Containers based on BPF
Cilium - API-aware Networking and Security for Containers based on BPF
 
Cilium - Network security for microservices
Cilium - Network security for microservicesCilium - Network security for microservices
Cilium - Network security for microservices
 
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDPDockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
 
Linux Native, HTTP Aware Network Security
Linux Native, HTTP Aware Network SecurityLinux Native, HTTP Aware Network Security
Linux Native, HTTP Aware Network Security
 
BPF: Next Generation of Programmable Datapath
BPF: Next Generation of Programmable DatapathBPF: Next Generation of Programmable Datapath
BPF: Next Generation of Programmable Datapath
 
Cilium - BPF & XDP for containers
Cilium - BPF & XDP for containersCilium - BPF & XDP for containers
Cilium - BPF & XDP for containers
 
LinuxCon 2015 Stateful NAT with OVS
LinuxCon 2015 Stateful NAT with OVSLinuxCon 2015 Stateful NAT with OVS
LinuxCon 2015 Stateful NAT with OVS
 
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)
 
2015 FOSDEM - OVS Stateful Services
2015 FOSDEM - OVS Stateful Services2015 FOSDEM - OVS Stateful Services
2015 FOSDEM - OVS Stateful Services
 
Open vSwitch - Stateful Connection Tracking & Stateful NAT
Open vSwitch - Stateful Connection Tracking & Stateful NATOpen vSwitch - Stateful Connection Tracking & Stateful NAT
Open vSwitch - Stateful Connection Tracking & Stateful NAT
 
The Next Generation Firewall for Red Hat Enterprise Linux 7 RC
The Next Generation Firewall for Red Hat Enterprise Linux 7 RCThe Next Generation Firewall for Red Hat Enterprise Linux 7 RC
The Next Generation Firewall for Red Hat Enterprise Linux 7 RC
 
SDN & NFV Introduction - Open Source Data Center Networking
SDN & NFV Introduction - Open Source Data Center NetworkingSDN & NFV Introduction - Open Source Data Center Networking
SDN & NFV Introduction - Open Source Data Center Networking
 

Último

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Bhuvaneswari Subramani
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 

Último (20)

Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 

DevConf 2014 Kernel Networking Walkthrough

  • 1. Kernel Networking Walkthrough Thomas Graf – Principal Software Engineer Networking Services Red Hat Feb 7, 2014 1 Kernel Networking Walkthrough
  • 2. Agenda ● How does a packet get in and out of the net stack? ● ● How does a packet get through the net stack? ● ● 2 RX Handler, IP Processing, TCP Processing, TCP Fast Open How to account for memory and do flow control? ● ● NAPI, Busy Polling, RSS, RPS, XPS, GRO, TSO Socket Buffers, Flow Control, TCP Small Queues Q&A Kernel Networking Walkthrough
  • 3. Touring the Network Stack Expectation 3 Reality Kernel Networking Walkthrough
  • 4. How does a packet get in and out of the Network Stack? 4 Kernel Networking Walkthrough
  • 5. Receive & Transmit Process NIC Network Stack (Kernel Space) Ring Buffer Parse IP Parse TCP/UDP Socket Buffer read() Forward DMA Device? Ring Buffer 5 Local? Process (User Space) Task Construct IP Construct TCP/UDP Kernel Networking Walkthrough write() Socket Buffer
  • 6. The 3 ways into the Network Stack Interrupt Driven Network Stack Ring Buffer NAPI based Polling poll() Network Stack Ring Buffer Busy Polling busy_poll() Task Network Stack Ring Buffer 6 Kernel Networking Walkthrough
  • 7. RSS – Receive Side Scaling ● ● NIC distributes packets across multiple RX queues allowing for parallel processing. Separate IRQ per RX queue, thus selects CPU to run hardware interrupt handler on. RX-queue-1 CPU 1 RX-queue-2 CPU 3 filter RX-queue-3 CPU 1 RX-queue-4 CPU 5 7 Kernel Networking Walkthrough
  • 8. RPS – Receive Packet Steering ● Software filter to select CPU # for processing ● Use it to ... ... distribute single queue to multiple CPUs ... redo queue - CPU mapping RX-queue-1 RX-queue-2 RX-queue-3 RX-queue-4 8 CPU 1 CPU 1 CPU 2 CPU 2 CPU 3 CPU 3 Kernel Networking Walkthrough
  • 9. Hardware Offload ● RX/TX Checksumming ● ● Virtual LAN filtering and tag stripping ● ● 9 Perform CPU intensive checksumming in hardware. Strip 802.1Q header and store VLAN ID in network packet meta data. Filter out unsubscribed VLANs. Kernel Networking Walkthrough
  • 10. Generic Receive Offload NAPI based GRO poll() Network Stack Ring Buffer GRO MTU 10 Kernel Networking Walkthrough Up to 64K
  • 11. Segmentation Offload Up to 64K Network Stack Generic Segmentation Offload (GSO) MTU Ring Buffer TCP Segmentation Offload (TSO) MTU 11 Kernel Networking Walkthrough
  • 12. How does a packet get through the Network Stack? (c) Karen Sagovac 12 Kernel Networking Walkthrough
  • 13. Packet Processing Link Layer Packet Socket ETH_P_ALL Ingress QoS tcpdump Bridge Open vSwitch RX Handler Team Bonding macvlan macvtap IPv4 Proto Handler IPv6 ARP Feast of the hungry chicks IPX Drop 13 Kernel Networking Walkthrough ...
  • 14. IP Processing PREROUTING IP Handler INPUT Route Lookup Local Delivery Forwarding L4 (TCP, ...) FORWARD Route Lookup Link Layer IPv4 Construction POSTROUTING OUTPUT 14 Kernel Networking Walkthrough Local Output User Space
  • 15. TCP Processing IP Parse TCP Lookup Socket Socket Filter socket locked task exists Receive TCP Prequeue process context ← softirq Receive Socket Buffer read() poll() Task 15 Backlog Kernel Networking Walkthrough
  • 16. TCP Fast Open (net.ipv4.tcp_fastopen) Regular Fast Open Client 1st Req Server Client 1st Req SYN ACK SYN+ 2x RTT ACK+ HTTP GE Server 2x RTT T SYN ookie CK+C A SYN+ ACK+ HTTP GET Data 2nd Req Data 2nd Req SYN 1x RTT ACK SYN+ 2x RTT ACK+ HTTP GE T Data 16 Kernel Networking Walkthrough SYN+ Cook ie+ HTTP GET +Data +ACK SYN
  • 17. Memory Accounting & Flow Control 17 Kernel Networking Walkthrough
  • 18. Socket Buffers & Flow Control (net.ipv4.tcp_{r|w}mem) ssh ssh Block or EWOULDBLOCK write() rmem -= packet-size wmem overlimit? Socket Buffer rmem += packet-size wmem += packet-size rmem overlimit? Socket Buffer Reduce TCP Window TCP/IP TCP/IP TX Ring Buffer wmem -= packet-size 18 Kernel Networking Walkthrough RX Ring Buffer
  • 19. TCP Small Queues (net.ipv4.tcp_limit_output_bytes) ssh torrent write() write() Socket Buffer Socket Buffer TSQ: max 128Kb in flight per socket TCP/IP Queuing Discipline Driver TX Ring Buffer 19 Kernel Networking Walkthrough
  • 20. Q&A Feedback Page ● http://devconf.cz/f/1 Coming Up Next: NetworkManager for Enterprise Dan Williams 20 Kernel Networking Walkthrough