SlideShare una empresa de Scribd logo
1 de 20
Descargar para leer sin conexión
Kernel Networking Walkthrough

Thomas Graf – Principal Software Engineer
Networking Services
Red Hat
Feb 7, 2014

1

Kernel Networking Walkthrough
Agenda
●

How does a packet get in and out of the net stack?
●

●

How does a packet get through the net stack?
●

●

2

RX Handler, IP Processing, TCP Processing, TCP
Fast Open

How to account for memory and do flow control?
●

●

NAPI, Busy Polling, RSS, RPS, XPS, GRO, TSO

Socket Buffers, Flow Control, TCP Small Queues

Q&A

Kernel Networking Walkthrough
Touring the Network Stack
Expectation

3

Reality

Kernel Networking Walkthrough
How does a packet get in and out of the Network
Stack?

4

Kernel Networking Walkthrough
Receive & Transmit Process

NIC

Network Stack
(Kernel Space)

Ring Buffer

Parse
IP

Parse
TCP/UDP

Socket Buffer read()

Forward

DMA

Device?
Ring Buffer

5

Local?

Process
(User Space)

Task
Construct
IP

Construct
TCP/UDP

Kernel Networking Walkthrough

write()
Socket Buffer
The 3 ways into the Network Stack
Interrupt Driven
Network
Stack

Ring Buffer

NAPI based Polling

poll()

Network
Stack

Ring Buffer

Busy Polling

busy_poll()

Task

Network
Stack

Ring Buffer

6

Kernel Networking Walkthrough
RSS – Receive Side Scaling
●

●

NIC distributes packets across multiple RX queues
allowing for parallel processing.
Separate IRQ per RX queue, thus selects CPU to
run hardware interrupt handler on.
RX-queue-1
CPU 1
RX-queue-2
CPU 3
filter

RX-queue-3
CPU 1
RX-queue-4
CPU 5

7

Kernel Networking Walkthrough
RPS – Receive Packet Steering
●

Software filter to select CPU # for processing

●

Use it to ...
... distribute single queue to
multiple CPUs

... redo queue - CPU mapping
RX-queue-1

RX-queue-2

RX-queue-3

RX-queue-4

8

CPU 1

CPU 1

CPU 2

CPU 2

CPU 3

CPU 3

Kernel Networking Walkthrough
Hardware Offload
●

RX/TX Checksumming
●

●

Virtual LAN filtering and tag stripping
●

●

9

Perform CPU intensive
checksumming in hardware.

Strip 802.1Q header and store VLAN
ID in network packet meta data.
Filter out unsubscribed VLANs.

Kernel Networking Walkthrough
Generic Receive Offload

NAPI based GRO
poll()

Network
Stack

Ring Buffer

GRO
MTU

10

Kernel Networking Walkthrough

Up to 64K
Segmentation Offload
Up to 64K

Network
Stack

Generic Segmentation Offload (GSO)

MTU

Ring Buffer

TCP Segmentation Offload (TSO)

MTU

11

Kernel Networking Walkthrough
How does a packet get through the Network
Stack?

(c) Karen Sagovac

12

Kernel Networking Walkthrough
Packet Processing
Link Layer

Packet Socket
ETH_P_ALL
Ingress QoS

tcpdump

Bridge
Open vSwitch

RX Handler

Team
Bonding
macvlan
macvtap

IPv4
Proto Handler

IPv6
ARP

Feast of the hungry chicks

IPX
Drop

13

Kernel Networking Walkthrough

...
IP Processing

PREROUTING
IP
Handler

INPUT

Route Lookup

Local Delivery

Forwarding

L4
(TCP, ...)

FORWARD
Route Lookup
Link Layer

IPv4
Construction

POSTROUTING

OUTPUT

14

Kernel Networking Walkthrough

Local Output

User
Space
TCP Processing
IP

Parse TCP
Lookup Socket

Socket Filter
socket locked
task exists

Receive TCP

Prequeue

process context ← softirq

Receive Socket Buffer
read()

poll()

Task

15

Backlog

Kernel Networking Walkthrough
TCP Fast Open

(net.ipv4.tcp_fastopen)
Regular

Fast Open

Client
1st Req

Server

Client
1st Req

SYN

ACK
SYN+

2x RTT

ACK+
HTTP
GE

Server

2x RTT
T

SYN

ookie
CK+C
A
SYN+
ACK+
HTTP
GET

Data

2nd Req

Data

2nd Req

SYN

1x RTT

ACK
SYN+

2x RTT

ACK+
HTTP
GE

T

Data

16

Kernel Networking Walkthrough

SYN+

Cook
ie+

HTTP

GET

+Data
+ACK
SYN
Memory Accounting & Flow Control

17

Kernel Networking Walkthrough
Socket Buffers & Flow Control
(net.ipv4.tcp_{r|w}mem)

ssh

ssh
Block or EWOULDBLOCK

write()

rmem -= packet-size

wmem
overlimit?

Socket Buffer

rmem += packet-size

wmem += packet-size

rmem
overlimit?

Socket Buffer
Reduce TCP Window

TCP/IP

TCP/IP

TX Ring Buffer
wmem -= packet-size

18

Kernel Networking Walkthrough

RX Ring Buffer
TCP Small Queues

(net.ipv4.tcp_limit_output_bytes)
ssh

torrent
write()

write()

Socket Buffer

Socket Buffer

TSQ: max 128Kb in flight per socket
TCP/IP

Queuing Discipline

Driver

TX Ring Buffer

19

Kernel Networking Walkthrough
Q&A

Feedback Page
●

http://devconf.cz/f/1

Coming Up Next:
NetworkManager for Enterprise
Dan Williams

20

Kernel Networking Walkthrough

Más contenido relacionado

La actualidad más candente

FD.IO Vector Packet Processing
FD.IO Vector Packet ProcessingFD.IO Vector Packet Processing
FD.IO Vector Packet ProcessingKernel TLV
 
Meet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracingMeet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracingViller Hsiao
 
Introduction to DPDK
Introduction to DPDKIntroduction to DPDK
Introduction to DPDKKernel TLV
 
Linux Kernel vs DPDK: HTTP Performance Showdown
Linux Kernel vs DPDK: HTTP Performance ShowdownLinux Kernel vs DPDK: HTTP Performance Showdown
Linux Kernel vs DPDK: HTTP Performance ShowdownScyllaDB
 
Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)Andriy Berestovskyy
 
DPDK in Containers Hands-on Lab
DPDK in Containers Hands-on LabDPDK in Containers Hands-on Lab
DPDK in Containers Hands-on LabMichelle Holley
 
BPF Internals (eBPF)
BPF Internals (eBPF)BPF Internals (eBPF)
BPF Internals (eBPF)Brendan Gregg
 
Faster packet processing in Linux: XDP
Faster packet processing in Linux: XDPFaster packet processing in Linux: XDP
Faster packet processing in Linux: XDPDaniel T. Lee
 
Open vSwitch Offload: Conntrack and the Upstream Kernel
Open vSwitch Offload: Conntrack and the Upstream KernelOpen vSwitch Offload: Conntrack and the Upstream Kernel
Open vSwitch Offload: Conntrack and the Upstream KernelNetronome
 
BPF: Tracing and more
BPF: Tracing and moreBPF: Tracing and more
BPF: Tracing and moreBrendan Gregg
 
CETH for XDP [Linux Meetup Santa Clara | July 2016]
CETH for XDP [Linux Meetup Santa Clara | July 2016] CETH for XDP [Linux Meetup Santa Clara | July 2016]
CETH for XDP [Linux Meetup Santa Clara | July 2016] IO Visor Project
 
Using eBPF for High-Performance Networking in Cilium
Using eBPF for High-Performance Networking in CiliumUsing eBPF for High-Performance Networking in Cilium
Using eBPF for High-Performance Networking in CiliumScyllaDB
 
DPDK & Layer 4 Packet Processing
DPDK & Layer 4 Packet ProcessingDPDK & Layer 4 Packet Processing
DPDK & Layer 4 Packet ProcessingMichelle Holley
 
The TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux KernelThe TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux KernelDivye Kapoor
 
Intel DPDK Step by Step instructions
Intel DPDK Step by Step instructionsIntel DPDK Step by Step instructions
Intel DPDK Step by Step instructionsHisaki Ohara
 
Debug dpdk process bottleneck & painpoints
Debug dpdk process bottleneck & painpointsDebug dpdk process bottleneck & painpoints
Debug dpdk process bottleneck & painpointsVipin Varghese
 
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDPDockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDPThomas Graf
 

La actualidad más candente (20)

FD.IO Vector Packet Processing
FD.IO Vector Packet ProcessingFD.IO Vector Packet Processing
FD.IO Vector Packet Processing
 
Meet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracingMeet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracing
 
Introduction to DPDK
Introduction to DPDKIntroduction to DPDK
Introduction to DPDK
 
Linux Kernel vs DPDK: HTTP Performance Showdown
Linux Kernel vs DPDK: HTTP Performance ShowdownLinux Kernel vs DPDK: HTTP Performance Showdown
Linux Kernel vs DPDK: HTTP Performance Showdown
 
Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)
 
DPDK in Containers Hands-on Lab
DPDK in Containers Hands-on LabDPDK in Containers Hands-on Lab
DPDK in Containers Hands-on Lab
 
Understanding DPDK
Understanding DPDKUnderstanding DPDK
Understanding DPDK
 
BPF Internals (eBPF)
BPF Internals (eBPF)BPF Internals (eBPF)
BPF Internals (eBPF)
 
Dpdk performance
Dpdk performanceDpdk performance
Dpdk performance
 
Dpdk applications
Dpdk applicationsDpdk applications
Dpdk applications
 
Faster packet processing in Linux: XDP
Faster packet processing in Linux: XDPFaster packet processing in Linux: XDP
Faster packet processing in Linux: XDP
 
Open vSwitch Offload: Conntrack and the Upstream Kernel
Open vSwitch Offload: Conntrack and the Upstream KernelOpen vSwitch Offload: Conntrack and the Upstream Kernel
Open vSwitch Offload: Conntrack and the Upstream Kernel
 
BPF: Tracing and more
BPF: Tracing and moreBPF: Tracing and more
BPF: Tracing and more
 
CETH for XDP [Linux Meetup Santa Clara | July 2016]
CETH for XDP [Linux Meetup Santa Clara | July 2016] CETH for XDP [Linux Meetup Santa Clara | July 2016]
CETH for XDP [Linux Meetup Santa Clara | July 2016]
 
Using eBPF for High-Performance Networking in Cilium
Using eBPF for High-Performance Networking in CiliumUsing eBPF for High-Performance Networking in Cilium
Using eBPF for High-Performance Networking in Cilium
 
DPDK & Layer 4 Packet Processing
DPDK & Layer 4 Packet ProcessingDPDK & Layer 4 Packet Processing
DPDK & Layer 4 Packet Processing
 
The TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux KernelThe TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux Kernel
 
Intel DPDK Step by Step instructions
Intel DPDK Step by Step instructionsIntel DPDK Step by Step instructions
Intel DPDK Step by Step instructions
 
Debug dpdk process bottleneck & painpoints
Debug dpdk process bottleneck & painpointsDebug dpdk process bottleneck & painpoints
Debug dpdk process bottleneck & painpoints
 
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDPDockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
 

Similar a DevConf 2014 Kernel Networking Walkthrough

The linux networking architecture
The linux networking architectureThe linux networking architecture
The linux networking architecturehugo lu
 
introduction to linux kernel tcp/ip ptocotol stack
introduction to linux kernel tcp/ip ptocotol stack introduction to linux kernel tcp/ip ptocotol stack
introduction to linux kernel tcp/ip ptocotol stack monad bobo
 
NUSE (Network Stack in Userspace) at #osio
NUSE (Network Stack in Userspace) at #osioNUSE (Network Stack in Userspace) at #osio
NUSE (Network Stack in Userspace) at #osioHajime Tazaki
 
PLNOG16: Obsługa 100M pps na platformie PC , Przemysław Frasunek, Paweł Mała...
PLNOG16: Obsługa 100M pps na platformie PC, Przemysław Frasunek, Paweł Mała...PLNOG16: Obsługa 100M pps na platformie PC, Przemysław Frasunek, Paweł Mała...
PLNOG16: Obsługa 100M pps na platformie PC , Przemysław Frasunek, Paweł Mała...PROIDEA
 
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running LinuxLinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linuxbrouer
 
Collaborate nfs kyle_final
Collaborate nfs kyle_finalCollaborate nfs kyle_final
Collaborate nfs kyle_finalKyle Hailey
 
OSN days 2019 - Open Networking and Programmable Switch
OSN days 2019 - Open Networking and Programmable SwitchOSN days 2019 - Open Networking and Programmable Switch
OSN days 2019 - Open Networking and Programmable SwitchChun Ming Ou
 
PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...
PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...
PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...PROIDEA
 
VLANs in the Linux Kernel
VLANs in the Linux KernelVLANs in the Linux Kernel
VLANs in the Linux KernelKernel TLV
 
Steen_Dissertation_March5
Steen_Dissertation_March5Steen_Dissertation_March5
Steen_Dissertation_March5Steen Larsen
 
Analise NetFlow in Real Time
Analise NetFlow in Real TimeAnalise NetFlow in Real Time
Analise NetFlow in Real TimePiotr Perzyna
 
Cilium - Fast IPv6 Container Networking with BPF and XDP
Cilium - Fast IPv6 Container Networking with BPF and XDPCilium - Fast IPv6 Container Networking with BPF and XDP
Cilium - Fast IPv6 Container Networking with BPF and XDPThomas Graf
 
Implementing an IPv6 Enabled Environment for a Public Cloud Tenant
Implementing an IPv6 Enabled Environment for a Public Cloud TenantImplementing an IPv6 Enabled Environment for a Public Cloud Tenant
Implementing an IPv6 Enabled Environment for a Public Cloud TenantShixiong Shang
 

Similar a DevConf 2014 Kernel Networking Walkthrough (20)

The linux networking architecture
The linux networking architectureThe linux networking architecture
The linux networking architecture
 
introduction to linux kernel tcp/ip ptocotol stack
introduction to linux kernel tcp/ip ptocotol stack introduction to linux kernel tcp/ip ptocotol stack
introduction to linux kernel tcp/ip ptocotol stack
 
NUSE (Network Stack in Userspace) at #osio
NUSE (Network Stack in Userspace) at #osioNUSE (Network Stack in Userspace) at #osio
NUSE (Network Stack in Userspace) at #osio
 
PLNOG16: Obsługa 100M pps na platformie PC , Przemysław Frasunek, Paweł Mała...
PLNOG16: Obsługa 100M pps na platformie PC, Przemysław Frasunek, Paweł Mała...PLNOG16: Obsługa 100M pps na platformie PC, Przemysław Frasunek, Paweł Mała...
PLNOG16: Obsługa 100M pps na platformie PC , Przemysław Frasunek, Paweł Mała...
 
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running LinuxLinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
 
Collaborate nfs kyle_final
Collaborate nfs kyle_finalCollaborate nfs kyle_final
Collaborate nfs kyle_final
 
Network
NetworkNetwork
Network
 
FD.io - The Universal Dataplane
FD.io - The Universal DataplaneFD.io - The Universal Dataplane
FD.io - The Universal Dataplane
 
OSN days 2019 - Open Networking and Programmable Switch
OSN days 2019 - Open Networking and Programmable SwitchOSN days 2019 - Open Networking and Programmable Switch
OSN days 2019 - Open Networking and Programmable Switch
 
Stress your DUT
Stress your DUTStress your DUT
Stress your DUT
 
PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...
PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...
PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...
 
VLANs in the Linux Kernel
VLANs in the Linux KernelVLANs in the Linux Kernel
VLANs in the Linux Kernel
 
Steen_Dissertation_March5
Steen_Dissertation_March5Steen_Dissertation_March5
Steen_Dissertation_March5
 
Analise NetFlow in Real Time
Analise NetFlow in Real TimeAnalise NetFlow in Real Time
Analise NetFlow in Real Time
 
Wireshark Basics
Wireshark BasicsWireshark Basics
Wireshark Basics
 
6lowpan
6lowpan6lowpan
6lowpan
 
Userspace networking
Userspace networkingUserspace networking
Userspace networking
 
Cilium - Fast IPv6 Container Networking with BPF and XDP
Cilium - Fast IPv6 Container Networking with BPF and XDPCilium - Fast IPv6 Container Networking with BPF and XDP
Cilium - Fast IPv6 Container Networking with BPF and XDP
 
Implementing an IPv6 Enabled Environment for a Public Cloud Tenant
Implementing an IPv6 Enabled Environment for a Public Cloud TenantImplementing an IPv6 Enabled Environment for a Public Cloud Tenant
Implementing an IPv6 Enabled Environment for a Public Cloud Tenant
 
Network Layer
Network LayerNetwork Layer
Network Layer
 

Más de Thomas Graf

BPF & Cilium - Turning Linux into a Microservices-aware Operating System
BPF  & Cilium - Turning Linux into a Microservices-aware Operating SystemBPF  & Cilium - Turning Linux into a Microservices-aware Operating System
BPF & Cilium - Turning Linux into a Microservices-aware Operating SystemThomas Graf
 
Cilium - Bringing the BPF Revolution to Kubernetes Networking and Security
Cilium - Bringing the BPF Revolution to Kubernetes Networking and SecurityCilium - Bringing the BPF Revolution to Kubernetes Networking and Security
Cilium - Bringing the BPF Revolution to Kubernetes Networking and SecurityThomas Graf
 
Accelerating Envoy and Istio with Cilium and the Linux Kernel
Accelerating Envoy and Istio with Cilium and the Linux KernelAccelerating Envoy and Istio with Cilium and the Linux Kernel
Accelerating Envoy and Istio with Cilium and the Linux KernelThomas Graf
 
Cilium - API-aware Networking and Security for Containers based on BPF
Cilium - API-aware Networking and Security for Containers based on BPFCilium - API-aware Networking and Security for Containers based on BPF
Cilium - API-aware Networking and Security for Containers based on BPFThomas Graf
 
Cilium - Network security for microservices
Cilium - Network security for microservicesCilium - Network security for microservices
Cilium - Network security for microservicesThomas Graf
 
Linux Native, HTTP Aware Network Security
Linux Native, HTTP Aware Network SecurityLinux Native, HTTP Aware Network Security
Linux Native, HTTP Aware Network SecurityThomas Graf
 
BPF: Next Generation of Programmable Datapath
BPF: Next Generation of Programmable DatapathBPF: Next Generation of Programmable Datapath
BPF: Next Generation of Programmable DatapathThomas Graf
 
Cilium - BPF & XDP for containers
Cilium - BPF & XDP for containersCilium - BPF & XDP for containers
Cilium - BPF & XDP for containersThomas Graf
 
LinuxCon 2015 Stateful NAT with OVS
LinuxCon 2015 Stateful NAT with OVSLinuxCon 2015 Stateful NAT with OVS
LinuxCon 2015 Stateful NAT with OVSThomas Graf
 
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)Thomas Graf
 
2015 FOSDEM - OVS Stateful Services
2015 FOSDEM - OVS Stateful Services2015 FOSDEM - OVS Stateful Services
2015 FOSDEM - OVS Stateful ServicesThomas Graf
 
Open vSwitch - Stateful Connection Tracking & Stateful NAT
Open vSwitch - Stateful Connection Tracking & Stateful NATOpen vSwitch - Stateful Connection Tracking & Stateful NAT
Open vSwitch - Stateful Connection Tracking & Stateful NATThomas Graf
 
The Next Generation Firewall for Red Hat Enterprise Linux 7 RC
The Next Generation Firewall for Red Hat Enterprise Linux 7 RCThe Next Generation Firewall for Red Hat Enterprise Linux 7 RC
The Next Generation Firewall for Red Hat Enterprise Linux 7 RCThomas Graf
 
SDN & NFV Introduction - Open Source Data Center Networking
SDN & NFV Introduction - Open Source Data Center NetworkingSDN & NFV Introduction - Open Source Data Center Networking
SDN & NFV Introduction - Open Source Data Center NetworkingThomas Graf
 

Más de Thomas Graf (14)

BPF & Cilium - Turning Linux into a Microservices-aware Operating System
BPF  & Cilium - Turning Linux into a Microservices-aware Operating SystemBPF  & Cilium - Turning Linux into a Microservices-aware Operating System
BPF & Cilium - Turning Linux into a Microservices-aware Operating System
 
Cilium - Bringing the BPF Revolution to Kubernetes Networking and Security
Cilium - Bringing the BPF Revolution to Kubernetes Networking and SecurityCilium - Bringing the BPF Revolution to Kubernetes Networking and Security
Cilium - Bringing the BPF Revolution to Kubernetes Networking and Security
 
Accelerating Envoy and Istio with Cilium and the Linux Kernel
Accelerating Envoy and Istio with Cilium and the Linux KernelAccelerating Envoy and Istio with Cilium and the Linux Kernel
Accelerating Envoy and Istio with Cilium and the Linux Kernel
 
Cilium - API-aware Networking and Security for Containers based on BPF
Cilium - API-aware Networking and Security for Containers based on BPFCilium - API-aware Networking and Security for Containers based on BPF
Cilium - API-aware Networking and Security for Containers based on BPF
 
Cilium - Network security for microservices
Cilium - Network security for microservicesCilium - Network security for microservices
Cilium - Network security for microservices
 
Linux Native, HTTP Aware Network Security
Linux Native, HTTP Aware Network SecurityLinux Native, HTTP Aware Network Security
Linux Native, HTTP Aware Network Security
 
BPF: Next Generation of Programmable Datapath
BPF: Next Generation of Programmable DatapathBPF: Next Generation of Programmable Datapath
BPF: Next Generation of Programmable Datapath
 
Cilium - BPF & XDP for containers
Cilium - BPF & XDP for containersCilium - BPF & XDP for containers
Cilium - BPF & XDP for containers
 
LinuxCon 2015 Stateful NAT with OVS
LinuxCon 2015 Stateful NAT with OVSLinuxCon 2015 Stateful NAT with OVS
LinuxCon 2015 Stateful NAT with OVS
 
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)
 
2015 FOSDEM - OVS Stateful Services
2015 FOSDEM - OVS Stateful Services2015 FOSDEM - OVS Stateful Services
2015 FOSDEM - OVS Stateful Services
 
Open vSwitch - Stateful Connection Tracking & Stateful NAT
Open vSwitch - Stateful Connection Tracking & Stateful NATOpen vSwitch - Stateful Connection Tracking & Stateful NAT
Open vSwitch - Stateful Connection Tracking & Stateful NAT
 
The Next Generation Firewall for Red Hat Enterprise Linux 7 RC
The Next Generation Firewall for Red Hat Enterprise Linux 7 RCThe Next Generation Firewall for Red Hat Enterprise Linux 7 RC
The Next Generation Firewall for Red Hat Enterprise Linux 7 RC
 
SDN & NFV Introduction - Open Source Data Center Networking
SDN & NFV Introduction - Open Source Data Center NetworkingSDN & NFV Introduction - Open Source Data Center Networking
SDN & NFV Introduction - Open Source Data Center Networking
 

Último

08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 

Último (20)

08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 

DevConf 2014 Kernel Networking Walkthrough

  • 1. Kernel Networking Walkthrough Thomas Graf – Principal Software Engineer Networking Services Red Hat Feb 7, 2014 1 Kernel Networking Walkthrough
  • 2. Agenda ● How does a packet get in and out of the net stack? ● ● How does a packet get through the net stack? ● ● 2 RX Handler, IP Processing, TCP Processing, TCP Fast Open How to account for memory and do flow control? ● ● NAPI, Busy Polling, RSS, RPS, XPS, GRO, TSO Socket Buffers, Flow Control, TCP Small Queues Q&A Kernel Networking Walkthrough
  • 3. Touring the Network Stack Expectation 3 Reality Kernel Networking Walkthrough
  • 4. How does a packet get in and out of the Network Stack? 4 Kernel Networking Walkthrough
  • 5. Receive & Transmit Process NIC Network Stack (Kernel Space) Ring Buffer Parse IP Parse TCP/UDP Socket Buffer read() Forward DMA Device? Ring Buffer 5 Local? Process (User Space) Task Construct IP Construct TCP/UDP Kernel Networking Walkthrough write() Socket Buffer
  • 6. The 3 ways into the Network Stack Interrupt Driven Network Stack Ring Buffer NAPI based Polling poll() Network Stack Ring Buffer Busy Polling busy_poll() Task Network Stack Ring Buffer 6 Kernel Networking Walkthrough
  • 7. RSS – Receive Side Scaling ● ● NIC distributes packets across multiple RX queues allowing for parallel processing. Separate IRQ per RX queue, thus selects CPU to run hardware interrupt handler on. RX-queue-1 CPU 1 RX-queue-2 CPU 3 filter RX-queue-3 CPU 1 RX-queue-4 CPU 5 7 Kernel Networking Walkthrough
  • 8. RPS – Receive Packet Steering ● Software filter to select CPU # for processing ● Use it to ... ... distribute single queue to multiple CPUs ... redo queue - CPU mapping RX-queue-1 RX-queue-2 RX-queue-3 RX-queue-4 8 CPU 1 CPU 1 CPU 2 CPU 2 CPU 3 CPU 3 Kernel Networking Walkthrough
  • 9. Hardware Offload ● RX/TX Checksumming ● ● Virtual LAN filtering and tag stripping ● ● 9 Perform CPU intensive checksumming in hardware. Strip 802.1Q header and store VLAN ID in network packet meta data. Filter out unsubscribed VLANs. Kernel Networking Walkthrough
  • 10. Generic Receive Offload NAPI based GRO poll() Network Stack Ring Buffer GRO MTU 10 Kernel Networking Walkthrough Up to 64K
  • 11. Segmentation Offload Up to 64K Network Stack Generic Segmentation Offload (GSO) MTU Ring Buffer TCP Segmentation Offload (TSO) MTU 11 Kernel Networking Walkthrough
  • 12. How does a packet get through the Network Stack? (c) Karen Sagovac 12 Kernel Networking Walkthrough
  • 13. Packet Processing Link Layer Packet Socket ETH_P_ALL Ingress QoS tcpdump Bridge Open vSwitch RX Handler Team Bonding macvlan macvtap IPv4 Proto Handler IPv6 ARP Feast of the hungry chicks IPX Drop 13 Kernel Networking Walkthrough ...
  • 14. IP Processing PREROUTING IP Handler INPUT Route Lookup Local Delivery Forwarding L4 (TCP, ...) FORWARD Route Lookup Link Layer IPv4 Construction POSTROUTING OUTPUT 14 Kernel Networking Walkthrough Local Output User Space
  • 15. TCP Processing IP Parse TCP Lookup Socket Socket Filter socket locked task exists Receive TCP Prequeue process context ← softirq Receive Socket Buffer read() poll() Task 15 Backlog Kernel Networking Walkthrough
  • 16. TCP Fast Open (net.ipv4.tcp_fastopen) Regular Fast Open Client 1st Req Server Client 1st Req SYN ACK SYN+ 2x RTT ACK+ HTTP GE Server 2x RTT T SYN ookie CK+C A SYN+ ACK+ HTTP GET Data 2nd Req Data 2nd Req SYN 1x RTT ACK SYN+ 2x RTT ACK+ HTTP GE T Data 16 Kernel Networking Walkthrough SYN+ Cook ie+ HTTP GET +Data +ACK SYN
  • 17. Memory Accounting & Flow Control 17 Kernel Networking Walkthrough
  • 18. Socket Buffers & Flow Control (net.ipv4.tcp_{r|w}mem) ssh ssh Block or EWOULDBLOCK write() rmem -= packet-size wmem overlimit? Socket Buffer rmem += packet-size wmem += packet-size rmem overlimit? Socket Buffer Reduce TCP Window TCP/IP TCP/IP TX Ring Buffer wmem -= packet-size 18 Kernel Networking Walkthrough RX Ring Buffer
  • 19. TCP Small Queues (net.ipv4.tcp_limit_output_bytes) ssh torrent write() write() Socket Buffer Socket Buffer TSQ: max 128Kb in flight per socket TCP/IP Queuing Discipline Driver TX Ring Buffer 19 Kernel Networking Walkthrough
  • 20. Q&A Feedback Page ● http://devconf.cz/f/1 Coming Up Next: NetworkManager for Enterprise Dan Williams 20 Kernel Networking Walkthrough