Documentation / networking / segmentation-offloads.txt


Based on kernel version 4.16.1. Page generated on 2018-04-09 11:53 EST.

Segmentation Offloads in the Linux Networking Stack

Introduction
============

This document describes a set of techniques in the Linux networking stack
that take advantage of the segmentation offload capabilities of various NICs.

The following technologies are described:
 * TCP Segmentation Offload - TSO
 * UDP Fragmentation Offload - UFO
 * IPIP, SIT, GRE, UDP Tunnel, and Remote Checksum Offloads
 * Generic Segmentation Offload - GSO
 * Generic Receive Offload - GRO
 * Partial Generic Segmentation Offload - GSO_PARTIAL
 * SCTP acceleration with GSO - GSO_BY_FRAGS

TCP Segmentation Offload
========================

TCP segmentation allows a device to segment a single frame into multiple
frames with a data payload size specified in skb_shinfo()->gso_size.
When TCP segmentation is requested, the bit for either SKB_GSO_TCPV4 or
SKB_GSO_TCPV6 should be set in skb_shinfo()->gso_type and
skb_shinfo()->gso_size should be set to a non-zero value.
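The condition above can be sketched as a small userspace model.  This is an
illustration under stated assumptions, not kernel code: struct model_shinfo,
the MODEL_GSO_* bit values and model_tso_requested() are hypothetical names
standing in for skb_shinfo() and the real SKB_GSO_* flags.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only (assumption); the real SKB_GSO_* constants
 * live in the kernel headers and are not reproduced here. */
enum model_gso_type {
	MODEL_GSO_TCPV4 = 1u << 0,
	MODEL_GSO_TCPV6 = 1u << 1,
};

/* Hypothetical stand-in for the two skb_shinfo() fields discussed above. */
struct model_shinfo {
	uint32_t gso_type;
	uint16_t gso_size;
};

/* A TCP segmentation request needs a TCP type bit and a non-zero gso_size. */
static bool model_tso_requested(const struct model_shinfo *si)
{
	bool is_tcp = (si->gso_type & (MODEL_GSO_TCPV4 | MODEL_GSO_TCPV6)) != 0;

	return is_tcp && si->gso_size != 0;
}
```

With gso_type set to MODEL_GSO_TCPV4 and gso_size set to 1448 the check
passes; with gso_size left at 0 it fails regardless of gso_type.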

TCP segmentation is dependent on support for the use of partial checksum
offload.  For this reason TSO is normally disabled if the Tx checksum
offload for a given device is disabled.

In order to support TCP segmentation offload it is necessary to populate
the network and transport header offsets of the skbuff so that the device
drivers will be able to determine the offsets of the IP or IPv6 header and
the TCP header.  In addition, as CHECKSUM_PARTIAL is required, csum_start
should also point to the TCP header of the packet.
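As a rough illustration of these offsets, the sketch below computes them for
an untagged Ethernet frame.  It is a simplified model, not kernel code: the
names are hypothetical, and offsets are counted from the start of the frame,
whereas the kernel keeps them relative to the skbuff's buffer.

```c
#include <stdint.h>

#define MODEL_ETH_HLEN 14u	/* Ethernet header without a VLAN tag */

/* Hypothetical stand-in for the skbuff offsets named above, in bytes from
 * the start of the frame (a simplification of the real skbuff layout). */
struct model_offsets {
	uint16_t network_header;
	uint16_t transport_header;
	uint16_t csum_start;
};

/* ip_hdr_len: 20 to 60 for IPv4, 40 for IPv6 without extension headers. */
static struct model_offsets model_tso_offsets(uint16_t ip_hdr_len)
{
	struct model_offsets o;

	o.network_header = MODEL_ETH_HLEN;
	o.transport_header = (uint16_t)(o.network_header + ip_hdr_len);
	/* With CHECKSUM_PARTIAL, checksumming starts at the TCP header. */
	o.csum_start = o.transport_header;
	return o;
}
```

For an IPv4 header without options (20 bytes) this places the network header
at offset 14 and both the transport header and csum_start at offset 34.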

For IPv4 segmentation we support one of two behaviors for the IP ID.
The default behavior is to increment the IP ID with every segment.  If the
GSO type SKB_GSO_TCP_FIXEDID is specified then we will not increment the IP
ID and all segments will use the same IP ID.  If a device has
NETIF_F_TSO_MANGLEID set then the IP ID can be ignored when performing TSO
and we will either increment the IP ID for all frames, or leave it at a
static value based on driver preference.
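The two IP ID behaviors can be sketched as follows.  This is a minimal model
with a hypothetical function name; a device with NETIF_F_TSO_MANGLEID may
pick either behavior, and that case is not modeled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns the IPv4 ID for segment number seg_index (0-based).
 * fixed_id models SKB_GSO_TCP_FIXEDID being set on the skb. */
static uint16_t model_segment_ip_id(uint16_t first_id, unsigned int seg_index,
				    bool fixed_id)
{
	if (fixed_id)
		return first_id;

	/* Default: one increment per segment; the 16-bit field wraps. */
	return (uint16_t)(first_id + seg_index);
}
```

Starting from ID 100, the fourth segment (index 3) carries ID 103 by default
and ID 100 when the fixed-ID type is requested.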

UDP Fragmentation Offload
=========================

UDP fragmentation offload allows a device to fragment an oversized UDP
datagram into multiple IPv4 fragments.  Many of the requirements for UDP
fragmentation offload are the same as for TSO.  However, the IPv4 ID for
fragments should not increment, because all fragments belong to a single
IPv4 datagram.
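To make the fragmentation arithmetic concrete, here is a small sketch that
counts the IPv4 fragments needed for one UDP datagram.  It assumes an IPv4
header without options and uses hypothetical names; it models standard IPv4
fragmentation rules rather than any particular device.

```c
#define MODEL_IPV4_HLEN 20u	/* IPv4 header without options (assumption) */
#define MODEL_UDP_HLEN   8u

/* Largest IP payload per fragment.  Every fragment except the last must
 * carry a multiple of 8 bytes.  Assumes mtu > MODEL_IPV4_HLEN. */
static unsigned int model_frag_payload(unsigned int mtu)
{
	return (mtu - MODEL_IPV4_HLEN) & ~7u;
}

/* Number of fragments for a datagram with udp_payload_len bytes of data.
 * All of them would carry the same IPv4 ID. */
static unsigned int model_ufo_fragments(unsigned int udp_payload_len,
					unsigned int mtu)
{
	unsigned int total = MODEL_UDP_HLEN + udp_payload_len;
	unsigned int per_frag = model_frag_payload(mtu);

	return (total + per_frag - 1) / per_frag;
}
```

With an MTU of 1500, each fragment carries up to 1480 bytes, so a datagram
with 4000 bytes of UDP payload (4008 bytes including the UDP header) is sent
as three fragments.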

UFO is deprecated: modern kernels will no longer generate UFO skbs, but can
still receive them from tuntap and similar devices. Offload of UDP-based
tunnel protocols is still supported.

IPIP, SIT, GRE, UDP Tunnel, and Remote Checksum Offloads
========================================================

In addition to the offloads described above it is possible for a frame to
contain additional headers such as an outer tunnel.  In order to account
for such instances an additional set of segmentation offload types was
introduced including SKB_GSO_IPXIP4, SKB_GSO_IPXIP6, SKB_GSO_GRE, and
SKB_GSO_UDP_TUNNEL.  These extra segmentation types are used to identify
cases where there is more than one set of headers.  For example in the
case of IPIP and SIT we should have the network and transport headers moved
from the standard list of headers to "inner" header offsets.

Currently only two levels of headers are supported.  The convention is to
refer to the tunnel headers as the outer headers, while the encapsulated
data is normally referred to as the inner headers.  Below is the list of
calls to access the given headers:

IPIP/SIT Tunnel:
		Outer			Inner
MAC		skb_mac_header
Network		skb_network_header	skb_inner_network_header
Transport	skb_transport_header

UDP/GRE Tunnel:
		Outer			Inner
MAC		skb_mac_header		skb_inner_mac_header
Network		skb_network_header	skb_inner_network_header
Transport	skb_transport_header	skb_inner_transport_header

In addition to the above tunnel types there are also SKB_GSO_GRE_CSUM and
SKB_GSO_UDP_TUNNEL_CSUM.  These two additional tunnel types indicate that
the outer header also requests a non-zero checksum.

Finally there is SKB_GSO_TUNNEL_REMCSUM which indicates that a given tunnel
header has requested a remote checksum offload.  In this case the inner
headers will be left with a partial checksum and only the outer header
checksum will be computed.

Generic Segmentation Offload
============================

Generic segmentation offload is a pure software offload that is meant to
deal with cases where device drivers cannot perform the offloads described
above.  In GSO, the data of a given skbuff is split across multiple
skbuffs, each sized to match the MSS provided via
skb_shinfo()->gso_size.
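The splitting arithmetic can be sketched in a few lines.  The struct and
function names below are hypothetical, and the model only counts payload
bytes; it does not copy headers or build real skbuffs.

```c
#include <stddef.h>

/* Hypothetical summary of how one large payload is split. */
struct model_seg_plan {
	size_t nsegs;		/* number of resulting skbuffs */
	size_t last_len;	/* payload bytes in the final skbuff */
};

static struct model_seg_plan model_gso_plan(size_t payload_len,
					    size_t gso_size)
{
	struct model_seg_plan p = { 0, 0 };

	if (gso_size == 0 || payload_len == 0)
		return p;	/* nothing to segment */

	/* Every segment is gso_size bytes except possibly the last one. */
	p.nsegs = (payload_len + gso_size - 1) / gso_size;
	p.last_len = payload_len - (p.nsegs - 1) * gso_size;
	return p;
}
```

For example, 64000 bytes of payload with a gso_size of 1448 yields 45
segments: 44 full ones and a final segment of 288 bytes.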

Before enabling any hardware segmentation offload a corresponding software
offload is required in GSO.  Otherwise a frame could be re-routed to a
device that lacks the hardware offload and end up impossible to transmit.

Generic Receive Offload
=======================

Generic receive offload is the complement to GSO.  Ideally any frame
assembled by GRO should be segmented to create an identical sequence of
frames using GSO, and any sequence of frames segmented by GSO should be
able to be reassembled back to the original by GRO.  The only exception to
this is the IPv4 ID in the case that the DF bit is set for a given IP
header.  If the value of the IPv4 ID is not sequentially incrementing, it
will be altered so that it is when a frame assembled via GRO is segmented
via GSO.

Partial Generic Segmentation Offload
====================================

Partial generic segmentation offload is a hybrid between TSO and GSO.  It
takes advantage of certain traits of TCP and tunnels so that, instead of
having to rewrite the packet headers for each segment, only the inner-most
transport header and possibly the outer-most network header need to be
updated.  This allows devices that do not support tunnel offloads or tunnel
offloads with checksum to still make use of segmentation.

With the partial offload, all headers excluding the inner transport header
are updated such that they contain the values that would be correct if the
header were simply duplicated.  The one exception to this is the outer IPv4
ID field.  It is up to the device drivers to guarantee that the IPv4 ID
field is incremented in the case that a given header does not have the DF
bit set.

SCTP acceleration with GSO
==========================

SCTP - despite the lack of hardware support - can still take advantage of
GSO to pass one large packet through the network stack, rather than
multiple small packets.

This requires a different approach from other offloads, as SCTP packets
cannot be just segmented to (P)MTU. Rather, the chunks must be contained in
IP segments, padding respected. So unlike regular GSO, SCTP can't just
generate a big skb, set gso_size to the fragmentation point and deliver it
to the IP layer.

Instead, the SCTP protocol layer builds an skb with the segments correctly
padded and stored as chained skbs, and skb_segment() splits based on those.
To signal this, gso_size is set to the special value GSO_BY_FRAGS.

Therefore, any code in the core networking stack must be aware of the
possibility that gso_size will be GSO_BY_FRAGS and handle that case
appropriately.

There are some helpers to make this easier:

 - skb_is_gso(skb) && skb_is_gso_sctp(skb) is the best way to see if
   an skb is an SCTP GSO skb.

 - For size checks, the skb_gso_validate_*_len family of helpers correctly
   considers GSO_BY_FRAGS.

 - For manipulating packets, skb_increase_gso_size and skb_decrease_gso_size
   will check for GSO_BY_FRAGS and WARN if asked to manipulate these skbs.
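The kind of care these helpers take can be illustrated with a userspace
model.  Everything below is a sketch with hypothetical names, not the kernel
helpers themselves; the sentinel value 0xFFFF is used here as an assumption
about how GSO_BY_FRAGS is defined.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_GSO_BY_FRAGS 0xFFFFu	/* sentinel, not a real size (assumed) */

/* Hypothetical, simplified view of a GSO skb. */
struct model_gso_skb {
	uint16_t gso_size;
	unsigned int hdr_len;		/* header bytes added to each segment */
	const unsigned int *frag_len;	/* chained skb lengths (BY_FRAGS only) */
	size_t nfrags;
};

/* True if every resulting segment, headers included, fits in max_len. */
static bool model_gso_fits(const struct model_gso_skb *skb,
			   unsigned int max_len)
{
	size_t i;

	if (skb->gso_size != MODEL_GSO_BY_FRAGS)
		return skb->hdr_len + skb->gso_size <= max_len;

	/* BY_FRAGS: gso_size is meaningless, so check each chained skb. */
	for (i = 0; i < skb->nfrags; i++)
		if (skb->hdr_len + skb->frag_len[i] > max_len)
			return false;
	return true;
}

/* Refuses to do arithmetic on the sentinel instead of corrupting it. */
static bool model_decrease_gso_size(struct model_gso_skb *skb, uint16_t dec)
{
	if (skb->gso_size == MODEL_GSO_BY_FRAGS || dec >= skb->gso_size)
		return false;
	skb->gso_size -= dec;
	return true;
}
```

The design point is that gso_size must never be treated as a length without
first ruling out the sentinel: the size check walks the chained skbs instead,
and the size adjustment declines to touch such an skb at all.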

This also affects drivers with the NETIF_F_FRAGLIST & NETIF_F_GSO_SCTP bits
set. Note also that NETIF_F_GSO_SCTP is included in NETIF_F_GSO_SOFTWARE.