Based on kernel version 4.9. Page generated on 2016-12-21 14:36 EST.
Segmentation Offloads in the Linux Networking Stack

Introduction
============

This document describes a set of techniques in the Linux networking stack
to take advantage of segmentation offload capabilities of various NICs.

The following technologies are described:
 * TCP Segmentation Offload - TSO
 * UDP Fragmentation Offload - UFO
 * IPIP, SIT, GRE, UDP Tunnel, and Remote Checksum Offloads
 * Generic Segmentation Offload - GSO
 * Generic Receive Offload - GRO
 * Partial Generic Segmentation Offload - GSO_PARTIAL

TCP Segmentation Offload
========================

TCP segmentation allows a device to segment a single frame into multiple
frames with a data payload size specified in skb_shinfo()->gso_size.
When TCP segmentation is requested, the bit for either SKB_GSO_TCPV4 or
SKB_GSO_TCPV6 should be set in skb_shinfo()->gso_type and
skb_shinfo()->gso_size should be set to a non-zero value.

TCP segmentation is dependent on support for the use of partial checksum
offload. For this reason TSO is normally disabled if the Tx checksum
offload for a given device is disabled.

In order to support TCP segmentation offload it is necessary to populate
the network and transport header offsets of the skbuff so that the device
drivers will be able to determine the offsets of the IP or IPv6 header and
the TCP header. In addition, as CHECKSUM_PARTIAL is required, csum_start
should also point to the TCP header of the packet.

For IPv4 segmentation we support one of two types in terms of the IP ID.
The default behavior is to increment the IP ID with every segment. If the
GSO type SKB_GSO_TCP_FIXEDID is specified then we will not increment the IP
ID and all segments will use the same IP ID. If a device has
NETIF_F_TSO_MANGLEID set then the IP ID can be ignored when performing TSO
and we will either increment the IP ID for all frames, or leave it at a
static value based on driver preference.
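
The effect of gso_size and the two IP ID behaviors described above can be
sketched in userland C. This is only an illustrative model under stated
assumptions, not kernel or driver code: struct seg_desc, model_tso_segment()
and the fixed_id parameter are hypothetical names, with fixed_id standing in
for the SKB_GSO_TCP_FIXEDID case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userland model of one output segment; not a kernel type. */
struct seg_desc {
	size_t   payload_off;	/* offset of this segment's payload in the original frame */
	size_t   payload_len;	/* payload bytes carried by this segment */
	uint32_t tcp_seq;	/* TCP sequence number of the first payload byte */
	uint16_t ip_id;		/* IPv4 ID placed in this segment's header */
};

/*
 * Split payload_len bytes into segments of at most gso_size bytes each.
 * fixed_id != 0 models the fixed IP ID case: every segment reuses first_id.
 * Otherwise the IP ID increments with every segment (wrapping at 16 bits).
 * Returns the number of segments written, or 0 if out[] is too small.
 */
static size_t model_tso_segment(size_t payload_len, size_t gso_size,
				uint32_t first_seq, uint16_t first_id,
				int fixed_id,
				struct seg_desc *out, size_t max_out)
{
	size_t nsegs, i;

	assert(gso_size > 0);	/* gso_size must be a non-zero value */

	nsegs = (payload_len + gso_size - 1) / gso_size;	/* round up */
	if (nsegs > max_out)
		return 0;

	for (i = 0; i < nsegs; i++) {
		size_t off = i * gso_size;
		size_t left = payload_len - off;

		out[i].payload_off = off;
		out[i].payload_len = left < gso_size ? left : gso_size;
		out[i].tcp_seq = first_seq + (uint32_t)off;
		out[i].ip_id = fixed_id ? first_id : (uint16_t)(first_id + i);
	}
	return nsegs;
}
```

In this model a 4000 byte payload with a gso_size of 1448 yields three
segments, the last one carrying the remaining 1104 bytes.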

UDP Fragmentation Offload
=========================

UDP fragmentation offload allows a device to fragment an oversized UDP
datagram into multiple IPv4 fragments. Many of the requirements for UDP
fragmentation offload are the same as TSO. However, the IPv4 ID for
fragments should not increment, as a single IPv4 datagram is fragmented.

IPIP, SIT, GRE, UDP Tunnel, and Remote Checksum Offloads
========================================================

In addition to the offloads described above it is possible for a frame to
contain additional headers such as an outer tunnel. In order to account
for such instances an additional set of segmentation offload types was
introduced including SKB_GSO_IPIP, SKB_GSO_SIT, SKB_GSO_GRE, and
SKB_GSO_UDP_TUNNEL. These extra segmentation types are used to identify
cases where there is more than one set of headers. For example, in the
case of IPIP and SIT we should have the network and transport headers moved
from the standard list of headers to "inner" header offsets.

Currently only two levels of headers are supported. The convention is to
refer to the tunnel headers as the outer headers, while the encapsulated
data is normally referred to as the inner headers. Below is the list of
calls to access the given headers:

IPIP/SIT Tunnel:
            Outer                   Inner
MAC         skb_mac_header
Network     skb_network_header      skb_inner_network_header
Transport   skb_transport_header

UDP/GRE Tunnel:
            Outer                   Inner
MAC         skb_mac_header          skb_inner_mac_header
Network     skb_network_header      skb_inner_network_header
Transport   skb_transport_header    skb_inner_transport_header

In addition to the above tunnel types there are also SKB_GSO_GRE_CSUM and
SKB_GSO_UDP_TUNNEL_CSUM. These two additional tunnel types reflect the
fact that the outer header also requests to have a non-zero checksum
included in the outer header.
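
As a rough illustration of how the tunnel types above might be classified,
the following userland sketch tests a GSO type bitmask. All MODEL_* names,
the model_gso_*() helpers and the bit values are hypothetical and are not
meant to match the kernel's real bit assignments. The sketch assumes that a
*_CSUM type is set in place of its plain counterpart, so both are treated as
tunnel indicators.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for illustration only; not the kernel's enum. */
enum model_gso_type {
	MODEL_GSO_TCPV4           = 1 << 0,
	MODEL_GSO_IPIP            = 1 << 1,
	MODEL_GSO_SIT             = 1 << 2,
	MODEL_GSO_GRE             = 1 << 3,
	MODEL_GSO_GRE_CSUM        = 1 << 4,
	MODEL_GSO_UDP_TUNNEL      = 1 << 5,
	MODEL_GSO_UDP_TUNNEL_CSUM = 1 << 6,
};

#define MODEL_GSO_ALL		0x7fu
#define MODEL_GSO_L2_TUNNELS	(MODEL_GSO_GRE | MODEL_GSO_GRE_CSUM | \
				 MODEL_GSO_UDP_TUNNEL | MODEL_GSO_UDP_TUNNEL_CSUM)
#define MODEL_GSO_TUNNELS	(MODEL_GSO_IPIP | MODEL_GSO_SIT | \
				 MODEL_GSO_L2_TUNNELS)

/* True if the frame carries more than one set of headers. */
static bool model_gso_is_tunneled(unsigned int gso_type)
{
	assert((gso_type & ~MODEL_GSO_ALL) == 0);	/* reject unknown bits */
	return (gso_type & MODEL_GSO_TUNNELS) != 0;
}

/* True if the outer header requests a non-zero checksum. */
static bool model_gso_outer_csum(unsigned int gso_type)
{
	assert((gso_type & ~MODEL_GSO_ALL) == 0);
	return (gso_type & (MODEL_GSO_GRE_CSUM |
			    MODEL_GSO_UDP_TUNNEL_CSUM)) != 0;
}

/*
 * True for the UDP/GRE case, where the table lists inner MAC and inner
 * transport accessors; false for IPIP/SIT and for non-tunneled frames.
 */
static bool model_gso_has_inner_mac(unsigned int gso_type)
{
	assert((gso_type & ~MODEL_GSO_ALL) == 0);
	return (gso_type & MODEL_GSO_L2_TUNNELS) != 0;
}
```

A caller could, for example, pass MODEL_GSO_TCPV4 | MODEL_GSO_GRE_CSUM and
find that all three helpers return true.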

Finally there is SKB_GSO_TUNNEL_REMCSUM, which indicates that a given tunnel
header has requested a remote checksum offload. In this case the inner
headers will be left with a partial checksum and only the outer header
checksum will be computed.

Generic Segmentation Offload
============================

Generic segmentation offload is a pure software offload that is meant to
deal with cases where device drivers cannot perform the offloads described
above. What occurs in GSO is that a given skbuff will have its data broken
out over multiple skbuffs that have been resized to match the MSS provided
via skb_shinfo()->gso_size.

Before enabling any hardware segmentation offload a corresponding software
offload is required in GSO. Otherwise it becomes possible for a frame to
be re-routed between devices and end up being unable to be transmitted.

Generic Receive Offload
=======================

Generic receive offload is the complement to GSO. Ideally any frame
assembled by GRO should be segmented to create an identical sequence of
frames using GSO, and any sequence of frames segmented by GSO should be
able to be reassembled back to the original by GRO. The only exception to
this is the IPv4 ID in the case that the DF bit is set for a given IP
header. If the value of the IPv4 ID is not sequentially incrementing, it
will be altered so that it is sequentially incrementing when a frame
assembled via GRO is segmented via GSO.

Partial Generic Segmentation Offload
====================================

Partial generic segmentation offload is a hybrid between TSO and GSO. What
it effectively does is take advantage of certain traits of TCP and tunnels
so that instead of having to rewrite the packet headers for each segment
only the inner-most transport header and possibly the outer-most network
header need to be updated. This allows devices that do not support tunnel
offloads or tunnel offloads with checksum to still make use of segmentation.

With the partial offload what occurs is that all headers excluding the
inner transport header are updated such that they will contain the correct
values if the header is simply duplicated. The one exception to this
is the outer IPv4 ID field. It is up to the device drivers to guarantee
that the IPv4 ID field is incremented in the case that a given header does
not have the DF bit set.
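
The per-segment work left to a device under the partial offload can be
modeled roughly as follows. This is a userland sketch under stated
assumptions, not driver code: struct model_hdrs and model_partial_fixup()
are hypothetical names, only three header fields are tracked, and
MODEL_IP_DF is the DF bit of the IPv4 flags/fragment offset field in host
byte order.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_IP_DF 0x4000u	/* IPv4 Don't Fragment flag, host byte order */

/* Hypothetical subset of the header fields that matter for this sketch. */
struct model_hdrs {
	uint16_t outer_frag_off;	/* outer IPv4 flags + fragment offset */
	uint16_t outer_ip_id;		/* outer IPv4 ID */
	uint32_t inner_tcp_seq;		/* inner TCP sequence number */
};

/*
 * Build the headers for segment number seg_index (0-based) from the
 * template prepared by the stack. Everything is duplicated as-is except:
 *   - the inner TCP sequence number, advanced by the payload already sent;
 *   - the outer IPv4 ID, incremented only when DF is not set.
 */
static struct model_hdrs model_partial_fixup(struct model_hdrs tmpl,
					     uint32_t seg_index,
					     uint32_t gso_size)
{
	struct model_hdrs h = tmpl;	/* start from a plain duplicate */

	assert(gso_size > 0);

	h.inner_tcp_seq = tmpl.inner_tcp_seq + seg_index * gso_size;
	if (!(tmpl.outer_frag_off & MODEL_IP_DF))
		h.outer_ip_id = (uint16_t)(tmpl.outer_ip_id + seg_index);
	return h;
}
```

With DF clear, segment 2 of a template whose outer ID is 10 gets ID 12 in
this model; with DF set the ID stays at 10 for every segment.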