Based on kernel version 4.16.1. Page generated on 2018-04-09 11:53 EST.

The Intel Omni-Path (OPA) Virtual Network Interface Controller (VNIC)
feature supports Ethernet functionality over the Omni-Path fabric by
encapsulating the Ethernet packets between HFI nodes.

Architecture
=============
The pattern of exchange of Omni-Path encapsulated Ethernet packets
involves one or more virtual Ethernet switches overlaid on the Omni-Path
fabric topology. A subset of HFI nodes on the Omni-Path fabric is
permitted to exchange encapsulated Ethernet packets across a particular
virtual Ethernet switch. The virtual Ethernet switches are logical
abstractions achieved by configuring the HFI nodes on the fabric for
header generation and processing. In the simplest configuration, all HFI
nodes across the fabric exchange encapsulated Ethernet packets over a
single virtual Ethernet switch. A virtual Ethernet switch is effectively
an independent Ethernet network. The configuration is performed by an
Ethernet Manager (EM), which is part of the trusted Fabric Manager (FM)
application. HFI nodes can have multiple VNICs, each connected to a
different virtual Ethernet switch. The diagram below presents a case
of two virtual Ethernet switches with two HFI nodes.

                             +-------------------+
                             |      Subnet/      |
                             |     Ethernet      |
                             |      Manager      |
                             +-------------------+
                                /          /
                              /           /
                            /            /
                          /             /
+-----------------------------+  +------------------------------+
|  Virtual Ethernet Switch    |  |  Virtual Ethernet Switch     |
|  +---------+    +---------+ |  | +---------+    +---------+   |
|  | VPORT   |    |  VPORT  | |  | |  VPORT  |    |  VPORT  |   |
+--+---------+----+---------+-+  +-+---------+----+---------+---+
         |                 \        /                 |
         |                   \    /                   |
         |                     \/                     |
         |                    /  \                    |
         |                  /      \                  |
     +-----------+------------+  +-----------+------------+
     |   VNIC    |    VNIC    |  |    VNIC   |    VNIC    |
     +-----------+------------+  +-----------+------------+
     |          HFI           |  |          HFI           |
     +------------------------+  +------------------------+


The Omni-Path encapsulated Ethernet packet format is as described below.

Bits          Field
------------------------------------
Quad Word 0:
0-19          SLID (lower 20 bits)
20-30         Length (in Quad Words)
31            BECN bit
32-51         DLID (lower 20 bits)
52-56         SC (Service Class)
57-59         RC (Routing Control)
60            FECN bit
61-62         L2 (=10, 16B format)
63            LT (=1, Link Transfer Head Flit)

Quad Word 1:
0-7           L4 type (=0x78 ETHERNET)
8-11          SLID[23:20]
12-15         DLID[23:20]
16-31         PKEY
32-47         Entropy
48-63         Reserved

Quad Word 2:
0-15          Reserved
16-31         L4 header
32-63         Ethernet Packet

Quad Words 3 to N-1:
0-63          Ethernet packet (pad extended)

Quad Word N (last):
0-23          Ethernet packet (pad extended)
24-55         ICRC
56-61         Tail
62-63         LT (=01, Link Transfer Tail Flit)

The Ethernet packet is padded on the transmit side to ensure that the VNIC
OPA packet is quad word aligned. The 'Tail' field contains the number of
bytes padded. On the receive side, the 'Tail' field is read and the padding
is removed (along with the ICRC, Tail and OPA header) before the packet is
passed up the network stack.

The L4 header field contains the id of the virtual Ethernet switch that the
VNIC port belongs to.
On the receive side, this field is used to de-multiplex the
received VNIC packets to different VNIC ports.

Driver Design
==============
The Intel OPA VNIC software design is presented in the diagram below.
OPA VNIC functionality has a HW dependent component and a HW
independent component.

Support has been added for IB devices to allocate and free RDMA
netdev devices. The RDMA netdev supports interfacing with the network
stack, thus creating standard network interfaces. OPA_VNIC is an RDMA
netdev device type.

The HW dependent VNIC functionality is part of the HFI1 driver. It
implements the verbs to allocate and free the OPA_VNIC RDMA netdev.
It involves HW resource allocation/management for VNIC functionality.
It interfaces with the network stack and implements the required
net_device_ops functions. It expects Omni-Path encapsulated Ethernet
packets in the transmit path and provides HW access to them. It strips
the Omni-Path header from the received packets before passing them up
the network stack. It also implements the RDMA netdev control operations.

The OPA VNIC module implements the HW independent VNIC functionality.
It consists of two parts. The VNIC Ethernet Management Agent (VEMA)
registers itself with IB core as an IB client and interfaces with the
IB MAD stack. It exchanges management information with the Ethernet
Manager (EM) and the VNIC netdev. The VNIC netdev part allocates and frees
the OPA_VNIC RDMA netdev devices. It overrides the net_device_ops functions
set by the HW dependent VNIC driver where required to accommodate any
control operation. It also handles the encapsulation of Ethernet packets
with an Omni-Path header in the transmit path. For each VNIC interface, the
information required for encapsulation is configured by the EM via the VEMA
MAD interface.
It also passes any control information to the HW dependent driver
by invoking the RDMA netdev control operations.

        +-------------------+ +----------------------+
        |                   | |       Linux          |
        |     IB MAD        | |      Network         |
        |                   | |       Stack          |
        +-------------------+ +----------------------+
                 |               |          |
                 |               |          |
        +----------------------------+      |
        |                            |      |
        |      OPA VNIC Module       |      |
        |  (OPA VNIC RDMA Netdev     |      |
        |     & EMA functions)       |      |
        |                            |      |
        +----------------------------+      |
                    |                       |
                    |                       |
           +------------------+             |
           |     IB core      |             |
           +------------------+             |
                    |                       |
                    |                       |
        +--------------------------------------------+
        |                                            |
        |      HFI1 Driver with VNIC support         |
        |                                            |
        +--------------------------------------------+
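
The packet format and padding rule from the Architecture section can be
illustrated with a small user-space C sketch. It is not the driver's code.
It assumes that bit 0 in the format table is the least significant bit of
each 64-bit quad word, and that the Length field counts the whole packet
including padding, ICRC and Tail. The structure vnic_hdr_info and all
vnic_* helper names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte counts read off the packet format table. */
#define VNIC_HDR_BYTES       20u   /* QW0, QW1, and bits 0-31 of QW2 */
#define VNIC_ICRC_TAIL_BYTES 5u    /* ICRC (32) + Tail (6) + LT (2) bits */
#define VNIC_L4_TYPE_ETH     0x78u

/* Hypothetical per-packet inputs; not a kernel structure. */
struct vnic_hdr_info {
	uint32_t slid;     /* 24-bit source LID */
	uint32_t dlid;     /* 24-bit destination LID */
	uint8_t  sc;       /* 5-bit Service Class */
	uint8_t  rc;       /* 3-bit Routing Control */
	uint16_t pkey;
	uint16_t entropy;
	uint16_t vesw_id;  /* virtual Ethernet switch id (L4 header) */
};

/* Pad bytes that make header + payload + pad + ICRC/Tail a multiple of 8. */
static unsigned int vnic_pad_bytes(size_t eth_len)
{
	size_t unpadded = VNIC_HDR_BYTES + eth_len + VNIC_ICRC_TAIL_BYTES;

	return (unsigned int)((8u - unpadded % 8u) % 8u);
}

/* Whole-packet length in quad words (assumed meaning of the Length field). */
static unsigned int vnic_len_qwords(size_t eth_len)
{
	return (unsigned int)((VNIC_HDR_BYTES + eth_len +
			       vnic_pad_bytes(eth_len) +
			       VNIC_ICRC_TAIL_BYTES) / 8u);
}

/* Fill the header fields of quad words 0-2; BECN and FECN are left clear. */
static void vnic_build_hdr(const struct vnic_hdr_info *i, size_t eth_len,
			   uint64_t qw[3])
{
	unsigned int len = vnic_len_qwords(eth_len);

	assert(len <= 0x7FF);                        /* Length is 11 bits */

	qw[0] = (uint64_t)(i->slid & 0xFFFFF)        /* bits 0-19  */
	      | (uint64_t)len << 20                  /* bits 20-30 */
	      | (uint64_t)(i->dlid & 0xFFFFF) << 32  /* bits 32-51 */
	      | (uint64_t)(i->sc & 0x1F) << 52       /* bits 52-56 */
	      | (uint64_t)(i->rc & 0x7) << 57        /* bits 57-59 */
	      | (uint64_t)0x2 << 61                  /* L2 = binary 10 */
	      | (uint64_t)1 << 63;                   /* LT = 1 */

	qw[1] = (uint64_t)VNIC_L4_TYPE_ETH               /* bits 0-7   */
	      | (uint64_t)((i->slid >> 20) & 0xF) << 8   /* SLID[23:20] */
	      | (uint64_t)((i->dlid >> 20) & 0xF) << 12  /* DLID[23:20] */
	      | (uint64_t)i->pkey << 16                  /* bits 16-31 */
	      | (uint64_t)i->entropy << 32;              /* bits 32-47 */

	/* Bits 32-63 would carry the first Ethernet bytes; left zero here. */
	qw[2] = (uint64_t)i->vesw_id << 16;              /* L4 header */
}

/* Receive side: switch id used to pick the destination VNIC port. */
static uint16_t vnic_rx_vesw_id(const uint64_t qw[3])
{
	return (uint16_t)(qw[2] >> 16);
}

/* Receive side: Ethernet length after removing header, pad, ICRC, Tail. */
static size_t vnic_rx_eth_len(size_t total_bytes, uint64_t last_qw)
{
	unsigned int tail = (unsigned int)(last_qw >> 56) & 0x3F;

	return total_bytes - VNIC_HDR_BYTES - VNIC_ICRC_TAIL_BYTES - tail;
}
```

Under these assumptions, a 60-byte Ethernet frame has an unpadded size of
20 + 60 + 5 = 85 bytes, so 3 pad bytes bring the packet to 88 bytes, or
11 quad words. On receive, a Tail value of 3 and a total size of 88 bytes
give back the original 60-byte frame.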