Documentation / infiniband / opa_vnic.txt


Based on kernel version 4.16.1. Page generated on 2018-04-09 11:53 EST.

The Intel Omni-Path (OPA) Virtual Network Interface Controller (VNIC) feature
supports Ethernet functionality over the Omni-Path fabric by encapsulating
Ethernet packets exchanged between Host Fabric Interface (HFI) nodes.

Architecture
=============
The exchange of Omni-Path encapsulated Ethernet packets involves one or
more virtual Ethernet switches overlaid on the Omni-Path fabric topology.
A subset of the HFI nodes on the fabric is permitted to exchange
encapsulated Ethernet packets across a particular virtual Ethernet switch.
The virtual Ethernet switches are logical abstractions, achieved by
configuring the HFI nodes on the fabric for header generation and
processing. In the simplest configuration, all HFI nodes across the fabric
exchange encapsulated Ethernet packets over a single virtual Ethernet
switch. A virtual Ethernet switch is effectively an independent Ethernet
network. The configuration is performed by an Ethernet Manager (EM), which
is part of the trusted Fabric Manager (FM) application. An HFI node can
have multiple VNICs, each connected to a different virtual Ethernet switch.
The diagram below shows two virtual Ethernet switches and two HFI nodes.

                             +-------------------+
                             |      Subnet/      |
                             |     Ethernet      |
                             |      Manager      |
                             +-------------------+
                                /          /
                              /           /
                            /            /
                          /             /
+-----------------------------+  +------------------------------+
|  Virtual Ethernet Switch    |  |  Virtual Ethernet Switch     |
|  +---------+    +---------+ |  | +---------+    +---------+   |
|  | VPORT   |    |  VPORT  | |  | |  VPORT  |    |  VPORT  |   |
+--+---------+----+---------+-+  +-+---------+----+---------+---+
         |                 \        /                 |
         |                   \    /                   |
         |                     \/                     |
         |                    /  \                    |
         |                  /      \                  |
     +-----------+------------+  +-----------+------------+
     |   VNIC    |    VNIC    |  |    VNIC   |    VNIC    |
     +-----------+------------+  +-----------+------------+
     |          HFI           |  |          HFI           |
     +------------------------+  +------------------------+

The Omni-Path encapsulated Ethernet packet format is as described below.

Bits          Field
------------------------------------
Quad Word 0:
0-19          SLID (lower 20 bits)
20-30         Length (in Quad Words)
31            BECN bit
32-51         DLID (lower 20 bits)
52-56         SC (Service Class)
57-59         RC (Routing Control)
60            FECN bit
61-62         L2 (=10, 16B format)
63            LT (=1, Link Transfer Head Flit)

Quad Word 1:
0-7           L4 type (=0x78 ETHERNET)
8-11          SLID[23:20]
12-15         DLID[23:20]
16-31         PKEY
32-47         Entropy
48-63         Reserved

Quad Word 2:
0-15          Reserved
16-31         L4 header
32-63         Ethernet Packet

Quad Words 3 to N-1:
0-63          Ethernet packet (pad extended)

Quad Word N (last):
0-23          Ethernet packet (pad extended)
24-55         ICRC
56-61         Tail
62-63         LT (=01, Link Transfer Tail Flit)

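The layout of Quad Words 0 and 1 can be illustrated with a small user-space
sketch that packs the fields into 64-bit values. This is not the driver's
code: the struct, macro and function names are hypothetical, bit 0 is assumed
to be the least significant bit of each quad word, and the BECN and FECN bits
are simply left clear.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative field container; names are hypothetical, not kernel API. */
struct vnic_hdr_fields {
    uint32_t slid;     /* 24-bit source LID */
    uint32_t dlid;     /* 24-bit destination LID */
    uint16_t len_qw;   /* packet length in quad words (11 bits) */
    uint8_t  sc;       /* Service Class (5 bits) */
    uint8_t  rc;       /* Routing Control (3 bits) */
    uint16_t pkey;
    uint16_t entropy;
};

#define VNIC_L2_16B  0x2u    /* L2 field, binary 10 */
#define VNIC_L4_ETH  0x78u   /* L4 type for Ethernet */

/* Quad Word 0, assuming bit 0 is the least significant bit. */
static uint64_t vnic_pack_qw0(const struct vnic_hdr_fields *f)
{
    assert(f->len_qw < (1u << 11));
    return  (uint64_t)(f->slid & 0xFFFFF) |          /* bits 0-19  */
            ((uint64_t)f->len_qw << 20) |            /* bits 20-30 */
            /* bit 31 (BECN) left clear in this sketch */
            ((uint64_t)(f->dlid & 0xFFFFF) << 32) |  /* bits 32-51 */
            ((uint64_t)(f->sc & 0x1F) << 52) |       /* bits 52-56 */
            ((uint64_t)(f->rc & 0x7) << 57) |        /* bits 57-59 */
            /* bit 60 (FECN) left clear in this sketch */
            ((uint64_t)VNIC_L2_16B << 61) |          /* bits 61-62 */
            ((uint64_t)1 << 63);                     /* bit 63, LT */
}

/* Quad Word 1; the reserved bits 48-63 stay zero. */
static uint64_t vnic_pack_qw1(const struct vnic_hdr_fields *f)
{
    return  (uint64_t)VNIC_L4_ETH |                      /* bits 0-7   */
            ((uint64_t)((f->slid >> 20) & 0xF) << 8) |   /* bits 8-11  */
            ((uint64_t)((f->dlid >> 20) & 0xF) << 12) |  /* bits 12-15 */
            ((uint64_t)f->pkey << 16) |                  /* bits 16-31 */
            ((uint64_t)f->entropy << 32);                /* bits 32-47 */
}
```

Note how a 24-bit LID is split: its lower 20 bits go into Quad Word 0 and
its upper 4 bits into Quad Word 1.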
The Ethernet packet is padded on the transmit side to ensure that the VNIC
OPA packet is quad word aligned. The 'Tail' field contains the number of
bytes padded. On the receive side, the 'Tail' field is read and the padding
is removed (along with the ICRC, Tail and OPA header) before the packet is
passed up the network stack.

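The padding arithmetic can be sketched from the packet format table. Reading
that table, the header occupies Quad Words 0 and 1 plus the low 32 bits of
Quad Word 2 (20 bytes), and the trailer is the ICRC plus the byte holding
Tail and LT (5 bytes). Those two constants and all names below are
assumptions made for illustration, not values taken from the driver source.

```c
#include <assert.h>
#include <stddef.h>

#define VNIC_HDR_BYTES    20u  /* QW0 + QW1 + low 32 bits of QW2 (assumed) */
#define VNIC_TRAIL_BYTES  5u   /* ICRC (4 bytes) + Tail/LT byte (assumed) */

/* Pad bytes needed so that the whole packet is a multiple of 8 bytes. */
static unsigned int vnic_pad_bytes(size_t eth_len)
{
    size_t unpadded = VNIC_HDR_BYTES + eth_len + VNIC_TRAIL_BYTES;

    return (unsigned int)((8u - (unpadded & 7u)) & 7u);
}

/* Total packet length in quad words after padding. */
static size_t vnic_len_qw(size_t eth_len)
{
    return (VNIC_HDR_BYTES + eth_len + vnic_pad_bytes(eth_len) +
            VNIC_TRAIL_BYTES) / 8u;
}

/* Receive side: recover the Ethernet length from the total and the Tail. */
static size_t vnic_eth_len(size_t total_bytes, unsigned int tail)
{
    assert(total_bytes >= VNIC_HDR_BYTES + VNIC_TRAIL_BYTES + tail);
    return total_bytes - VNIC_HDR_BYTES - VNIC_TRAIL_BYTES - tail;
}
```

Under these assumptions a 60-byte Ethernet frame needs 3 pad bytes, giving an
88-byte (11 quad word) packet; the receiver reads Tail = 3 and recovers the
original 60 bytes.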
The L4 header field contains the ID of the virtual Ethernet switch that the
VNIC port belongs to. On the receive side, this field is used to
de-multiplex the received VNIC packets to the different VNIC ports.

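The receive-side de-multiplexing can be pictured as a lookup keyed on the L4
header. The sketch below is a user-space model with hypothetical names. It
assumes bit 0 is the least significant bit of Quad Word 2 and, as a
simplification, treats the whole 16-bit L4 header as the switch ID; a real
implementation may use only part of the field and a faster lookup structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-port record; a real driver would reference a netdev. */
struct vnic_port {
    uint16_t vesw_id;          /* virtual Ethernet switch of this port */
    unsigned long rx_packets;  /* example per-port counter */
};

/* L4 header: bits 16-31 of Quad Word 2, assuming bit 0 is the LSB. */
static uint16_t vnic_l4_hdr(uint64_t qw2)
{
    return (uint16_t)((qw2 >> 16) & 0xFFFF);
}

/* Linear search for the port on the matching switch; NULL if none. */
static struct vnic_port *vnic_demux(struct vnic_port *ports, size_t n,
                                    uint64_t qw2)
{
    uint16_t id = vnic_l4_hdr(qw2);

    assert(ports != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (ports[i].vesw_id == id)
            return &ports[i];
    }
    return NULL;
}
```

A packet whose switch ID matches no local VNIC port yields NULL here; what
the receiver does with such a packet is not specified in this document.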
Driver Design
==============
The Intel OPA VNIC software design is presented in the diagram at the end
of this section. OPA VNIC functionality has a HW dependent component and a
HW independent component.

Support has been added for an IB device to allocate and free RDMA netdev
devices. An RDMA netdev interfaces with the network stack, thus creating
standard network interfaces. OPA_VNIC is an RDMA netdev device type.

The HW dependent VNIC functionality is part of the HFI1 driver. It
implements the verbs to allocate and free the OPA_VNIC RDMA netdev, which
involves HW resource allocation and management for VNIC functionality.
It interfaces with the network stack and implements the required
net_device_ops functions. It expects Omni-Path encapsulated Ethernet
packets in the transmit path and provides HW access to them. It strips
the Omni-Path header from received packets before passing them up
the network stack. It also implements the RDMA netdev control operations.

The OPA VNIC module implements the HW independent VNIC functionality.
It consists of two parts. The VNIC Ethernet Management Agent (VEMA)
registers itself with IB core as an IB client and interfaces with the
IB MAD stack. It exchanges management information with the Ethernet
Manager (EM) and the VNIC netdev. The VNIC netdev part allocates and frees
the OPA_VNIC RDMA netdev devices. It overrides the net_device_ops functions
set by the HW dependent VNIC driver where required to accommodate any
control operation. It also handles the encapsulation of Ethernet packets
with an Omni-Path header in the transmit path. For each VNIC interface,
the information required for encapsulation is configured by the EM via the
VEMA MAD interface. The VNIC netdev part also passes any control
information to the HW dependent driver by invoking the RDMA netdev control
operations.

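The split between the two components on the transmit path can be modelled in
user space as one operations table wrapping another. Everything in this
sketch is a simplified stand-in: struct dev_ops is not struct net_device_ops,
the 20-byte zeroed header is a placeholder for the real Omni-Path header, and
none of the names come from the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for a packet buffer and a netdev ops table. */
struct pkt {
    unsigned char data[64];
    size_t len;
};

struct dev_ops {
    int (*start_xmit)(struct pkt *p);
};

#define MODEL_HDR_LEN 20u  /* placeholder header length (assumed) */

static size_t hw_last_len;  /* records what the "HW" side was handed */

/* HW dependent side: expects an already encapsulated packet. */
static int hw_start_xmit(struct pkt *p)
{
    hw_last_len = p->len;
    return 0;
}

static const struct dev_ops hw_ops = { .start_xmit = hw_start_xmit };

/* HW independent side keeps a pointer to the ops it wraps. */
static const struct dev_ops *saved_hw_ops = &hw_ops;

/* Prepend a placeholder header, then delegate to the HW dependent op. */
static int vnic_start_xmit(struct pkt *p)
{
    assert(saved_hw_ops && saved_hw_ops->start_xmit);
    if (p->len + MODEL_HDR_LEN > sizeof(p->data))
        return -1;  /* no room for the header in this toy buffer */
    memmove(p->data + MODEL_HDR_LEN, p->data, p->len);
    memset(p->data, 0, MODEL_HDR_LEN);
    p->len += MODEL_HDR_LEN;
    return saved_hw_ops->start_xmit(p);
}

/* The table installed on the interface in place of hw_ops. */
static const struct dev_ops vnic_ops = { .start_xmit = vnic_start_xmit };
```

The point of the pattern is that the network stack calls only the overriding
table, while the HW dependent function still receives the packet in the
encapsulated form it expects.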
        +-------------------+ +----------------------+
        |                   | |       Linux          |
        |     IB MAD        | |      Network         |
        |                   | |       Stack          |
        +-------------------+ +----------------------+
                 |               |          |
                 |               |          |
        +----------------------------+      |
        |                            |      |
        |      OPA VNIC Module       |      |
        |  (OPA VNIC RDMA Netdev     |      |
        |     & EMA functions)       |      |
        |                            |      |
        +----------------------------+      |
                    |                       |
                    |                       |
           +------------------+             |
           |     IB core      |             |
           +------------------+             |
                    |                       |
                    |                       |
        +--------------------------------------------+
        |                                            |
        |      HFI1 Driver with VNIC support         |
        |                                            |
        +--------------------------------------------+