
Documentation / crypto / async-tx-api.txt

Based on kernel version 2.6.26. Page generated on 2008-07-16 21:12 EST.

		 Asynchronous Transfers/Transforms API

1 INTRODUCTION

2 GENEALOGY

3 USAGE
3.1 General format of the API
3.2 Supported operations
3.3 Descriptor management
3.4 When does the operation execute?
3.5 When does the operation complete?
3.6 Constraints
3.7 Example

4 DRIVER DEVELOPMENT NOTES
4.1 Conformance points
4.2 "My application needs finer control of hardware channels"

5 SOURCE

---

1 INTRODUCTION

The async_tx API provides methods for describing a chain of asynchronous
bulk memory transfers/transforms with support for inter-transactional
dependencies.  It is implemented as a dmaengine client that smooths over
the details of different hardware offload engine implementations.  Code
that is written to the API can optimize for asynchronous operation and
the API will fit the chain of operations to the available offload
resources.

2 GENEALOGY

The API was initially designed to offload the memory copy and
xor-parity-calculations of the md-raid5 driver using the offload engines
present in the Intel(R) XScale series of I/O processors.  It also built
on the 'dmaengine' layer developed for offloading memory copies in the
network stack using Intel(R) I/OAT engines.  The following design
features surfaced as a result:
1/ implicit synchronous path: users of the API do not need to know if
   the platform they are running on has offload capabilities.  The
   operation will be offloaded when an engine is available and carried out
   in software otherwise.
2/ cross channel dependency chains: the API allows a chain of dependent
   operations to be submitted, like xor->copy->xor in the raid5 case.  The
   API automatically handles cases where the transition from one operation
   to another implies a hardware channel switch.
3/ dmaengine extensions to support multiple clients and operation types
   beyond 'memcpy'.
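
The implicit synchronous path in point 1/ can be pictured with a small
user-space model.  This is only a sketch, not kernel code: model_desc,
model_engine, model_memcpy(), model_engine_run() and model_count_cb()
are hypothetical names invented for illustration, and the single-slot
engine is an assumption.  The caller makes the same call whether or not
an engine is present.

```c
#include <stddef.h>
#include <string.h>

typedef void (*model_callback)(void *param);

/* Hypothetical stand-in for a queued operation; this is not the
 * kernel's struct dma_async_tx_descriptor. */
struct model_desc {
	void *dest;
	const void *src;
	size_t len;
	model_callback cb;
	void *cb_param;
	int queued;
};

/* Hypothetical offload engine with a single descriptor slot. */
struct model_engine {
	int present;
	struct model_desc slot;
};

/* Example callback: counts completions through an int pointer. */
static void model_count_cb(void *param)
{
	++*(int *)param;
}

/* Queue the copy when an engine with a free slot exists and return its
 * descriptor.  Otherwise copy in software, run the callback in the
 * caller's context and return NULL. */
static struct model_desc *
model_memcpy(struct model_engine *eng, void *dest, const void *src,
	     size_t len, model_callback cb, void *cb_param)
{
	if (eng && eng->present && !eng->slot.queued) {
		struct model_desc *d = &eng->slot;

		d->dest = dest;
		d->src = src;
		d->len = len;
		d->cb = cb;
		d->cb_param = cb_param;
		d->queued = 1;
		return d;
	}

	memcpy(dest, src, len);
	if (cb)
		cb(cb_param);
	return NULL;
}

/* Stand-in for the hardware finishing the queued operation. */
static void model_engine_run(struct model_engine *eng)
{
	struct model_desc *d = &eng->slot;

	if (!d->queued)
		return;
	memcpy(d->dest, d->src, d->len);
	d->queued = 0;
	if (d->cb)
		d->cb(d->cb_param);
}
```

In this model a NULL return means the copy has already been carried out
in software and the callback has already run in the caller's context.  A
non-NULL return means the copy and the callback happen later, when
model_engine_run() stands in for the hardware.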

3 USAGE

3.1 General format of the API:
struct dma_async_tx_descriptor *
async_<operation>(<op specific parameters>,
		  enum async_tx_flags flags,
		  struct dma_async_tx_descriptor *dependency,
		  dma_async_tx_callback callback_routine,
		  void *callback_parameter);

3.2 Supported operations:
memcpy       - memory copy between a source and a destination buffer
memset       - fill a destination buffer with a byte value
xor          - xor a series of source buffers and write the result to a
	       destination buffer
xor_zero_sum - xor a series of source buffers and set a flag if the
	       result is zero.  The implementation attempts to avoid
	       writes to memory
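
As a rough illustration of what the xor and xor_zero_sum operations
compute, the following user-space sketch performs both in software on
plain byte buffers.  model_xor() and model_xor_zero_sum() are
hypothetical helpers, not the async_tx entry points, and the return
convention (1 when the result is zero) is this model's own choice.

```c
#include <stddef.h>

/* dest[i] = srcs[0][i] ^ srcs[1][i] ^ ... ^ srcs[src_cnt - 1][i] */
static void model_xor(unsigned char *dest, unsigned char *const *srcs,
		      int src_cnt, size_t len)
{
	size_t i;
	int s;

	for (i = 0; i < len; i++) {
		unsigned char acc = 0;

		for (s = 0; s < src_cnt; s++)
			acc ^= srcs[s][i];
		dest[i] = acc;
	}
}

/* Return 1 when the xor of all sources is zero in every byte, else 0.
 * Only a local accumulator is used; no buffer is written. */
static int model_xor_zero_sum(unsigned char *const *srcs, int src_cnt,
			      size_t len)
{
	size_t i;
	int s;

	for (i = 0; i < len; i++) {
		unsigned char acc = 0;

		for (s = 0; s < src_cnt; s++)
			acc ^= srcs[s][i];
		if (acc)
			return 0;
	}
	return 1;
}
```

A parity block produced by model_xor() over a set of data blocks makes
model_xor_zero_sum() return 1 when it is checked together with those
data blocks, and 0 once any byte in the set is corrupted.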

3.3 Descriptor management:
The return value is non-NULL and points to a 'descriptor' when the operation
has been queued to execute asynchronously.  Descriptors are recycled
resources, under control of the offload engine driver, to be reused as
operations complete.  When an application needs to submit a chain of
operations it must guarantee that the descriptor is not automatically recycled
before the dependent operation is submitted.  This requires that all
descriptors be acknowledged by the application before the offload engine
driver is allowed to recycle (or free) the descriptor.  A descriptor can be
acked by one of the following methods:
1/ setting the ASYNC_TX_ACK flag if no child operations are to be submitted
2/ setting the ASYNC_TX_DEP_ACK flag to acknowledge the parent
   descriptor of a new operation.
3/ calling async_tx_ack() on the descriptor.
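
The acknowledgement rule can be modelled in user space as follows.  This
is a sketch under stated assumptions, not how any particular driver
manages its descriptors: the two-entry pool and the names model_pool,
model_get(), model_ack(), model_complete() and model_cleanup() are all
hypothetical.  The point it shows is that a completed descriptor is not
recycled until it has also been acked.

```c
#include <stddef.h>

#define MODEL_POOL_SIZE 2	/* hypothetical pool size */

struct model_desc {
	int in_use;	/* handed out to the application */
	int done;	/* operation has completed */
	int acked;	/* application no longer needs the descriptor */
};

struct model_pool {
	struct model_desc descs[MODEL_POOL_SIZE];
};

/* Hand out a free descriptor, or NULL when the pool is exhausted. */
static struct model_desc *model_get(struct model_pool *pool)
{
	int i;

	for (i = 0; i < MODEL_POOL_SIZE; i++) {
		struct model_desc *d = &pool->descs[i];

		if (!d->in_use) {
			d->in_use = 1;
			d->done = 0;
			d->acked = 0;
			return d;
		}
	}
	return NULL;
}

/* Application side: the descriptor may now be recycled. */
static void model_ack(struct model_desc *d)
{
	d->acked = 1;
}

/* Engine side: the operation has finished. */
static void model_complete(struct model_desc *d)
{
	d->done = 1;
}

/* Driver clean-up pass: recycle only descriptors that are both done
 * and acked.  Returns the number of descriptors recycled. */
static int model_cleanup(struct model_pool *pool)
{
	int i, recycled = 0;

	for (i = 0; i < MODEL_POOL_SIZE; i++) {
		struct model_desc *d = &pool->descs[i];

		if (d->in_use && d->done && d->acked) {
			d->in_use = 0;
			recycled++;
		}
	}
	return recycled;
}
```

A parent descriptor that completes before its child has been submitted
survives model_cleanup() until the application calls model_ack() on it.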

3.4 When does the operation execute?
Operations do not immediately issue after return from the
async_<operation> call.  Offload engine drivers batch operations to
improve performance by reducing the number of MMIO cycles needed to
manage the channel.  Once a driver-specific threshold is met the driver
automatically issues pending operations.  An application can force this
event by calling async_tx_issue_pending_all().  This operates on all
channels since the application has no knowledge of the
channel-to-operation mapping.
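
The batching behaviour can be sketched with a user-space model.  The
threshold of 4, the model_chan structure and the model_* functions are
assumptions made for illustration; real drivers choose their own
threshold and bookkeeping.

```c
#define MODEL_THRESHOLD 4	/* hypothetical driver-specific threshold */

struct model_chan {
	int pending;	/* submitted but not yet issued to hardware */
	int issued;	/* handed to the (simulated) hardware */
	int doorbells;	/* simulated MMIO writes used to kick the channel */
};

/* Issue everything pending on one channel with a single doorbell. */
static void model_issue_pending(struct model_chan *chan)
{
	if (!chan->pending)
		return;
	chan->issued += chan->pending;
	chan->pending = 0;
	chan->doorbells++;
}

/* Queue one operation; issue automatically once the threshold is met. */
static void model_submit(struct model_chan *chan)
{
	chan->pending++;
	if (chan->pending >= MODEL_THRESHOLD)
		model_issue_pending(chan);
}

/* Flush every channel, since the caller does not know which channel
 * each operation was mapped to. */
static void model_issue_pending_all(struct model_chan *chans, int nr)
{
	int i;

	for (i = 0; i < nr; i++)
		model_issue_pending(&chans[i]);
}
```

Five submissions to one channel cost a single doorbell for the first
four and leave the fifth pending until model_issue_pending_all() is
called.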

3.5 When does the operation complete?
There are two methods for an application to learn about the completion
of an operation.
1/ Call dma_wait_for_async_tx().  This call causes the CPU to spin while
   it polls for the completion of the operation.  It handles dependency
   chains and issuing pending operations.
2/ Specify a completion callback.  The callback routine runs in tasklet
   context if the offload engine driver supports interrupts, or it is
   called in application context if the operation is carried out
   synchronously in software.  The callback can be set in the call to
   async_<operation>, or when the application needs to submit a chain of
   unknown length it can use the async_trigger_callback() routine to set a
   completion interrupt/callback at the end of the chain.
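
The two completion methods can be contrasted in a small user-space
model.  It is a sketch only: the chain is a plain array, the hardware is
simulated by model_hw_step() completing one operation per call, and all
model_* names are hypothetical.  Tasklet context and interrupts are not
modelled.

```c
#include <stddef.h>

typedef void (*model_callback)(void *param);

struct model_op {
	int done;
	model_callback cb;	/* optional completion callback */
	void *cb_param;
};

/* Example callback: sets the int that param points to. */
static void model_set_flag(void *param)
{
	*(int *)param = 1;
}

/* One unit of simulated hardware progress: complete the oldest
 * unfinished operation in the chain and run its callback.  Returns 0
 * when nothing was left to do. */
static int model_hw_step(struct model_op *chain, int nr)
{
	int i;

	for (i = 0; i < nr; i++) {
		if (chain[i].done)
			continue;
		chain[i].done = 1;
		if (chain[i].cb)
			chain[i].cb(chain[i].cb_param);
		return 1;
	}
	return 0;
}

/* Method 1: spin until chain[idx] has completed (idx must be < nr).
 * Returns the number of polls. */
static int model_wait(struct model_op *chain, int nr, int idx)
{
	int polls = 0;

	while (!chain[idx].done) {
		model_hw_step(chain, nr);
		polls++;
	}
	return polls;
}
```

Waiting on the second operation of a three-entry chain spins for two
polls.  A callback attached only to the last entry, as one would do for
a chain of unknown length, does not run until the whole chain has
completed.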

3.6 Constraints:
1/ Calls to async_<operation> are not permitted in IRQ context.  Other
   contexts are permitted provided constraint #2 is not violated.
2/ Completion callback routines cannot submit new operations.  This
   results in recursion in the synchronous case and spinlocks being
   acquired twice in the asynchronous case.

3.7 Example:
Perform an xor->copy->xor operation where each operation depends on the
result from the previous operation:

void complete_xor_copy_xor(void *param)
{
	printk("complete\n");
}

void run_xor_copy_xor(struct page **xor_srcs,
		      int xor_src_cnt,
		      struct page *xor_dest,
		      size_t xor_len,
		      struct page *copy_src,
		      struct page *copy_dest,
		      size_t copy_len)
{
	struct dma_async_tx_descriptor *tx;

	/* first xor: no dependency and no callback */
	tx = async_xor(xor_dest, xor_srcs, 0, xor_src_cnt, xor_len,
		       ASYNC_TX_XOR_DROP_DST, NULL, NULL, NULL);
	/* copy depends on the first xor and acks that parent descriptor */
	tx = async_memcpy(copy_dest, copy_src, 0, 0, copy_len,
			  ASYNC_TX_DEP_ACK, tx, NULL, NULL);
	/* last xor: acks its parent and itself, then runs the callback */
	tx = async_xor(xor_dest, xor_srcs, 0, xor_src_cnt, xor_len,
		       ASYNC_TX_XOR_DROP_DST | ASYNC_TX_DEP_ACK | ASYNC_TX_ACK,
		       tx, complete_xor_copy_xor, NULL);

	/* force the queued operations to be issued on all channels */
	async_tx_issue_pending_all();
}

See include/linux/async_tx.h for more information on the flags.  See the
ops_run_* and ops_complete_* routines in drivers/md/raid5.c for more
implementation examples.

4 DRIVER DEVELOPMENT NOTES
4.1 Conformance points:
There are a few conformance points required in dmaengine drivers to
accommodate assumptions made by applications using the async_tx API:
1/ Completion callbacks are expected to happen in tasklet context
2/ dma_async_tx_descriptor fields are never manipulated in IRQ context
3/ Use async_tx_run_dependencies() in the descriptor clean up path to
   handle submission of dependent operations
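
The idea behind conformance point 3/ can be sketched in user space: when
the clean up path retires a completed descriptor, it also submits the
operation that was waiting on it.  The structure and the
model_run_dependencies() and model_cleanup() helpers below are
hypothetical and only mirror the idea; they say nothing about the
internals of async_tx_run_dependencies().

```c
#include <stddef.h>

struct model_desc {
	struct model_desc *child;	/* operation waiting on this one */
	int submitted;			/* handed to a channel */
	int done;			/* completed */
};

/* Submit the dependent operation once its parent has completed. */
static void model_run_dependencies(struct model_desc *d)
{
	if (d->done && d->child && !d->child->submitted)
		d->child->submitted = 1;
}

/* Clean up path for one descriptor: retire it if it was submitted and
 * is not yet done, then hand off its dependent. */
static void model_cleanup(struct model_desc *d)
{
	if (!d->submitted || d->done)
		return;
	d->done = 1;
	model_run_dependencies(d);
}
```

For an xor->copy->xor chain only the first xor starts out submitted.
The copy is submitted when the first xor is cleaned up, and the final
xor is submitted when the copy is cleaned up.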

4.2 "My application needs finer control of hardware channels"
This requirement seems to arise from cases where a DMA engine driver is
trying to support device-to-memory DMA.  The dmaengine and async_tx
implementations were designed for offloading memory-to-memory
operations; however, there are some capabilities of the dmaengine layer
that can be used for platform-specific channel management.
Platform-specific constraints can be handled by registering the
application as a 'dma_client' and implementing a 'dma_event_callback' to
apply a filter to the available channels in the system.  Before showing
how to implement a custom dma_event_callback some background on
dmaengine's client support is required.

The following routines in dmaengine support multiple clients requesting
use of a channel:
- dma_async_client_register(struct dma_client *client)
- dma_async_client_chan_request(struct dma_client *client)

dma_async_client_register takes a pointer to an initialized dma_client
structure.  It expects that the 'event_callback' and 'cap_mask' fields
are already initialized.

dma_async_client_chan_request triggers dmaengine to notify the client of
all channels that satisfy the capability mask.  It is up to the client's
event_callback routine to track how many channels the client needs and
how many it is currently using.  The dma_event_callback routine returns a
dma_state_client code to let dmaengine know the status of the
allocation.

Below is an example of how to extend this functionality for
platform-specific filtering of the available channels beyond the
standard capability mask:

static enum dma_state_client
my_dma_client_callback(struct dma_client *client,
			struct dma_chan *chan, enum dma_state state)
{
	struct dma_device *dma_dev;
	struct my_platform_specific_dma *plat_dma_dev;

	/* recover the platform structure that embeds this dma_device */
	dma_dev = chan->device;
	plat_dma_dev = container_of(dma_dev,
				    struct my_platform_specific_dma,
				    dma_dev);

	/* decline channels that lack the platform-specific capability */
	if (!plat_dma_dev->platform_specific_capability)
		return DMA_DUP;

	. . .
}

5 SOURCE
include/linux/dmaengine.h: core header file for DMA drivers and clients
drivers/dma/dmaengine.c: offload engine channel management routines
drivers/dma/: location for offload engine drivers
include/linux/async_tx.h: core header file for the async_tx api
crypto/async_tx/async_tx.c: async_tx interface to dmaengine and common code
crypto/async_tx/async_memcpy.c: copy offload
crypto/async_tx/async_memset.c: memory fill offload
crypto/async_tx/async_xor.c: xor and xor zero sum offload

Information is copyright its respective author. All material is available from the Linux Kernel Source distributed under a GPL License. This page is provided as a free service by mjmwired.net.