
Documentation / scsi / st.txt


Based on kernel version 4.1. Page generated on 2015-06-28 12:14 EST.

1	This file contains brief information about the SCSI tape driver.
2	The driver is currently maintained by Kai Mäkisara (email
3	Kai.Makisara@kolumbus.fi)
5	Last modified: Sun Aug 29 18:25:47 2010 by kai.makisara
10	The driver is generic, i.e., it does not contain any code tailored
11	to any specific tape drive. The tape parameters can be specified with
12	one of the following three methods:
1. Each user can specify the tape parameters he/she wants to use
directly with ioctls. This is administratively a very simple and
flexible method and applicable to single-user workstations. However,
in a multiuser environment the next user finds the tape parameters in
the state the previous user left them.
2. The system manager (root) can define default values for some tape
parameters, like block size and density, using the MTSETDRVBUFFER ioctl.
These parameters can be programmed to come into effect either (a) when a
new tape is loaded into the drive or (b) if writing begins at the
beginning of the tape. Method (b) is applicable if the tape
drive performs auto-detection of the tape format well (like some
QIC-drives). The result is that any tape can be read, writing can be
continued using the existing format, and the default format is used if
the tape is rewritten from the beginning (or a new tape is written
for the first time). Method (a) is applicable if the drive
does not perform auto-detection well enough and there is a single
"sensible" mode for the device. An example is a DAT drive that is
used only in variable block mode (I don't know if this is sensible
or not :-).
35	The user can override the parameters defined by the system
36	manager. The changes persist until the defaults again come into
37	effect.
39	3. By default, up to four modes can be defined and selected using the minor
40	number (bits 5 and 6). The number of modes can be changed by changing
41	ST_NBR_MODE_BITS in st.h. Mode 0 corresponds to the defaults discussed
42	above. Additional modes are dormant until they are defined by the
43	system manager (root). When specification of a new mode is started,
44	the configuration of mode 0 is used to provide a starting point for
45	definition of the new mode.
47	Using the modes allows the system manager to give the users choices
48	over some of the buffering parameters not directly accessible to the
49	users (buffered and asynchronous writes). The modes also allow choices
50	between formats in multi-tape operations (the explicitly overridden
51	parameters are reset when a new tape is loaded).
53	If more than one mode is used, all modes should contain definitions
54	for the same set of parameters.
56	Many Unices contain internal tables that associate different modes to
57	supported devices. The Linux SCSI tape driver does not contain such
58	tables (and will not do that in future). Instead of that, a utility
59	program can be made that fetches the inquiry data sent by the device,
60	scans its database, and sets up the modes using the ioctls. Another
61	alternative is to make a small script that uses mt to set the defaults
62	tailored to the system.
64	The driver supports fixed and variable block size (within buffer
65	limits). Both the auto-rewind (minor equals device number) and
66	non-rewind devices (minor is 128 + device number) are implemented.
68	In variable block mode, the byte count in write() determines the size
69	of the physical block on tape. When reading, the drive reads the next
70	tape block and returns to the user the data if the read() byte count
71	is at least the block size. Otherwise, error ENOMEM is returned.
73	In fixed block mode, the data transfer between the drive and the
74	driver is in multiples of the block size. The write() byte count must
75	be a multiple of the block size. This is not required when reading but
76	may be advisable for portability.
Support is provided for changing the tape partition and for partitioning
the tape with one or two partitions. By default, support for
partitioned tapes is disabled for each drive and it can be enabled
with the ioctl MTSETDRVBUFFER.
83	By default the driver writes one filemark when the device is closed after
84	writing and the last operation has been a write. Two filemarks can be
85	optionally written. In both cases end of data is signified by
86	returning zero bytes for two consecutive reads.
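
As a rough illustration of the end-of-data rule above, the following
Python sketch groups blocks into files and stops at two consecutive
zero-byte reads. The name split_tape_files and the read_block callable
are hypothetical, introduced only for this example. With a real drive
the callable would typically wrap os.read() on a non-rewind device node
(for instance /dev/nst0, which is an assumption about the local setup).

```python
from typing import Callable, List


def split_tape_files(read_block: Callable[[], bytes]) -> List[List[bytes]]:
    """Group blocks into files; stop at two consecutive zero-byte reads."""
    files: List[List[bytes]] = []
    current: List[bytes] = []
    previous_was_empty = False
    while True:
        block = read_block()
        if block:
            current.append(block)
            previous_was_empty = False
            continue
        if previous_was_empty:
            break                      # second empty read in a row: end of data
        files.append(current)          # first empty read: a filemark was crossed
        current = []
        previous_was_empty = True
    return files
```

A caller could pass something like lambda: os.read(fd, 65536), where the
byte count is chosen to be at least the largest block on the tape.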
Writing filemarks without the immediate bit set in the SCSI command block acts
as a synchronization point, i.e., all remaining data from the drive buffers is
written to tape before the command returns. This makes sure that write errors
are caught at that point, but this takes time. In some applications, several
consecutive files must be written fast. The MTWEOFI operation can be used to
write the filemarks without flushing the drive buffer. Writing a filemark at
close() always flushes the drive buffers. However, if the previous
operation is MTWEOFI, close() does not write a filemark. This can be used if
the program wants to close/open the tape device between files and wants to
skip waiting.
If rewind, offline, bsf, or seek is done and the previous tape operation was
a write, a filemark is written before moving the tape.
102	The compile options are defined in the file linux/drivers/scsi/st_options.h.
If the open option O_NONBLOCK is used, open succeeds even if the
drive is not ready. If O_NONBLOCK is not used, the driver waits for
the drive to become ready. If this does not happen in ST_BLOCK_SECONDS
seconds, open fails with the errno value EIO. With O_NONBLOCK the
device can be opened for writing even if there is a write protected
tape in the drive (commands trying to write something return an error if
attempted).
115	The tape driver currently supports up to 2^17 drives if 4 modes for
116	each drive are used.
118	The minor numbers consist of the following bit fields:
    dev_upper   non-rew   mode   dev-lower
     19 - 8        7      6 5     4 - 0
122	The non-rewind bit is always bit 7 (the uppermost bit in the lowermost
123	byte). The bits defining the mode are below the non-rewind bit. The
124	remaining bits define the tape device number. This numbering is
125	backward compatible with the numbering used when the minor number was
126	only 8 bits wide.
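
The bit layout above can be sketched in Python as follows. This is only
an illustration assuming the default of two mode bits. The names
TapeMinor, decode_minor and encode_minor are hypothetical helpers, not
part of the driver.

```python
from typing import NamedTuple

ST_NBR_MODE_BITS = 2                      # default: four modes
MODE_SHIFT = 7 - ST_NBR_MODE_BITS         # mode bits sit just below bit 7
MODE_MASK = ((1 << ST_NBR_MODE_BITS) - 1) << MODE_SHIFT
LOWER_MASK = (1 << MODE_SHIFT) - 1        # low part of the device number


class TapeMinor(NamedTuple):
    device: int
    mode: int
    non_rewind: bool


def decode_minor(minor: int) -> TapeMinor:
    """Split a minor number into device number, mode and non-rewind flag."""
    lower = minor & LOWER_MASK
    upper = (minor >> 8) << MODE_SHIFT    # bits above the low byte
    mode = (minor & MODE_MASK) >> MODE_SHIFT
    return TapeMinor(upper | lower, mode, bool(minor & 0x80))


def encode_minor(device: int, mode: int = 0, non_rewind: bool = False) -> int:
    """Inverse of decode_minor()."""
    upper = (device >> MODE_SHIFT) << 8
    flag = 0x80 if non_rewind else 0
    return upper | flag | (mode << MODE_SHIFT) | (device & LOWER_MASK)
```

With this layout, device 32 is the first one whose number spills into
the bits above the low byte, which is what keeps the numbering
compatible with the old 8-bit minors.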
131	The driver creates the directory /sys/class/scsi_tape and populates it with
132	directories corresponding to the existing tape devices. There are autorewind
133	and non-rewind entries for each mode. The names are stxy and nstxy, where x
134	is the tape number and y a character corresponding to the mode (none, l, m,
135	a). For example, the directories for the first tape device are (assuming four
136	modes): st0  nst0  st0l  nst0l  st0m  nst0m  st0a  nst0a.
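
The naming rule can be written down as a small Python sketch. The
function name class_directory_names is hypothetical and the mode
suffixes are taken from the paragraph above.

```python
from typing import List

MODE_SUFFIXES = ("", "l", "m", "a")       # mode 0 has no suffix


def class_directory_names(tape_number: int, modes: int = 4) -> List[str]:
    """List the auto-rewind and non-rewind names for each mode of one tape."""
    names: List[str] = []
    for suffix in MODE_SUFFIXES[:modes]:
        names.append(f"st{tape_number}{suffix}")
        names.append(f"nst{tape_number}{suffix}")
    return names
```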
138	Each directory contains the entries: default_blksize  default_compression
139	default_density  defined  dev  device  driver. The file 'defined' contains 1
140	if the mode is defined and zero if not defined. The files 'default_*' contain
141	the defaults set by the user. The value -1 means the default is not set. The
142	file 'dev' contains the device numbers corresponding to this device. The links
143	'device' and 'driver' point to the SCSI device and driver entries.
145	Each directory also contains the entry 'options' which shows the currently
146	enabled driver and mode options. The value in the file is a bit mask where the
147	bit definitions are the same as those used with MTSETDRVBUFFER in setting the
148	options.
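
A minimal Python sketch for interpreting the 'options' bit mask is
shown below. The bit values are my reading of linux/mtio.h and should
be checked against the headers of the kernel in use. Note that the
header spells the fast EOM flag MT_ST_FAST_MTEOM. The helper names
decode_options and read_options are hypothetical, and read_options
assumes the file holds a single number that int(text, 0) can parse.

```python
from typing import List

# Assumed values, believed to match linux/mtio.h; verify before relying on them.
MT_ST_OPTION_BITS = {
    0x0001: "MT_ST_BUFFER_WRITES",
    0x0002: "MT_ST_ASYNC_WRITES",
    0x0004: "MT_ST_READ_AHEAD",
    0x0008: "MT_ST_DEBUGGING",
    0x0010: "MT_ST_TWO_FM",
    0x0020: "MT_ST_FAST_MTEOM",
    0x0040: "MT_ST_AUTO_LOCK",
    0x0080: "MT_ST_DEF_WRITES",
    0x0100: "MT_ST_CAN_BSR",
    0x0200: "MT_ST_NO_BLKLIMS",
    0x0400: "MT_ST_CAN_PARTITIONS",
    0x0800: "MT_ST_SCSI2LOGICAL",
    0x1000: "MT_ST_SYSV",
    0x2000: "MT_ST_NOWAIT",
    0x4000: "MT_ST_SILI",
    0x8000: "MT_ST_NOWAIT_EOF",
}


def decode_options(mask: int) -> List[str]:
    """Return the names of the option bits set in mask, lowest bit first."""
    return [name for bit, name in MT_ST_OPTION_BITS.items() if mask & bit]


def read_options(name: str = "st0") -> List[str]:
    """Read and decode /sys/class/scsi_tape/<name>/options (needs a tape device)."""
    with open(f"/sys/class/scsi_tape/{name}/options") as handle:
        return decode_options(int(handle.read().strip(), 0))
```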
150	A link named 'tape' is made from the SCSI device directory to the class
151	directory corresponding to the mode 0 auto-rewind device (e.g., st0). 
The user can choose between BSD and System V behaviour of the tape driver by
defining the value of the symbol ST_SYSV. The semantics differ when a
file being read is closed. The BSD semantics leave the tape where it
currently is, whereas the SYS V semantics move the tape past the next
filemark unless the filemark has just been crossed.
162	The default is BSD semantics.
167	The driver tries to do transfers directly to/from user space. If this
168	is not possible, a driver buffer allocated at run-time is used. If
169	direct i/o is not possible for the whole transfer, the driver buffer
170	is used (i.e., bounce buffers for individual pages are not
171	used). Direct i/o can be impossible because of several reasons, e.g.:
172	- one or more pages are at addresses not reachable by the HBA
173	- the number of pages in the transfer exceeds the number of
174	  scatter/gather segments permitted by the HBA
175	- one or more pages can't be locked into memory (should not happen in
176	  any reasonable situation)
178	The size of the driver buffers is always at least one tape block. In fixed
179	block mode, the minimum buffer size is defined (in 1024 byte units) by
180	ST_FIXED_BUFFER_BLOCKS. With small block size this allows buffering of
181	several blocks and using one SCSI read or write to transfer all of the
182	blocks. Buffering of data across write calls in fixed block mode is
183	allowed if ST_BUFFER_WRITES is non-zero and direct i/o is not used.
184	Buffer allocation uses chunks of memory having sizes 2^n * (page
185	size). Because of this the actual buffer size may be larger than the
186	minimum allowable buffer size.
188	NOTE that if direct i/o is used, the small writes are not buffered. This may
189	cause a surprise when moving from 2.4. There small writes (e.g., tar without
190	-b option) may have had good throughput but this is not true any more with
191	2.6. Direct i/o can be turned off to solve this problem but a better solution
192	is to use bigger write() byte counts (e.g., tar -b 64).
194	Asynchronous writing. Writing the buffer contents to the tape is
195	started and the write call returns immediately. The status is checked
196	at the next tape operation. Asynchronous writes are not done with
197	direct i/o and not in fixed block mode.
199	Buffered writes and asynchronous writes may in some rare cases cause
200	problems in multivolume operations if there is not enough space on the
201	tape after the early-warning mark to flush the driver buffer.
203	Read ahead for fixed block mode (ST_READ_AHEAD). Filling the buffer is
204	attempted even if the user does not want to get all of the data at
205	this read command. Should be disabled for those drives that don't like
206	a filemark to truncate a read request or that don't like backspacing.
Scatter/gather buffers (buffers that consist of chunks non-contiguous
in the physical memory) are used if contiguous buffers can't be
allocated. To support all SCSI adapters (including those not
supporting scatter/gather), buffer allocation uses the following
three kinds of chunks:
1. The initial segment that is used for all SCSI adapters including
those not supporting scatter/gather. The size of this buffer will be
(PAGE_SIZE << ST_FIRST_ORDER) bytes if the system can give a chunk of
this size (and it is not larger than the buffer size specified by
ST_BUFFER_BLOCKS). If this size is not available, the driver halves
the size and tries again until the size reaches one page. The default
settings in st_options.h make the driver try to allocate all of the
buffer as one chunk.
221	2. The scatter/gather segments to fill the specified buffer size are
222	allocated so that as many segments as possible are used but the number
223	of segments does not exceed ST_FIRST_SG.
224	3. The remaining segments between ST_MAX_SG (or the module parameter
225	max_sg_segs) and the number of segments used in phases 1 and 2
226	are used to extend the buffer at run-time if this is necessary. The
227	number of scatter/gather segments allowed for the SCSI adapter is not
228	exceeded if it is smaller than the maximum number of scatter/gather
229	segments specified. If the maximum number allowed for the SCSI adapter
230	is smaller than the number of segments used in phases 1 and 2,
231	extending the buffer will always fail.
236	When the end of medium early warning is encountered, the current write
237	is finished and the number of bytes is returned. The next write
238	returns -1 and errno is set to ENOSPC. To enable writing a trailer,
239	the next write is allowed to proceed and, if successful, the number of
240	bytes is returned. After this, -1 and the number of bytes are
241	alternately returned until the physical end of medium (or some other
242	error) is encountered.
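
One way an application might use this behaviour is sketched below in
Python: data blocks are written until ENOSPC is reported, and the one
write that is then allowed is used for a trailer. The function name
write_until_early_warning and the write callable are hypothetical. With
a real drive the callable would wrap os.write() on the tape device, and
os.write() reports the -1/ENOSPC result as an OSError.

```python
import errno
from typing import Callable, Iterable, Tuple


def write_until_early_warning(write: Callable[[bytes], int],
                              blocks: Iterable[bytes],
                              trailer: bytes) -> Tuple[int, bool]:
    """Write blocks; on ENOSPC write the trailer and stop.

    Returns (number of data blocks accepted, whether ENOSPC was seen).
    """
    accepted = 0
    for block in blocks:
        try:
            write(block)
        except OSError as exc:
            if exc.errno != errno.ENOSPC:
                raise                   # some other error: let the caller see it
            write(trailer)              # the write after ENOSPC may proceed
            return accepted, True
        accepted += 1
    return accepted, False
```

The block that triggered ENOSPC was not written, so a multivolume
program would start the next tape with that block.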
247	The buffer size, write threshold, and the maximum number of allocated buffers
248	are configurable when the driver is loaded as a module. The keywords are:
250	buffer_kbs=xxx             the buffer size for fixed block mode is set
251				   to xxx kilobytes
252	write_threshold_kbs=xxx    the write threshold in kilobytes set to xxx
253	max_sg_segs=xxx		   the maximum number of scatter/gather
254				   segments
255	try_direct_io=x		   try direct transfer between user buffer and
256				   tape drive if this is non-zero
258	Note that if the buffer size is changed but the write threshold is not
259	set, the write threshold is set to the new buffer size - 2 kB.
264	If the driver is compiled into the kernel, the same parameters can be
265	also set using, e.g., the LILO command line. The preferred syntax is
266	to use the same keyword used when loading as module but prepended
267	with 'st.'. For instance, to set the maximum number of scatter/gather
268	segments, the parameter 'st.max_sg_segs=xx' should be used (xx is the
269	number of scatter/gather segments).
271	For compatibility, the old syntax from early 2.5 and 2.4 kernel
272	versions is supported. The same keywords can be used as when loading
273	the driver as module. If several parameters are set, the keyword-value
274	pairs are separated with a comma (no spaces allowed). A colon can be
275	used instead of the equal mark. The definition is prepended by the
276	string st=. Here is an example:
278		st=buffer_kbs:64,write_threshold_kbs:60
280	The following syntax used by the old kernel versions is also supported:
282	           st=aa[,bb[,dd]]
284	where
285	  aa is the buffer size for fixed block mode in 1024 byte units
286	  bb is the write threshold in 1024 byte units
287	  dd is the maximum number of scatter/gather segments
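
To make the two 'st=' syntaxes concrete, here is a small Python sketch
that turns either form into keyword/value pairs. It is not the kernel's
parser. The function name parse_st_option is hypothetical, and the
sketch assumes plain decimal numbers.

```python
from typing import Dict

# Order of the values in the old positional form st=aa[,bb[,dd]]
POSITIONAL_KEYS = ("buffer_kbs", "write_threshold_kbs", "max_sg_segs")


def parse_st_option(option: str) -> Dict[str, int]:
    """Parse 'st=key:value,...' or 'st=aa[,bb[,dd]]' into a dictionary."""
    if not option.startswith("st="):
        raise ValueError("expected an option of the form st=...")
    settings: Dict[str, int] = {}
    for index, item in enumerate(option[3:].split(",")):
        # A colon may be used instead of the equal mark.
        key, separator, number = item.replace(":", "=", 1).partition("=")
        if not separator:
            # No keyword present: treat the item as a positional value.
            if index >= len(POSITIONAL_KEYS):
                raise ValueError("too many positional values")
            key, number = POSITIONAL_KEYS[index], item
        settings[key] = int(number)
    return settings
```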
The tape is positioned and the drive parameters are set with ioctls
defined in mtio.h. The tape control program 'mt' uses these ioctls. Try
to find an mt that supports all of the Linux SCSI tape ioctls and
opens the device for writing if the tape contents will be modified
(look for a package mt-st* from the Linux ftp sites; the GNU mt does
not open for writing for, e.g., erase).
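
For programs that do not go through mt, the ioctls can also be issued
directly. The Python sketch below shows the general shape under several
assumptions that should be verified locally: the operation codes and
the request number MTIOCTOP are copied by hand from my reading of
linux/mtio.h, the request number is the value produced by the generic
ioctl encoding (as on x86 and ARM), and struct mtop is taken to be a
short followed by an int. The helper names pack_mtop and tape_op are
hypothetical.

```python
import fcntl
import struct

# Assumed values from linux/mtio.h; check them against your headers.
MTIOCTOP = 0x40086D01     # _IOW('m', 1, struct mtop), generic ioctl encoding
MTFSF, MTBSF, MTWEOF, MTREW, MTOFFL, MTEOM = 1, 2, 5, 6, 7, 12

MTOP_FORMAT = "hi"        # struct mtop { short mt_op; int mt_count; }


def pack_mtop(op: int, count: int = 1) -> bytes:
    """Build the struct mtop argument using native alignment."""
    return struct.pack(MTOP_FORMAT, op, count)


def tape_op(fd: int, op: int, count: int = 1) -> None:
    """Issue one mtop operation on an open tape file descriptor."""
    fcntl.ioctl(fd, MTIOCTOP, pack_mtop(op, count))
```

A typical call would be tape_op(fd, MTFSF, 2) on a descriptor obtained
with os.open() on a non-rewind device node. Errors from the driver
surface as OSError.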
299	The supported ioctls are:
301	The following use the structure mtop:
303	MTFSF   Space forward over count filemarks. Tape positioned after filemark.
304	MTFSFM  As above but tape positioned before filemark.
305	MTBSF	Space backward over count filemarks. Tape positioned before
306	        filemark.
MTBSFM  As above but tape positioned after filemark.
308	MTFSR   Space forward over count records.
309	MTBSR   Space backward over count records.
310	MTFSS   Space forward over count setmarks.
311	MTBSS   Space backward over count setmarks.
312	MTWEOF  Write count filemarks.
313	MTWEOFI	Write count filemarks with immediate bit set (i.e., does not
314		wait until data is on tape)
315	MTWSM   Write count setmarks.
316	MTREW   Rewind tape.
317	MTOFFL  Set device off line (often rewind plus eject).
318	MTNOP   Do nothing except flush the buffers.
319	MTRETEN Re-tension tape.
320	MTEOM   Space to end of recorded data.
321	MTERASE Erase tape. If the argument is zero, the short erase command
322		is used. The long erase command is used with all other values
323		of the argument.
324	MTSEEK	Seek to tape block count. Uses Tandberg-compatible seek (QFA)
325	        for SCSI-1 drives and SCSI-2 seek for SCSI-2 drives. The file and
326		block numbers in the status are not valid after a seek.
327	MTSETBLK Set the drive block size. Setting to zero sets the drive into
328	        variable block mode (if applicable).
329	MTSETDENSITY Sets the drive density code to arg. See drive
330	        documentation for available codes.
331	MTLOCK and MTUNLOCK Explicitly lock/unlock the tape drive door.
MTLOAD and MTUNLOAD Explicitly load and unload the tape. If the
        command argument x is between MT_ST_HPLOADER_OFFSET + 1 and
        MT_ST_HPLOADER_OFFSET + 6, the number x is sent to the
        drive with the command and it selects the tape slot to use in
        the HP C1553A changer.
MTCOMPRESSION Sets compressing or uncompressing drive mode using the
        SCSI mode page 15. Note that some drives use other methods for
        control of compression. Some drives (like the Exabytes) use
        density codes for compression control. Some drives use another
        mode page but this page has not been implemented in the
        driver. Some drives without compression capability will accept
        any compression mode without error.
MTSETPART Moves the tape to the partition given by the argument at the
        next tape operation. The block at which the tape is positioned
        is the block where the tape was previously positioned in the
        new active partition unless the next tape operation is
        MTSEEK. In this case the tape is moved directly to the block
        specified by MTSEEK. MTSETPART is inactive unless
        MT_ST_CAN_PARTITIONS is set.
351	MTMKPART Formats the tape with one partition (argument zero) or two
352		partitions (the argument gives in megabytes the size of
353		partition 1 that is physically the first partition of the
354		tape). The drive has to support partitions with size specified
355		by the initiator. Inactive unless MT_ST_CAN_PARTITIONS set.
MTSETDRVBUFFER
        Is used for several purposes. The command is obtained from count
        with mask MT_ST_OPTIONS, the low order bits are used as argument.
        This command is only allowed for the superuser (root). The
        subcommands are:
361		0
362	           The drive buffer option is set to the argument. Zero means
363	           no buffering.
        MT_ST_BOOLEANS
           Sets the buffering options. The bits are the new states
           (enabled/disabled) of the following options (in
           parentheses is specified whether the option is global or
           can be specified differently for each mode):
369		     MT_ST_BUFFER_WRITES write buffering (mode)
370		     MT_ST_ASYNC_WRITES asynchronous writes (mode)
371	             MT_ST_READ_AHEAD  read ahead (mode)
372	             MT_ST_TWO_FM writing of two filemarks (global)
373		     MT_ST_FAST_EOM using the SCSI spacing to EOD (global)
374		     MT_ST_AUTO_LOCK automatic locking of the drive door (global)
375	             MT_ST_DEF_WRITES the defaults are meant only for writes (mode)
             MT_ST_CAN_BSR backspacing over more than one record can
                be used for repositioning the tape (global)
378		     MT_ST_NO_BLKLIMS the driver does not ask the block limits
379			from the drive (block size can be changed only to
380			variable) (global)
381		     MT_ST_CAN_PARTITIONS enables support for partitioned
382			tapes (global)
             MT_ST_SCSI2LOGICAL the logical block number is used in
                the MTSEEK and MTIOCPOS for SCSI-2 drives instead of
                the device dependent address. It is recommended to set
                this flag unless there are tapes using the device
                dependent address (from the old times) (global)
388		     MT_ST_SYSV sets the SYSV semantics (mode)
389		     MT_ST_NOWAIT enables immediate mode (i.e., don't wait for
390		        the command to finish) for some commands (e.g., rewind)
391		     MT_ST_NOWAIT_EOF enables immediate filemark mode (i.e. when
392		        writing a filemark, don't wait for it to complete). Please
393			see the BASICS note about MTWEOFI with respect to the
394			possible dangers of writing immediate filemarks.
395		     MT_ST_SILI enables setting the SILI bit in SCSI commands when
396			reading in variable block mode to enhance performance when
397			reading blocks shorter than the byte count; set this only
398			if you are sure that the drive supports SILI and the HBA
399			correctly returns transfer residuals
400		     MT_ST_DEBUGGING debugging (global; debugging must be
401			compiled into the driver)
        MT_ST_SETBOOLEANS
        MT_ST_CLEARBOOLEANS
           Sets or clears the option bits.
        MT_ST_WRITE_THRESHOLD
           Sets the write threshold for this device to kilobytes
           specified by the lowest bits.
        MT_ST_DEF_BLKSIZE
           Defines the default block size set automatically. Value
           0xffffff means that the default is not used any more.
        MT_ST_DEF_DENSITY
        MT_ST_DEF_DRVBUFFER
           Used to set or clear the density (8 bits), and drive buffer
           state (3 bits). If the value is MT_ST_CLEAR_DEFAULT
           (0xfffff) the default will not be used any more. Otherwise
           the lowermost bits of the value contain the new value of
           the parameter.
        MT_ST_DEF_COMPRESSION
           The compression default will not be used if the value of
           the lowermost byte is 0xff. Otherwise the lowermost bit
           contains the new default. If the bits 8-15 are set to a
           non-zero number, and this number is not 0xff, the number is
           used as the compression algorithm. The value
           MT_ST_CLEAR_DEFAULT can be used to clear the compression
           default.
        MT_ST_SET_TIMEOUT
           Set the normal timeout in seconds for this device. The
           default is 900 seconds (15 minutes). The timeout should be
           long enough for the retries done by the device while
           reading/writing.
        MT_ST_SET_LONG_TIMEOUT
           Set the long timeout that is used for operations that are
           known to take a long time. The default is 14000 seconds
           (3.9 hours). For erase this value is further multiplied by
           eight.
        MT_ST_SET_CLN
           Set the cleaning request interpretation parameters using
           the lowest 24 bits of the argument. The driver can set the
           generic status bit GMT_CLN if a cleaning request bit pattern
           is found from the extended sense data. Many drives set one or
           more bits in the extended sense data when the drive needs
           cleaning. The bits are device-dependent. The driver is
           given the number of the sense data byte (the lowest eight
           bits of the argument; must be >= 18 (values 1 - 17
           reserved) and <= the maximum requested sense data size),
           a mask to select the relevant bits (bits 8-15), and the
           bit pattern (bits 16-23). If the bit pattern is zero, one
           or more bits under the mask indicate cleaning request. If
           the pattern is non-zero, the pattern must match the masked
           sense data byte.

           (The cleaning bit is set if the additional sense code and
           qualifier 00h 17h are seen regardless of the setting of
           MT_ST_SET_CLN.)
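
The way a subcommand and its argument share the count field can be
sketched in Python as follows. The constant values are my reading of
linux/mtio.h and should be verified. The helper names are hypothetical.
cln_argument assumes the three fields occupy consecutive bytes counted
from bit 0 (sense byte number, then mask, then pattern).

```python
from typing import Tuple

# Assumed values from linux/mtio.h; verify against your headers.
MT_ST_OPTIONS = 0xF0000000        # mask selecting the subcommand
MT_ST_BOOLEANS = 0x10000000
MT_ST_SET_CLN = 0x80000000
MT_ST_BUFFER_WRITES = 0x1
MT_ST_ASYNC_WRITES = 0x2
MT_ST_READ_AHEAD = 0x4


def drvbuffer_count(subcommand: int, argument: int) -> int:
    """Combine a subcommand with its argument into the MTSETDRVBUFFER count."""
    if argument & MT_ST_OPTIONS:
        raise ValueError("argument overlaps the subcommand bits")
    return subcommand | argument


def split_drvbuffer_count(count: int) -> Tuple[int, int]:
    """Recover (subcommand, argument) from a count value."""
    return count & MT_ST_OPTIONS, count & ~MT_ST_OPTIONS & 0xFFFFFFFF


def cln_argument(sense_byte: int, mask: int, pattern: int = 0) -> int:
    """Pack the cleaning-request parameters into the low 24 bits."""
    if not 18 <= sense_byte <= 0xFF:
        raise ValueError("sense byte number must be at least 18")
    if not (0 <= mask <= 0xFF and 0 <= pattern <= 0xFF):
        raise ValueError("mask and pattern must each fit in one byte")
    return sense_byte | (mask << 8) | (pattern << 16)
```

For example, enabling buffered writes, asynchronous writes and read
ahead in one call would use drvbuffer_count(MT_ST_BOOLEANS,
MT_ST_BUFFER_WRITES | MT_ST_ASYNC_WRITES | MT_ST_READ_AHEAD) as the
count of an MTSETDRVBUFFER operation.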
456	The following ioctl uses the structure mtpos:
457	MTIOCPOS Reads the current position from the drive. Uses
458	        Tandberg-compatible QFA for SCSI-1 drives and the SCSI-2
459	        command for the SCSI-2 drives.
461	The following ioctl uses the structure mtget to return the status:
462	MTIOCGET Returns some status information.
463	        The file number and block number within file are returned. The
464	        block is -1 when it can't be determined (e.g., after MTBSF).
465	        The drive type is either MTISSCSI1 or MTISSCSI2.
466	        The number of recovered errors since the previous status call
467	        is stored in the lower word of the field mt_erreg.
468	        The current block size and the density code are stored in the field
469	        mt_dsreg (shifts for the subfields are MT_ST_BLKSIZE_SHIFT and
470	        MT_ST_DENSITY_SHIFT).
471		The GMT_xxx status bits reflect the drive status. GMT_DR_OPEN
472		is set if there is no tape in the drive. GMT_EOD means either
473		end of recorded data or end of tape. GMT_EOT means end of tape.
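
Extracting the subfields of the status can be sketched in Python as
below. The shift and mask values are my reading of linux/mtio.h
(assumed, please verify). The function names are hypothetical, and
"lower word" is taken to mean the low 16 bits.

```python
from typing import Tuple

# Assumed values from linux/mtio.h; verify against your headers.
MT_ST_BLKSIZE_SHIFT = 0
MT_ST_BLKSIZE_MASK = 0x00FFFFFF
MT_ST_DENSITY_SHIFT = 24
MT_ST_DENSITY_MASK = 0xFF000000


def decode_dsreg(mt_dsreg: int) -> Tuple[int, int]:
    """Return (block size, density code) from the mt_dsreg field."""
    block_size = (mt_dsreg & MT_ST_BLKSIZE_MASK) >> MT_ST_BLKSIZE_SHIFT
    density = (mt_dsreg & MT_ST_DENSITY_MASK) >> MT_ST_DENSITY_SHIFT
    return block_size, density


def recovered_errors(mt_erreg: int) -> int:
    """Return the recovered error count from the lower word of mt_erreg."""
    return mt_erreg & 0xFFFF
```

A block size of zero in the result corresponds to variable block mode.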
478	The recovered write errors are considered fatal if ST_RECOVERED_WRITE_FATAL
479	is defined.
481	The maximum number of tape devices is determined by the define
482	ST_MAX_TAPES. If more tapes are detected at driver initialization, the
483	maximum is adjusted accordingly.
485	Immediate return from tape positioning SCSI commands can be enabled by
486	defining ST_NOWAIT. If this is defined, the user should take care that
487	the next tape operation is not started before the previous one has
488	finished. The drives and SCSI adapters should handle this condition
489	gracefully, but some drive/adapter combinations are known to hang the
490	SCSI bus in this case.
492	The MTEOM command is by default implemented as spacing over 32767
493	filemarks. With this method the file number in the status is
494	correct. The user can request using direct spacing to EOD by setting
495	ST_FAST_EOM 1 (or using the MT_ST_OPTIONS ioctl). In this case the file
496	number will be invalid.
498	When using read ahead or buffered writes the position within the file
499	may not be correct after the file is closed (correct position may
500	require backspacing over more than one record). The correct position
501	within file can be obtained if ST_IN_FILE_POS is defined at compile
502	time or the MT_ST_CAN_BSR bit is set for the drive with an ioctl.
503	(The driver always backs over a filemark crossed by read ahead if the
504	user does not request data that far.)
Debugging code is now compiled in by default but debugging is turned off
with the kernel module parameter debug_flag defaulting to 0.  Debugging
can still be switched on and off with an ioctl.  To enable debug at
module load time add debug_flag=1 to the module load options. The
debugging output is not voluminous.
515	If the tape seems to hang, I would be very interested to hear where
516	the driver is waiting. With the command 'ps -l' you can see the state
517	of the process using the tape. If the state is D, the process is
518	waiting for something. The field WCHAN tells where the driver is
519	waiting. If you have the current System.map in the correct place (in
520	/boot for the procps I use) or have updated /etc/psdatabase (for kmem
521	ps), ps writes the function name in the WCHAN field. If not, you have
522	to look up the function from System.map.
524	Note also that the timeouts are very long compared to most other
525	drivers. This means that the Linux driver may appear hung although the
526	real reason is that the tape firmware has got confused.

Information is copyright its respective author. All material is available from the Linux Kernel Source distributed under a GPL License. This page is provided as a free service by mjmwired.net.