
Documentation / filesystems / caching / cachefiles.txt





Based on kernel version 3.15.4. Page generated on 2014-07-07 09:02 EST.

1		       ===============================================
2		       CacheFiles: CACHE ON ALREADY MOUNTED FILESYSTEM
3		       ===============================================
4	
5	Contents:
6	
7	 (*) Overview.
8	
9	 (*) Requirements.
10	
11	 (*) Configuration.
12	
13	 (*) Starting the cache.
14	
15	 (*) Things to avoid.
16	
17	 (*) Cache culling.
18	
19	 (*) Cache structure.
20	
21	 (*) Security model and SELinux.
22	
23	 (*) A note on security.
24	
25	 (*) Statistical information.
26	
27	 (*) Debugging.
28	
29	
30	========
31	OVERVIEW
32	========
33	
34	CacheFiles is a caching backend that uses a directory on an already mounted
35	filesystem of a local type (such as Ext3) as its cache.
36	
37	CacheFiles uses a userspace daemon to do some of the cache management - such as
38	reaping stale nodes and culling.  This is called cachefilesd and lives in
39	/sbin.
40	
41	The filesystem and data integrity of the cache are only as good as those of the
42	filesystem providing the backing services.  Note that CacheFiles does not
43	attempt to journal anything since the journalling interfaces of the various
44	filesystems are very specific in nature.
45	
46	CacheFiles creates a misc character device - "/dev/cachefiles" - that is used
47	to communicate with the daemon.  Only one thing may have this open at once,
48	and whilst it is open, a cache is at least partially in existence.  The daemon
49	opens this and sends commands down it to control the cache.
50	
51	CacheFiles is currently limited to a single cache.
52	
53	CacheFiles attempts to maintain at least a certain percentage of free space on
54	the filesystem, shrinking the cache by culling the objects it contains to make
55	space if necessary - see the "Cache Culling" section.  This means it can be
56	placed on the same medium as a live set of data, and will expand to make use of
57	spare space and automatically contract when the set of data requires more
58	space.
59	
60	
61	============
62	REQUIREMENTS
63	============
64	
65	The use of CacheFiles and its daemon requires the following features to be
66	available in the system and in the cache filesystem:
67	
68		- dnotify.
69	
70		- extended attributes (xattrs).
71	
72		- openat() and friends.
73	
74		- bmap() support on files in the filesystem (FIBMAP ioctl).
75	
76		- The use of bmap() to detect a partial page at the end of the file.
77	
78	It is strongly recommended that the "dir_index" option is enabled on Ext3
79	filesystems being used as a cache.
80	
81	
82	=============
83	CONFIGURATION
84	=============
85	
86	The cache is configured by a script in /etc/cachefilesd.conf.  Its commands
87	set up the cache ready for use.  The following script commands are available:
88	
89	 (*) brun <N>%
90	 (*) bcull <N>%
91	 (*) bstop <N>%
92	 (*) frun <N>%
93	 (*) fcull <N>%
94	 (*) fstop <N>%
95	
96		Configure the culling limits.  Optional.  See the section on culling.
97		The defaults are 7% (run), 5% (cull) and 1% (stop) respectively.
98	
99		The commands beginning with a 'b' are file space (block) limits, those
100		beginning with an 'f' are file count limits.
101	
102	 (*) dir <path>
103	
104		Specify the directory containing the root of the cache.  Mandatory.
105	
106	 (*) tag <name>
107	
108		Specify a tag to FS-Cache to use in distinguishing multiple caches.
109		Optional.  The default is "CacheFiles".
110	
111	 (*) debug <mask>
112	
113		Specify a numeric bitmask to control debugging in the kernel module.
114		Optional.  The default is zero (all off).  The following values can be
115		OR'd into the mask to collect various information:
116	
117			1	Turn on trace of function entry (_enter() macros)
118			2	Turn on trace of function exit (_leave() macros)
119			4	Turn on trace of internal debug points (_debug())
120	
121		This mask can also be set through sysfs, eg:
122	
123			echo 5 >/sys/module/cachefiles/parameters/debug
124	
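As an illustration of the commands above, the following Python sketch parses
configuration text into a dictionary and fills in the documented defaults.
The name parse_cachefilesd_conf is hypothetical, and skipping blank lines and
lines starting with '#' is an assumption about the file syntax rather than
something stated in this document.

```python
# Hypothetical helper, not part of cachefilesd: parse cachefilesd.conf-style
# text into a dict, applying the defaults documented above.
DEFAULTS = {
    "brun": 7, "bcull": 5, "bstop": 1,
    "frun": 7, "fcull": 5, "fstop": 1,
    "tag": "CacheFiles",
    "debug": 0,
}
LIMIT_COMMANDS = ("brun", "bcull", "bstop", "frun", "fcull", "fstop")


def parse_cachefilesd_conf(text):
    conf = dict(DEFAULTS)
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and '#' comment lines are ignored.
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        command = parts[0]
        arg = parts[1].strip() if len(parts) > 1 else ""
        if command in LIMIT_COMMANDS:
            if not arg.endswith("%"):
                raise ValueError("%s expects <N>%%, got %r" % (command, arg))
            conf[command] = int(arg[:-1])
        elif command in ("dir", "tag"):
            conf[command] = arg
        elif command == "debug":
            conf[command] = int(arg, 0)  # accepts decimal or 0x-prefixed hex
        else:
            raise ValueError("unknown command: %r" % command)
    if "dir" not in conf:
        raise ValueError("'dir' is mandatory")
    return conf
```

For example, parsing the two lines "dir /var/fscache" and "brun 10%" yields a
dictionary in which brun is 10 and every other limit keeps its default.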
125	
126	==================
127	STARTING THE CACHE
128	==================
129	
130	The cache is started by running the daemon.  The daemon opens the cache device,
131	configures the cache and tells it to begin caching.  At that point the cache
132	binds to fscache and the cache becomes live.
133	
134	The daemon is run as follows:
135	
136		/sbin/cachefilesd [-d]* [-s] [-n] [-f <configfile>]
137	
138	The flags are:
139	
140	 (*) -d
141	
142		Increase the debugging level.  This can be specified multiple times and
143		is cumulative with itself.
144	
145	 (*) -s
146	
147		Send messages to stderr instead of syslog.
148	
149	 (*) -n
150	
151		Don't daemonise and go into background.
152	
153	 (*) -f <configfile>
154	
155		Use an alternative configuration file rather than the default one.
156	
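The flag combinations above can be assembled programmatically.  The sketch
below only builds an argument list; the function name cachefilesd_argv and its
parameter names are hypothetical, and nothing here actually starts the daemon.

```python
def cachefilesd_argv(debug_level=0, to_stderr=False, foreground=False,
                     config=None):
    """Build the command line described above as a list of arguments."""
    argv = ["/sbin/cachefilesd"]
    argv += ["-d"] * debug_level      # -d is cumulative, so repeat it
    if to_stderr:
        argv.append("-s")             # messages to stderr instead of syslog
    if foreground:
        argv.append("-n")             # don't daemonise
    if config is not None:
        argv += ["-f", config]        # alternative configuration file
    return argv
```

The resulting list could then be handed to something like subprocess.run() by
a suitably privileged caller.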
157	
158	===============
159	THINGS TO AVOID
160	===============
161	
162	Do not mount other things within the cache as this will cause problems.  The
163	kernel module contains its own very cut-down path walking facility that ignores
164	mountpoints, but the daemon can't avoid them.
165	
166	Do not create, rename or unlink files and directories in the cache whilst the
167	cache is active, as this may cause the state to become uncertain.
168	
169	Renaming files in the cache might make objects appear to be other objects (the
170	filename is part of the lookup key).
171	
172	Do not change or remove the extended attributes attached to cache files by the
173	cache as this will cause the cache state management to get confused.
174	
175	Do not create files or directories in the cache, lest the cache get confused or
176	serve incorrect data.
177	
178	Do not chmod files in the cache.  The module creates things with minimal
179	permissions to prevent random users being able to access them directly.
180	
181	
182	=============
183	CACHE CULLING
184	=============
185	
186	The cache may need culling occasionally to make space.  This involves
187	discarding objects from the cache that have been used less recently than
188	anything else.  Culling is based on the access time of data objects.  Empty
189	directories are culled if not in use.
190	
191	Cache culling is done on the basis of the percentage of blocks and the
192	percentage of files available in the underlying filesystem.  There are six
193	"limits":
194	
195	 (*) brun
196	 (*) frun
197	
198	     If the amount of free space and the number of available files in the cache
199	     rises above both these limits, then culling is turned off.
200	
201	 (*) bcull
202	 (*) fcull
203	
204	     If the amount of available space or the number of available files in the
205	     cache falls below either of these limits, then culling is started.
206	
207	 (*) bstop
208	 (*) fstop
209	
210	     If the amount of available space or the number of available files in the
211	     cache falls below either of these limits, then no further allocation of
212	     disk space or files is permitted until culling has raised things above
213	     these limits again.
214	
215	These must be configured as follows:
216	
217		0 <= bstop < bcull < brun < 100
218		0 <= fstop < fcull < frun < 100
219	
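The interaction of the six limits can be sketched as a small state machine.
This is a simplified model of the rules listed above, not the kernel module's
code; the class name CullState and its method names are hypothetical, and the
inputs are assumed to be percentages of available blocks and files.

```python
from dataclasses import dataclass


@dataclass
class CullState:
    brun: float = 7.0
    bcull: float = 5.0
    bstop: float = 1.0
    frun: float = 7.0
    fcull: float = 5.0
    fstop: float = 1.0
    culling: bool = False

    def __post_init__(self):
        # Enforce 0 <= stop < cull < run < 100 for both sets of limits.
        for stop, cull, run in ((self.bstop, self.bcull, self.brun),
                                (self.fstop, self.fcull, self.frun)):
            if not 0 <= stop < cull < run < 100:
                raise ValueError("need 0 <= stop < cull < run < 100")

    def update(self, blocks_avail, files_avail):
        """Return (culling, may_allocate) for the given percentages."""
        if blocks_avail > self.brun and files_avail > self.frun:
            self.culling = False      # above both run limits: stop culling
        elif blocks_avail < self.bcull or files_avail < self.fcull:
            self.culling = True       # below either cull limit: start
        # Between the cull and run limits the previous state is kept.
        may_allocate = not (blocks_avail < self.bstop or
                            files_avail < self.fstop)
        return self.culling, may_allocate
```

Keeping the previous state between the cull and run limits gives hysteresis,
so culling does not flap on and off around a single threshold.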
220	Note that these are percentages of available space and available files, and do
221	_not_ appear as 100 minus the percentage displayed by the "df" program.
222	
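A rough way to see the kind of figures involved from userspace is shown
below.  It is only a sketch: the function name available_percentages is
hypothetical, and the choice of f_bavail/f_blocks and f_favail/f_files as the
ratios is an assumption, not a statement of what the kernel module computes.

```python
import os


def available_percentages(path):
    """Return (percent of blocks available, percent of files available)."""
    st = os.statvfs(path)
    blocks = 100.0 * st.f_bavail / st.f_blocks if st.f_blocks else 0.0
    # Assumption: a filesystem reporting no file-count limit (f_files == 0)
    # is treated as having all of its files available.
    files = 100.0 * st.f_favail / st.f_files if st.f_files else 100.0
    return blocks, files
```

Because the block figure here is taken relative to the total size of the
filesystem, it will generally differ from 100 minus the "Use%" column of df
on filesystems that reserve blocks.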
223	The userspace daemon scans the cache to build up a table of cullable objects.
224	These are then culled in least recently used order.  A new scan of the cache is
225	started as soon as space is made in the table.  Objects will be skipped if
226	their atimes have changed or if the kernel module says it is still using them.
227	
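The culling order described above can be sketched as follows.  This is an
illustrative model rather than the daemon's implementation; cull_order and its
callback parameters are hypothetical names.

```python
def cull_order(table, current_atime, in_use):
    """Yield paths to cull, least recently used first.

    table         -- list of (path, atime) pairs recorded during the scan
    current_atime -- callable returning the present atime of a path
    in_use        -- callable returning True if the kernel still uses a path
    """
    for path, scanned_atime in sorted(table, key=lambda entry: entry[1]):
        if current_atime(path) != scanned_atime:
            continue                  # accessed since the scan: skip
        if in_use(path):
            continue                  # still in use by the module: skip
        yield path
```

With a table of four objects where one has been touched since the scan and
one is in use, only the remaining two are yielded, oldest first.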
228	
229	===============
230	CACHE STRUCTURE
231	===============
232	
233	The CacheFiles module will create two directories in the directory it was
234	given:
235	
236	 (*) cache/
237	
238	 (*) graveyard/
239	
240	The active cache objects all reside in the first directory.  The CacheFiles
241	kernel module moves any retired or culled objects that it can't simply unlink
242	to the graveyard from which the daemon will actually delete them.
243	
244	The daemon uses dnotify to monitor the graveyard directory, and will delete
245	anything that appears therein.
246	
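The daemon's handling of the graveyard amounts to deleting whatever turns up
there.  The sketch below performs a single sweep of a directory; it does not
use dnotify, and the function name empty_graveyard is hypothetical.

```python
import os
import shutil


def empty_graveyard(graveyard):
    """Delete everything in the given directory; return the names removed."""
    with os.scandir(graveyard) as it:
        entries = list(it)            # snapshot before deleting anything
    removed = []
    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            shutil.rmtree(entry.path)  # retired object with children
        else:
            os.unlink(entry.path)
        removed.append(entry.name)
    return sorted(removed)
```

A real daemon would repeat this whenever it is notified that the directory
has changed rather than sweeping once.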
247	
248	The module represents index objects as directories with the filename "I..." or
249	"J...".  Note that the "cache/" directory is itself a special index.
250	
251	Data objects are represented as files if they have no children, or directories
252	if they do.  Their filenames all begin "D..." or "E...".  If represented as a
253	directory, data objects will have a file in the directory called "data" that
254	actually holds the data.
255	
256	Special objects are similar to data objects, except their filenames begin
257	"S..." or "T...".
258	
259	
260	If an object has children, then it will be represented as a directory.
261	Immediately in the representative directory is a collection of directories
262	named for hash values of the child object keys with an '@' prepended.  Into
263	these directories, if possible, will be placed the representations of the
264	child objects:
265	
266		INDEX     INDEX      INDEX                             DATA FILES
267		========= ========== ================================= ================
268		cache/@4a/I03nfs/@30/Ji000000000000000--fHg8hi8400
269		cache/@4a/I03nfs/@30/Ji000000000000000--fHg8hi8400/@75/Es0g000w...DB1ry
270		cache/@4a/I03nfs/@30/Ji000000000000000--fHg8hi8400/@75/Es0g000w...N22ry
271		cache/@4a/I03nfs/@30/Ji000000000000000--fHg8hi8400/@75/Es0g000w...FP1ry
272	
273	
274	If the key is so long that it exceeds NAME_MAX with the decorations added on to
275	it, then it will be cut into pieces, the first few of which will be used to
276	make a nest of directories, and the last one of which will be the object
277	inside the last directory.  The names of the intermediate directories will have
278	'+' prepended:
279	
280		J1223/@23/+xy...z/+kl...m/Epqr
281	
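The cutting of an over-long key into path components can be illustrated as
below.  This is a simplified sketch: the function name split_long_name is
hypothetical, the piece size is assumed to be NAME_MAX less one character for
the prefix, the '@' hash directory is left out, and the module's actual piece
sizes may differ.

```python
NAME_MAX = 255


def split_long_name(type_char, key_text, name_max=NAME_MAX):
    """Return path components: '+'-prefixed directories, then the object."""
    room = name_max - 1               # one character is taken by the prefix
    pieces = [key_text[i:i + room] for i in range(0, len(key_text), room)]
    if not pieces:
        pieces = [""]
    directories = ["+" + piece for piece in pieces[:-1]]
    return directories + [type_char + pieces[-1]]
```

With name_max set to 4 for demonstration, the key "abcdefgh" of an encoded
data object becomes the components "+abc", "+def" and "Egh".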
282	
283	Note that keys are raw data, and not only may they exceed NAME_MAX in size,
284	they may also contain things like '/' and NUL characters, and so they may not
285	be suitable for turning directly into a filename.
286	
287	To handle this, CacheFiles will use a suitably printable filename directly and
288	"base-64" encode ones that aren't directly suitable.  The two versions of
289	object filenames indicate the encoding:
290	
291		OBJECT TYPE	PRINTABLE	ENCODED
292		===============	===============	===============
293		Index		"I..."		"J..."
294		Data		"D..."		"E..."
295		Special		"S..."		"T..."
296	
297	Intermediate directories are always "@" or "+" as appropriate.
298	
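The naming rules in the table above lend themselves to a simple lookup on the
first character of a name.  The sketch below is illustrative only; classify
and the labels it returns are hypothetical.

```python
PREFIXES = {
    "I": ("index", "printable"),
    "J": ("index", "encoded"),
    "D": ("data", "printable"),
    "E": ("data", "encoded"),
    "S": ("special", "printable"),
    "T": ("special", "encoded"),
    "@": ("hash directory", None),
    "+": ("key continuation directory", None),
}


def classify(name):
    """Return (kind, encoding) for one path component of the cache."""
    if not name:
        raise ValueError("empty name")
    if name == "data":
        # The file holding the contents of a data object that has children.
        return ("data payload", None)
    return PREFIXES.get(name[0], ("unrecognised", None))
```

Applied to the example paths above, "I03nfs" is a printable index,
"Ji000000000000000--fHg8hi8400" is an encoded index and "@75" is a hash
directory.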
299	
300	Each object in the cache has an extended attribute label that holds the object
301	type ID (required to distinguish special objects) and the auxiliary data from
302	the netfs.  The latter is used to detect stale objects in the cache and update
303	or retire them.
304	
305	
306	Note that CacheFiles will erase from the cache any file it doesn't recognise or
307	any file of an incorrect type (such as a FIFO file or a device file).
308	
309	
310	==========================
311	SECURITY MODEL AND SELINUX
312	==========================
313	
314	CacheFiles is implemented to deal properly with the LSM security features of
315	the Linux kernel and the SELinux facility.
316	
317	One of the problems that CacheFiles faces is that it is generally acting on
318	behalf of a process, and running in that process's context, and that includes a
319	security context that is not appropriate for accessing the cache - either
320	because the files in the cache are inaccessible to that process, or because if
321	the process creates a file in the cache, that file may be inaccessible to other
322	processes.
323	
324	The way CacheFiles works is to temporarily change the security context (fsuid,
325	fsgid and actor security label) that the process acts as - without changing the
326	security context of the process when it is the target of an operation performed by
327	some other process (so signalling and suchlike still work correctly).
328	
329	
330	When the CacheFiles module is asked to bind to its cache, it:
331	
332	 (1) Finds the security label attached to the root cache directory and uses
333	     that as the security label with which it will create files.  By default,
334	     this is:
335	
336		cachefiles_var_t
337	
338	 (2) Finds the security label of the process which issued the bind request
339	     (presumed to be the cachefilesd daemon), which by default will be:
340	
341		cachefilesd_t
342	
343	     and asks LSM to supply a security ID as which it should act given the
344	     daemon's label.  By default, this will be:
345	
346		cachefiles_kernel_t
347	
348	     SELinux transitions the daemon's security ID to the module's security ID
349	     based on a rule of this form in the policy:
350	
351		type_transition <daemon's-ID> kernel_t : process <module's-ID>;
352	
353	     For instance:
354	
355		type_transition cachefilesd_t kernel_t : process cachefiles_kernel_t;
356	
357	
358	The module's security ID gives it permission to create, move and remove files
359	and directories in the cache, to find and access directories and files in the
360	cache, to set and access extended attributes on cache objects, and to read and
361	write files in the cache.
362	
363	The daemon's security ID gives it only a very restricted set of permissions: it
364	may scan directories, stat files and erase files and directories.  It may
365	not read or write files in the cache, and so it is precluded from accessing the
366	data cached therein; nor is it permitted to create new files in the cache.
367	
368	
369	There are policy source files available in:
370	
371		http://people.redhat.com/~dhowells/fscache/cachefilesd-0.8.tar.bz2
372	
373	and later versions.  In that tarball, see the files:
374	
375		cachefilesd.te
376		cachefilesd.fc
377		cachefilesd.if
378	
379	They are built and installed directly by the RPM.
380	
381	If a non-RPM based system is being used, then copy the above files to their own
382	directory and run:
383	
384		make -f /usr/share/selinux/devel/Makefile
385		semodule -i cachefilesd.pp
386	
387	You will need checkpolicy and selinux-policy-devel installed prior to the
388	build.
389	
390	
391	By default, the cache is located in /var/fscache, but if it is desirable that
392	it should be elsewhere, then either the above policy files must be altered, or
393	an auxiliary policy must be installed to label the alternate location of the
394	cache.
395	
396	For instructions on how to add an auxiliary policy to enable the cache to be
397	located elsewhere when SELinux is in enforcing mode, please see:
398	
399		/usr/share/doc/cachefilesd-*/move-cache.txt
400	
401	when the cachefilesd rpm is installed; alternatively, the document can be found
402	in the sources.
403	
404	
405	==================
406	A NOTE ON SECURITY
407	==================
408	
409	CacheFiles makes use of the split security in the task_struct.  It allocates
410	its own task_security structure, and redirects current->cred to point to it
411	when it acts on behalf of another process, in that process's context.
412	
413	The reason it does this is that it calls vfs_mkdir() and suchlike rather than
414	bypassing security and calling inode ops directly.  Therefore the VFS and LSM
415	may deny the CacheFiles access to the cache data because under some
416	circumstances the caching code is running in the security context of whatever
417	process issued the original syscall on the netfs.
418	
419	Furthermore, should CacheFiles create a file or directory, the security
420	parameters with which that object is created (UID, GID, security label) would
421	be derived from the process that issued the system call, thus potentially
422	preventing other processes from accessing the cache - including CacheFiles's
423	cache management daemon (cachefilesd).
424	
425	What is required is to temporarily override the security of the process that
426	issued the system call.  We can't, however, just do an in-place change of the
427	security data as that affects the process as an object, not just as a subject.
428	This means it may lose signals or ptrace events for example, and affects what
429	the process looks like in /proc.
430	
431	So CacheFiles makes use of a logical split in the security between the
432	objective security (task->real_cred) and the subjective security (task->cred).
433	The objective security holds the intrinsic security properties of a process and
434	is never overridden.  This is what appears in /proc, and is what is used when a
435	process is the target of an operation by some other process (SIGKILL for
436	example).
437	
438	The subjective security holds the active security properties of a process, and
439	may be overridden.  This is not seen externally, and is used when a process
440	acts upon another object, for example SIGKILLing another process or opening a
441	file.
442	
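The objective/subjective split can be pictured with a toy model.  The sketch
below is not kernel code: the Task class, the dictionary used as a credential
and the override_creds helper are all illustrative stand-ins.

```python
from contextlib import contextmanager


class Task:
    def __init__(self, cred):
        self.real_cred = cred         # objective: what others see and act on
        self.cred = cred              # subjective: what the task acts as


@contextmanager
def override_creds(task, new_cred):
    """Temporarily replace only the subjective credentials of a task."""
    old = task.cred
    task.cred = new_cred
    try:
        yield task
    finally:
        task.cred = old               # always revert, even on error
```

Inside the with block the task acts with the cache's credentials, while
anything targeting the task (a signal, say) still sees real_cred unchanged.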
443	LSM hooks exist that allow SELinux (or Smack or whatever) to reject a request
444	for CacheFiles to run in a context of a specific security label, or to create
445	files and directories with another security label.
446	
447	
448	=======================
449	STATISTICAL INFORMATION
450	=======================
451	
452	If CacheFiles is compiled with the following option enabled:
453	
454		CONFIG_CACHEFILES_HISTOGRAM=y
455	
456	then it will gather certain statistics and display them through a proc file.
457	
458	 (*) /proc/fs/cachefiles/histogram
459	
460		cat /proc/fs/cachefiles/histogram
461		JIFS  SECS  LOOKUPS   MKDIRS    CREATES
462		===== ===== ========= ========= =========
463	
464	     This shows, for each duration between 0 jiffies and HZ-1 jiffies, the
465	     number of times each of a variety of tasks took that long to run.  The
466	     columns are as follows:
467	
468		COLUMN		TIME MEASUREMENT
469		=======		=======================================================
470		LOOKUPS		Length of time to perform a lookup on the backing fs
471		MKDIRS		Length of time to perform a mkdir on the backing fs
472		CREATES		Length of time to perform a create on the backing fs
473	
474	     Each row shows the number of events that took a particular range of times.
475	     Each step is 1 jiffy in size.  The JIFS column indicates the particular
476	     jiffy range covered, and the SECS field the equivalent number of seconds.
477	
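How such a histogram is built up can be sketched in a few lines.  This is an
illustration only: the function names are hypothetical, HZ is passed in as a
parameter, and counting anything of HZ jiffies or longer in the last bucket
and omitting empty rows are both assumptions.

```python
def build_histogram(durations_jiffies, hz):
    """Count events per 1-jiffy bucket over the range 0 .. hz-1."""
    hist = [0] * hz
    for jiffies in durations_jiffies:
        hist[min(jiffies, hz - 1)] += 1   # assumption: clamp long events
    return hist


def histogram_rows(hist, hz):
    """Return (jiffies, seconds, count) for each non-empty bucket."""
    return [(j, j / hz, n) for j, n in enumerate(hist) if n]
```

One such histogram per measured operation (lookup, mkdir, create) would give
the columns described above.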
478	
479	=========
480	DEBUGGING
481	=========
482	
483	If CONFIG_CACHEFILES_DEBUG is enabled, the CacheFiles facility can have runtime
484	debugging enabled by adjusting the value in:
485	
486		/sys/module/cachefiles/parameters/debug
487	
488	This is a bitmask of debugging streams to enable:
489	
490		BIT	VALUE	STREAM				POINT
491		=======	=======	===============================	=======================
492		0	1	General				Function entry trace
493		1	2					Function exit trace
494		2	4					General
495	
496	The appropriate set of values should be OR'd together and the result written to
497	the control file.  For example:
498	
499		echo $((1|4)) >/sys/module/cachefiles/parameters/debug
500	
501	will turn on function entry tracing and the general debug points.
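
The values in the table above can also be combined by name.  The sketch below
only computes the number to write; the stream names "enter", "leave" and
"debug" and the function debug_mask are hypothetical labels for the three
documented bits.

```python
CACHEFILES_DEBUG_BITS = {
    "enter": 1,   # function entry trace
    "leave": 2,   # function exit trace
    "debug": 4,   # general debug points
}


def debug_mask(*streams):
    """OR together the bit values for the named debugging streams."""
    mask = 0
    for stream in streams:
        mask |= CACHEFILES_DEBUG_BITS[stream]
    return mask
```

The result would then be written, as root, to the control file named above.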

Information is copyright its respective author. All material is available from the Linux Kernel Source distributed under a GPL License. This page is provided as a free service by mjmwired.net.