
Documentation / cgroups / freezer-subsystem.txt





Based on kernel version 3.3. Page generated on 2012-03-23 21:25 EST.

The cgroup freezer is useful to batch job management systems, which start
and stop sets of tasks in order to schedule the resources of a machine
according to the desires of a system administrator. This sort of program
is often used on HPC clusters to schedule access to the cluster as a
whole. The cgroup freezer uses cgroups to describe the set of tasks to
be started/stopped by the batch job management system. It also provides
a means to start and stop the tasks composing the job.

The cgroup freezer will also be useful for checkpointing running groups
of tasks. The freezer allows the checkpoint code to obtain a consistent
image of the tasks by attempting to force the tasks in a cgroup into a
quiescent state. Once the tasks are quiescent, another task can
walk /proc or invoke a kernel interface to gather information about the
quiesced tasks. Checkpointed tasks can be restarted later should a
recoverable error occur. This also allows the checkpointed tasks to be
migrated between nodes in a cluster by copying the gathered information
to another node and restarting the tasks there.
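
As a rough illustration of the "walk /proc" step, the sketch below prints
each PID listed in a cgroup's tasks file together with the state letter
from the State: line of its /proc/<pid>/status. The function name
list_task_states and its optional second argument (an alternate proc root,
useful only for testing) are assumptions made for this example. Real
checkpoint code would gather far more than this.

```shell
#!/bin/bash
# Hypothetical helper: print "<pid> <state>" for every task in a cgroup.
#   $1  cgroup directory containing a "tasks" file
#   $2  proc root (defaults to /proc; overridable for testing)
list_task_states() {
    local cgroup_dir="$1" proc_root="${2:-/proc}" pid state
    while read -r pid; do
        [ -n "$pid" ] || continue
        # A task may exit between reading "tasks" and reading /proc.
        [ -r "$proc_root/$pid/status" ] || continue
        # The status file contains a line such as "State:  S (sleeping)".
        state=$(awk '$1 == "State:" { print $2; exit }' \
                    "$proc_root/$pid/status")
        echo "$pid $state"
    done < "$cgroup_dir/tasks"
}
```

With the example hierarchy used later in this document, it could be called
as: list_task_states /sys/fs/cgroup/freezer/0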

Sequences of SIGSTOP and SIGCONT are not always sufficient for stopping
and resuming tasks in userspace. Both of these signals are observable
from within the tasks we wish to freeze. While SIGSTOP cannot be caught,
blocked, or ignored, it can be seen by waiting or ptracing parent tasks.
SIGCONT is especially unsuitable since it can be caught by the task. Any
programs designed to watch for SIGSTOP and SIGCONT could be broken by
attempting to use SIGSTOP and SIGCONT to stop and resume tasks. We can
demonstrate this problem using nested bash shells:

	$ echo $$
	16644
	$ bash
	$ echo $$
	16690

	From a second, unrelated bash shell:
	$ kill -SIGSTOP 16690
	$ kill -SIGCONT 16690

	<at this point 16690 exits and causes 16644 to exit too>

This happens because bash can observe both signals and choose how it
responds to them.
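
The visibility of SIGCONT can also be shown without an interactive shell.
The following is a minimal sketch, not part of the freezer interface: a
child bash process installs a SIGCONT handler, the parent sends it SIGSTOP
followed by SIGCONT, and the function reports whether the handler ran. The
function name sigcont_is_visible, the log file, and the polling intervals
are all assumptions chosen for this example.

```shell
#!/bin/bash
# Hypothetical demo: returns 0 if a child observed SIGCONT after a
# SIGSTOP/SIGCONT cycle, 1 if it did not.
sigcont_is_visible() {
    local log child i seen=1
    log=$(mktemp) || return 2
    # The child installs a SIGCONT handler, reports readiness, then idles.
    bash -c 'trap "echo saw-SIGCONT" CONT; echo ready
             while :; do sleep 0.1; done' > "$log" &
    child=$!
    # Wait (up to about 5 seconds) for the handler to be installed.
    i=0
    while [ "$i" -lt 50 ]; do
        grep -q ready "$log" && break
        sleep 0.1
        i=$((i + 1))
    done
    kill -STOP "$child"
    kill -CONT "$child"
    # Wait (up to about 5 seconds) for the handler to run.
    i=0
    while [ "$i" -lt 50 ]; do
        if grep -q saw-SIGCONT "$log"; then seen=0; break; fi
        sleep 0.1
        i=$((i + 1))
    done
    kill -KILL "$child" 2>/dev/null || true
    wait "$child" 2>/dev/null || true
    rm -f "$log"
    return "$seen"
}
```

A program that reacts to SIGCONT in a less benign way than printing a
message is affected by the stop/resume cycle in the same manner.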

Another example of a program which catches and responds to these
signals is gdb. In fact, any program designed to use ptrace is likely to
have a problem with this method of stopping and resuming tasks.

In contrast, the cgroup freezer uses the kernel freezer code to
prevent the freeze/unfreeze cycle from becoming visible to the tasks
being frozen. This allows the bash example above and gdb to run as
expected.

The freezer subsystem in the container filesystem defines a file named
freezer.state. Writing "FROZEN" to the state file will freeze all tasks in the
cgroup. Subsequently writing "THAWED" will unfreeze the tasks in the cgroup.
Reading the file returns the current state.
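
The three operations on freezer.state can be wrapped in small shell
functions. This is only a sketch: the names freezer_state, freeze_cgroup,
thaw_cgroup and with_frozen_cgroup are hypothetical, and the wrapper does
not wait for the state to actually reach "FROZEN" before running the
command.

```shell
#!/bin/bash
# Hypothetical helpers; $1 is the cgroup directory in each case.
freezer_state() { cat "$1/freezer.state"; }
freeze_cgroup() { echo FROZEN > "$1/freezer.state"; }
thaw_cgroup()   { echo THAWED > "$1/freezer.state"; }

# Run a command after requesting a freeze, then always request a thaw.
#   $1      cgroup directory
#   $2...   command to run between the two writes
with_frozen_cgroup() {
    local cgroup_dir="$1" status=0
    shift
    freeze_cgroup "$cgroup_dir" || return 1
    "$@" || status=$?
    thaw_cgroup "$cgroup_dir"
    return "$status"
}
```

For example, with_frozen_cgroup /sys/fs/cgroup/freezer/0 some_command
would request a freeze, run some_command (a placeholder), and thaw the
cgroup again whether or not the command succeeded.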

Note that freezer.state does not exist in the root cgroup, which means the
root cgroup is non-freezable.
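
A script can use this property to check whether a given cgroup directory
can be frozen at all. The helper below is a sketch under that assumption.
The name is_freezable is hypothetical, and the check only tests for the
presence of the file.

```shell
#!/bin/bash
# Hypothetical check: succeeds only if the directory exposes freezer.state.
#   $1  cgroup directory to test
is_freezable() {
    [ -e "$1/freezer.state" ]
}
```

Called on the root of a freezer hierarchy, this is expected to fail. Called
on a child cgroup, it is expected to succeed.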

* Examples of usage:

   # mkdir /sys/fs/cgroup/freezer
   # mount -t cgroup -ofreezer freezer /sys/fs/cgroup/freezer
   # mkdir /sys/fs/cgroup/freezer/0
   # echo $some_pid > /sys/fs/cgroup/freezer/0/tasks
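
Before running the mkdir and mount steps, a script may want to know whether
a freezer hierarchy is already mounted. The sketch below scans a mounts
table in the /proc/mounts format (device, mount point, filesystem type,
options) for a cgroup mount whose options include "freezer". The function
name freezer_mount_point and its optional argument are assumptions made for
this example.

```shell
#!/bin/bash
# Hypothetical helper: print the mount point of the first cgroup hierarchy
# that has the freezer subsystem attached; fail if there is none.
#   $1  mounts table to scan (defaults to /proc/mounts)
freezer_mount_point() {
    awk '$3 == "cgroup" {
             n = split($4, opts, ",")
             for (i = 1; i <= n; i++)
                 if (opts[i] == "freezer") { print $2; found = 1; exit }
         }
         END { exit !found }' "${1:-/proc/mounts}"
}
```

If the function fails, no freezer hierarchy was found and the mount command
shown above is still needed.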

To get the status of the freezer subsystem:

   # cat /sys/fs/cgroup/freezer/0/freezer.state
   THAWED

To freeze all tasks in the container:

   # echo FROZEN > /sys/fs/cgroup/freezer/0/freezer.state
   # cat /sys/fs/cgroup/freezer/0/freezer.state
   FREEZING
   # cat /sys/fs/cgroup/freezer/0/freezer.state
   FROZEN
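
Because the state can read "FREEZING" for a while before it reads "FROZEN",
a script usually has to poll. The following sketch does so with a bounded
number of reads. The name wait_until_frozen and the default poll count and
delay are assumptions, not values taken from the kernel.

```shell
#!/bin/bash
# Hypothetical helper: poll freezer.state until it reads FROZEN.
#   $1  cgroup directory
#   $2  maximum number of polls (default 50)
#   $3  delay between polls in seconds (default 0.1)
# Returns 0 once FROZEN is seen, 1 if the polls run out, 2 on a read error.
wait_until_frozen() {
    local cgroup_dir="$1" tries="${2:-50}" delay="${3:-0.1}" state
    while [ "$tries" -gt 0 ]; do
        state=$(cat "$cgroup_dir/freezer.state") || return 2
        [ "$state" = "FROZEN" ] && return 0
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}
```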

To unfreeze all tasks in the container:

   # echo THAWED > /sys/fs/cgroup/freezer/0/freezer.state
   # cat /sys/fs/cgroup/freezer/0/freezer.state
   THAWED

This is the basic mechanism, which should do the right thing for user space
tasks in a simple scenario.

It's important to note that freezing can be incomplete. In that case the
write to freezer.state fails with EBUSY. This means that some tasks in the
cgroup are busy doing something that prevents us from completely freezing
the cgroup at this time. After EBUSY, the cgroup will remain partially
frozen, which is reflected by freezer.state reporting "FREEZING" when read.
The state will remain "FREEZING" until one of these things happens:

	1) Userspace cancels the freezing operation by writing "THAWED" to
		the freezer.state file
	2) Userspace retries the freezing operation by writing "FROZEN" to
		the freezer.state file (writing "FREEZING" is not legal
		and returns EINVAL)
	3) The tasks that blocked the cgroup from entering the "FROZEN"
		state disappear from the cgroup's set of tasks.
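
Options 1 and 2 above can be combined into a simple retry policy: write
"FROZEN" a limited number of times and, if the state never reads "FROZEN",
write "THAWED" so that the cgroup is not left partially frozen. The sketch
below is one possible policy rather than a recommended one. The name
freeze_or_cancel and the default attempt count and delay are assumptions.

```shell
#!/bin/bash
# Hypothetical helper: try to freeze a cgroup, retrying a few times, and
# cancel the operation if it never completes.
#   $1  cgroup directory
#   $2  number of attempts (default 5)
#   $3  delay between attempts in seconds (default 1)
# Returns 0 if the cgroup reached FROZEN, 1 if the freeze was cancelled.
freeze_or_cancel() {
    local state_file="$1/freezer.state" attempts="${2:-5}" delay="${3:-1}"
    while [ "$attempts" -gt 0 ]; do
        # The write itself may fail (for example with EBUSY), so its exit
        # status is ignored and the state read below decides the outcome.
        echo FROZEN 2>/dev/null > "$state_file" || true
        [ "$(cat "$state_file")" = "FROZEN" ] && return 0
        attempts=$((attempts - 1))
        sleep "$delay"
    done
    # Not frozen after all attempts: cancel by thawing (option 1).
    echo THAWED > "$state_file"
    return 1
}
```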

Information is copyright its respective author. All material is available from the Linux Kernel Source distributed under a GPL License. This page is provided as a free service by mjmwired.net.