About Kernel Documentation Linux Kernel Contact Linux Resources Linux Blog

Documentation / mutex-design.txt




Custom Search

Based on kernel version 3.15.4. Page generated on 2014-07-07 09:03 EST.

1	Generic Mutex Subsystem
2	
3	started by Ingo Molnar <mingo@redhat.com>
4	
5	  "Why on earth do we need a new mutex subsystem, and what's wrong
6	   with semaphores?"
7	
8	firstly, there's nothing wrong with semaphores. But if the simpler
9	mutex semantics are sufficient for your code, then there are a couple
10	of advantages of mutexes:
11	
12	 - 'struct mutex' is smaller on most architectures: E.g. on x86,
13	   'struct semaphore' is 20 bytes, 'struct mutex' is 16 bytes.
14	   A smaller structure size means less RAM footprint, and better
15	   CPU-cache utilization.
16	
17	 - tighter code. On x86 i get the following .text sizes when
18	   switching all mutex-alike semaphores in the kernel to the mutex
19	   subsystem:
20	
21	        text    data     bss     dec     hex filename
22	     3280380  868188  396860 4545428  455b94 vmlinux-semaphore
23	     3255329  865296  396732 4517357  44eded vmlinux-mutex
24	
25	   that's 25051 bytes of code saved, or a 0.76% win - off the hottest
26	   codepaths of the kernel. (The .data savings are 2892 bytes, or 0.33%)
27	   Smaller code means better icache footprint, which is one of the
28	   major optimization goals in the Linux kernel currently.
29	
30	 - the mutex subsystem is slightly faster and has better scalability for
31	   contended workloads. On an 8-way x86 system, running a mutex-based
32	   kernel and testing creat+unlink+close (of separate, per-task files)
33	   in /tmp with 16 parallel tasks, the average number of ops/sec is:
34	
35	    Semaphores:                        Mutexes:
36	
37	    $ ./test-mutex V 16 10             $ ./test-mutex V 16 10
38	    8 CPUs, running 16 tasks.          8 CPUs, running 16 tasks.
39	    checking VFS performance.          checking VFS performance.
40	    avg loops/sec:      34713          avg loops/sec:      84153
41	    CPU utilization:    63%            CPU utilization:    22%
42	
43	   i.e. in this workload, the mutex based kernel was 2.4 times faster
44	   than the semaphore based kernel, _and_ it also had 2.8 times less CPU
45	   utilization. (In terms of 'ops per CPU cycle', the semaphore kernel
46	   performed 551 ops/sec per 1% of CPU time used, while the mutex kernel
47	   performed 3825 ops/sec per 1% of CPU time used - it was 6.9 times
48	   more efficient.)
49	
50	   the scalability difference is visible even on a 2-way P4 HT box:
51	
52	    Semaphores:                        Mutexes:
53	
54	    $ ./test-mutex V 16 10             $ ./test-mutex V 16 10
55	    4 CPUs, running 16 tasks.          8 CPUs, running 16 tasks.
56	    checking VFS performance.          checking VFS performance.
57	    avg loops/sec:      127659         avg loops/sec:      181082
58	    CPU utilization:    100%           CPU utilization:    34%
59	
60	   (the straight performance advantage of mutexes is 41%, the per-cycle
61	    efficiency of mutexes is 4.1 times better.)
62	
63	 - there are no fastpath tradeoffs, the mutex fastpath is just as tight
64	   as the semaphore fastpath. On x86, the locking fastpath is 2
65	   instructions:
66	
67	    c0377ccb <mutex_lock>:
68	    c0377ccb:       f0 ff 08                lock decl (%eax)
69	    c0377cce:       78 0e                   js     c0377cde <.text..lock.mutex>
70	    c0377cd0:       c3                      ret
71	
72	   the unlocking fastpath is equally tight:
73	
74	    c0377cd1 <mutex_unlock>:
75	    c0377cd1:       f0 ff 00                lock incl (%eax)
76	    c0377cd4:       7e 0f                   jle    c0377ce5 <.text..lock.mutex+0x7>
77	    c0377cd6:       c3                      ret
78	
79	 - 'struct mutex' semantics are well-defined and are enforced if
80	   CONFIG_DEBUG_MUTEXES is turned on. Semaphores on the other hand have
81	   virtually no debugging code or instrumentation. The mutex subsystem
82	   checks and enforces the following rules:
83	
84	   * - only one task can hold the mutex at a time
85	   * - only the owner can unlock the mutex
86	   * - multiple unlocks are not permitted
87	   * - recursive locking is not permitted
88	   * - a mutex object must be initialized via the API
89	   * - a mutex object must not be initialized via memset or copying
90	   * - task may not exit with mutex held
91	   * - memory areas where held locks reside must not be freed
92	   * - held mutexes must not be reinitialized
93	   * - mutexes may not be used in hardware or software interrupt
94	   *   contexts such as tasklets and timers
95	
96	   furthermore, there are also convenience features in the debugging
97	   code:
98	
99	   * - uses symbolic names of mutexes, whenever they are printed in debug output
100	   * - point-of-acquire tracking, symbolic lookup of function names
101	   * - list of all locks held in the system, printout of them
102	   * - owner tracking
103	   * - detects self-recursing locks and prints out all relevant info
104	   * - detects multi-task circular deadlocks and prints out all affected
105	   *   locks and tasks (and only those tasks)
106	
107	Disadvantages
108	-------------
109	
110	The stricter mutex API means you cannot use mutexes the same way you
111	can use semaphores: e.g. they cannot be used from an interrupt context,
112	nor can they be unlocked from a different context that which acquired
113	it. [ I'm not aware of any other (e.g. performance) disadvantages from
114	using mutexes at the moment, please let me know if you find any. ]
115	
116	Implementation of mutexes
117	-------------------------
118	
119	'struct mutex' is the new mutex type, defined in include/linux/mutex.h and
120	implemented in kernel/locking/mutex.c. It is a counter-based mutex with a
121	spinlock and a wait-list. The counter has 3 states: 1 for "unlocked", 0 for
122	"locked" and negative numbers (usually -1) for "locked, potential waiters
123	queued".
124	
125	the APIs of 'struct mutex' have been streamlined:
126	
127	 DEFINE_MUTEX(name);
128	
129	 mutex_init(mutex);
130	
131	 void mutex_lock(struct mutex *lock);
132	 int  mutex_lock_interruptible(struct mutex *lock);
133	 int  mutex_trylock(struct mutex *lock);
134	 void mutex_unlock(struct mutex *lock);
135	 int  mutex_is_locked(struct mutex *lock);
136	 void mutex_lock_nested(struct mutex *lock, unsigned int subclass);
137	 int  mutex_lock_interruptible_nested(struct mutex *lock,
138	                                      unsigned int subclass);
139	 int atomic_dec_and_mutex_lock(atomic_t *cnt, struct mutex *lock);
Hide Line Numbers
About Kernel Documentation Linux Kernel Contact Linux Resources Linux Blog

Information is copyright its respective author. All material is available from the Linux Kernel Source distributed under a GPL License. This page is provided as a free service by mjmwired.net.