Based on kernel version 3.9. Page generated on 2013-05-02 23:13 EST.
1 Lockdep-RCU was added to the Linux kernel in early 2010 2 (http://lwn.net/Articles/371986/). This facility checks for some common 3 misuses of the RCU API, most notably using one of the rcu_dereference() 4 family to access an RCU-protected pointer without the proper protection. 5 When such misuse is detected, an lockdep-RCU splat is emitted. 6 7 The usual cause of a lockdep-RCU slat is someone accessing an 8 RCU-protected data structure without either (1) being in the right kind of 9 RCU read-side critical section or (2) holding the right update-side lock. 10 This problem can therefore be serious: it might result in random memory 11 overwriting or worse. There can of course be false positives, this 12 being the real world and all that. 13 14 So let's look at an example RCU lockdep splat from 3.0-rc5, one that 15 has long since been fixed: 16 17 =============================== 18 [ INFO: suspicious RCU usage. ] 19 ------------------------------- 20 block/cfq-iosched.c:2776 suspicious rcu_dereference_protected() usage! 21 22 other info that might help us debug this: 23 24 25 rcu_scheduler_active = 1, debug_locks = 0 26 3 locks held by scsi_scan_6/1552: 27 #0: (&shost->scan_mutex){+.+.+.}, at: [<ffffffff8145efca>] 28 scsi_scan_host_selected+0x5a/0x150 29 #1: (&eq->sysfs_lock){+.+...}, at: [<ffffffff812a5032>] 30 elevator_exit+0x22/0x60 31 #2: (&(&q->__queue_lock)->rlock){-.-...}, at: [<ffffffff812b6233>] 32 cfq_exit_queue+0x43/0x190 33 34 stack backtrace: 35 Pid: 1552, comm: scsi_scan_6 Not tainted 3.0.0-rc5 #17 36 Call Trace: 37 [<ffffffff810abb9b>] lockdep_rcu_dereference+0xbb/0xc0 38 [<ffffffff812b6139>] __cfq_exit_single_io_context+0xe9/0x120 39 [<ffffffff812b626c>] cfq_exit_queue+0x7c/0x190 40 [<ffffffff812a5046>] elevator_exit+0x36/0x60 41 [<ffffffff812a802a>] blk_cleanup_queue+0x4a/0x60 42 [<ffffffff8145cc09>] scsi_free_queue+0x9/0x10 43 [<ffffffff81460944>] __scsi_remove_device+0x84/0xd0 44 [<ffffffff8145dca3>] scsi_probe_and_add_lun+0x353/0xb10 45 [<ffffffff817da069>] ? error_exit+0x29/0xb0 46 [<ffffffff817d98ed>] ? _raw_spin_unlock_irqrestore+0x3d/0x80 47 [<ffffffff8145e722>] __scsi_scan_target+0x112/0x680 48 [<ffffffff812c690d>] ? trace_hardirqs_off_thunk+0x3a/0x3c 49 [<ffffffff817da069>] ? error_exit+0x29/0xb0 50 [<ffffffff812bcc60>] ? kobject_del+0x40/0x40 51 [<ffffffff8145ed16>] scsi_scan_channel+0x86/0xb0 52 [<ffffffff8145f0b0>] scsi_scan_host_selected+0x140/0x150 53 [<ffffffff8145f149>] do_scsi_scan_host+0x89/0x90 54 [<ffffffff8145f170>] do_scan_async+0x20/0x160 55 [<ffffffff8145f150>] ? do_scsi_scan_host+0x90/0x90 56 [<ffffffff810975b6>] kthread+0xa6/0xb0 57 [<ffffffff817db154>] kernel_thread_helper+0x4/0x10 58 [<ffffffff81066430>] ? finish_task_switch+0x80/0x110 59 [<ffffffff817d9c04>] ? retint_restore_args+0xe/0xe 60 [<ffffffff81097510>] ? __init_kthread_worker+0x70/0x70 61 [<ffffffff817db150>] ? gs_change+0xb/0xb 62 63 Line 2776 of block/cfq-iosched.c in v3.0-rc5 is as follows: 64 65 if (rcu_dereference(ioc->ioc_data) == cic) { 66 67 This form says that it must be in a plain vanilla RCU read-side critical 68 section, but the "other info" list above shows that this is not the 69 case. Instead, we hold three locks, one of which might be RCU related. 70 And maybe that lock really does protect this reference. If so, the fix 71 is to inform RCU, perhaps by changing __cfq_exit_single_io_context() to 72 take the struct request_queue "q" from cfq_exit_queue() as an argument, 73 which would permit us to invoke rcu_dereference_protected as follows: 74 75 if (rcu_dereference_protected(ioc->ioc_data, 76 lockdep_is_held(&q->queue_lock)) == cic) { 77 78 With this change, there would be no lockdep-RCU splat emitted if this 79 code was invoked either from within an RCU read-side critical section 80 or with the ->queue_lock held. In particular, this would have suppressed 81 the above lockdep-RCU splat because ->queue_lock is held (see #2 in the 82 list above). 83 84 On the other hand, perhaps we really do need an RCU read-side critical 85 section. In this case, the critical section must span the use of the 86 return value from rcu_dereference(), or at least until there is some 87 reference count incremented or some such. One way to handle this is to 88 add rcu_read_lock() and rcu_read_unlock() as follows: 89 90 rcu_read_lock(); 91 if (rcu_dereference(ioc->ioc_data) == cic) { 92 spin_lock(&ioc->lock); 93 rcu_assign_pointer(ioc->ioc_data, NULL); 94 spin_unlock(&ioc->lock); 95 } 96 rcu_read_unlock(); 97 98 With this change, the rcu_dereference() is always within an RCU 99 read-side critical section, which again would have suppressed the 100 above lockdep-RCU splat. 101 102 But in this particular case, we don't actually deference the pointer 103 returned from rcu_dereference(). Instead, that pointer is just compared 104 to the cic pointer, which means that the rcu_dereference() can be replaced 105 by rcu_access_pointer() as follows: 106 107 if (rcu_access_pointer(ioc->ioc_data) == cic) { 108 109 Because it is legal to invoke rcu_access_pointer() without protection, 110 this change would also suppress the above lockdep-RCU splat.