Based on kernel version 2.6.33. Page generated on 2010-02-24 15:36 EST.

PCI Power Management
~~~~~~~~~~~~~~~~~~~~

An overview of the concepts and the related functions in the Linux kernel

Patrick Mochel <mochel[AT]transmeta[DOT]com>
(and others)

---------------------------------------------------------------------------

1. Overview
2. How The PCI Subsystem Handles Power Management
3. PCI Utility Functions
4. PCI Device Drivers
5. Resources

1. Overview
~~~~~~~~~~~

The PCI Power Management Specification was introduced between the PCI 2.1 and
PCI 2.2 Specifications. It is a standard interface for controlling various
power management operations.

Implementation of the PCI PM Spec is optional, as are several sub-components of
it. If a device supports the PCI PM Spec, the device will have an 8 byte
capability field in its PCI configuration space. This field is used to describe
and control the standard PCI power management features.

The PCI PM spec defines 4 operating states for devices (D0 - D3) and for buses
(B0 - B3). The higher the number, the less power the device consumes. However,
the higher the number, the longer the latency is for the device to return to
an operational state (D0).

There are actually two D3 states. When someone talks about D3, they usually
mean D3hot, which corresponds to an ACPI D2 state (power is reduced, the
device may lose some context). But they may also mean D3cold, which is an
ACPI D3 state (power is fully off, all state is discarded); or both.

Bus power management is not covered in this version of this document.

Note that all PCI devices support D0 and D3cold by default, regardless of
whether or not they implement any of the PCI PM spec.
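
As an illustration of the 8 byte capability field mentioned above, the
following is a minimal user-space sketch that walks the capability list of a
simulated 256 byte configuration space, locates the PM capability and reads
the current D-state from its control/status register. The register offsets
and bit values follow the PCI specifications (they match the definitions in
the kernel's pci_regs.h) and are not taken from this document. The function
names find_pm_cap(), pm_current_state() and cfg_read16() are hypothetical.
Real drivers would normally call pci_find_capability(dev, PCI_CAP_ID_PM)
rather than walking the list by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard PCI configuration space offsets and values (from the PCI specs). */
#define CFG_STATUS          0x06  /* 16-bit Status register */
#define CFG_STATUS_CAP_LIST 0x10  /* Status bit 4: capability list present */
#define CFG_CAP_PTR         0x34  /* offset of the first capability */
#define CAP_ID_PM           0x01  /* Power Management capability ID */
#define PM_PMCSR            4     /* control/status register, within the capability */
#define PM_STATE_MASK       0x3   /* PMCSR bits 1:0 hold the current D-state */

/* Hypothetical helper: little-endian 16-bit read from the simulated space. */
static uint16_t cfg_read16(const uint8_t *cfg, unsigned int off)
{
    return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

/*
 * Hypothetical helper: return the offset of the PM capability, or 0 if the
 * device does not implement one. The hop limit guards against a looping list.
 */
static unsigned int find_pm_cap(const uint8_t *cfg)
{
    unsigned int pos, hops;

    assert(cfg != NULL);
    if (!(cfg_read16(cfg, CFG_STATUS) & CFG_STATUS_CAP_LIST))
        return 0;

    pos = cfg[CFG_CAP_PTR] & 0xFC;          /* low two bits are reserved */
    for (hops = 0; pos >= 0x40 && pos <= 0xF8 && hops < 48; hops++) {
        if (cfg[pos] == CAP_ID_PM)          /* byte 0: capability ID */
            return pos;
        pos = cfg[pos + 1] & 0xFC;          /* byte 1: next capability */
    }
    return 0;
}

/* Hypothetical helper: current D-state (0 to 3) read from PMCSR. */
static unsigned int pm_current_state(const uint8_t *cfg, unsigned int pm)
{
    return cfg_read16(cfg, pm + PM_PMCSR) & PM_STATE_MASK;
}
```

A return value of 0 from find_pm_cap() corresponds to a device that does not
implement the PCI PM Spec; such a device still supports D0 and D3cold, as
noted above.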

The possible state transitions that a device can undergo are:

+---------------------------+
| Current State | New State |
+---------------------------+
| D0            | D1, D2, D3|
+---------------------------+
| D1            | D2, D3    |
+---------------------------+
| D2            | D3        |
+---------------------------+
| D1, D2, D3    | D0        |
+---------------------------+

Note that when the system is entering a global suspend state, all devices will
be placed into D3 and when resuming, all devices will be placed into D0.
However, when the system is running, other state transitions are possible.

2. How The PCI Subsystem Handles Power Management
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The PCI suspend/resume functionality is accessed indirectly via the Power
Management subsystem. At boot, the PCI driver registers a power management
callback with that layer. Upon entering a suspend state, the PM layer iterates
through all of its registered callbacks. This currently takes place only during
APM state transitions.

Upon going to sleep, the PCI subsystem walks its device tree twice. Both times,
it does a depth first walk of the device tree. The first walk saves each
device's state and checks for devices that will prevent the system from entering
a global power state. The next walk then places the devices in a low power
state.

The first walk allows a graceful recovery in the event of a failure, since none
of the devices have actually been powered down.

In both walks, in particular the second, all children of a bridge are touched
before the actual bridge itself. This allows the bridge to retain power while
its children are being accessed.

Upon resuming from sleep, just the opposite must be true: all bridges must be
powered on and restored before their children are powered on. This is easily
accomplished with a breadth-first walk of the PCI device tree.


3. PCI Utility Functions
~~~~~~~~~~~~~~~~~~~~~~~~

These are helper functions designed to be called by individual device drivers.
Assuming that a device behaves as advertised, these should be applicable in most
cases. However, results may vary.

Note that these functions are never implicitly called for the driver. The driver
is always responsible for deciding when and if to call these.


pci_save_state
--------------

Usage:
    pci_save_state(struct pci_dev *dev);

Description:
    Save first 64 bytes of PCI config space, along with any additional
    PCI-Express or PCI-X information.


pci_restore_state
-----------------

Usage:
    pci_restore_state(struct pci_dev *dev);

Description:
    Restore previously saved config space.


pci_set_power_state
-------------------

Usage:
    pci_set_power_state(struct pci_dev *dev, pci_power_t state);

Description:
    Transition device to low power state using PCI PM Capabilities
    registers.

    Will fail under one of the following conditions:
    - If state is less than current state, but not D0 (illegal transition)
    - Device doesn't support PM Capabilities
    - Device does not support requested state


pci_enable_wake
---------------

Usage:
    pci_enable_wake(struct pci_dev *dev, pci_power_t state, int enable);

Description:
    Enable device to generate PME# during low power state using PCI PM
    Capabilities.

    Checks whether the device supports generating PME# from the requested
    state and fails if it does not, unless enable == 0 (the request is to
    disable wake events, which is implicit if the device doesn't support
    them in the first place).

    Note that the PMC Register in the device's PM Capabilities has a bitmask
    of the states it supports generating PME# from. D3hot is bit 3 and
    D3cold is bit 4. So, while a value of 4 as the state may not seem
    semantically correct, it is.


4. PCI Device Drivers
~~~~~~~~~~~~~~~~~~~~~

These functions are intended for use by individual drivers, and are defined in
struct pci_driver:

    int (*suspend) (struct pci_dev *dev, pm_message_t state);
    int (*resume) (struct pci_dev *dev);


suspend
-------

Usage:

    if (dev->driver && dev->driver->suspend)
        dev->driver->suspend(dev, state);

A driver uses this function to actually transition the device into a low power
state. This should include disabling I/O, IRQs, and bus-mastering, as well as
physically transitioning the device to a lower power state; it may also include
calls to pci_enable_wake().

Bus mastering may be disabled by doing:

    pci_disable_device(dev);

For devices that support the PCI PM Spec, this may be used to set the device's
power state to match the suspend() parameter:

    pci_set_power_state(dev, pci_choose_state(dev, state));

The driver is also responsible for disabling any other device-specific features
(e.g. blanking the screen, turning off on-card memory, etc).

The driver should be sure to track the current state of the device, as it may
obviate the need for some operations.

The driver should update the current_state field in its pci_dev structure in
this function, except for PM-capable devices when pci_set_power_state is used.

resume
------

Usage:

    if (dev->driver && dev->driver->resume)
        dev->driver->resume(dev);

The resume callback may be called from any power state, and is always meant to
transition the device to the D0 state.

The driver is responsible for reenabling any features of the device that had
been disabled during previous suspend calls, such as IRQs and bus mastering,
as well as calling pci_restore_state().

If the device is currently in D3, it may need to be reinitialized in resume().

  * Some types of devices, like bus controllers, will preserve context in D3hot
    (using Vcc power). Their drivers will often want to avoid re-initializing
    them after re-entering D0 (perhaps to avoid resetting downstream devices).

  * Other kinds of devices in D3hot will discard device context as part of a
    soft reset when re-entering the D0 state.

  * Devices resuming from D3cold always go through a power-on reset. Some
    device context can also be preserved using Vaux power.

  * Some systems hide D3cold resume paths from drivers. For example, on PCs
    the resume path for suspend-to-disk often runs BIOS powerup code, which
    will sometimes re-initialize the device.

To handle resets during D3 to D0 transitions, it may be convenient to share
device initialization code between probe() and resume(). Device parameters
can also be saved before the driver suspends into D3, avoiding re-probe.

If the device supports the PCI PM Spec, it can use this to physically transition
the device to D0:

    pci_set_power_state(dev, PCI_D0);

Note that if the entire system is transitioning out of a global sleep state, all
devices will be placed in the D0 state, so this is not necessary. However, in
the event that the device is placed in the D3 state during normal operation,
this call is necessary. It is impossible to determine which of the two events is
taking place in the driver, so it is always a good idea to make that call.

The driver should take note of the state that it is resuming from in order to
ensure correct (and speedy) operation.

The driver should update the current_state field in its pci_dev structure in
this function, except for PM-capable devices when pci_set_power_state is used.
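
The advice above about sharing initialization code between probe() and
resume() can be sketched as a small user-space model. This is not kernel
code: struct model_dev, the MODEL_* states and all model_*() functions are
hypothetical names invented for the illustration. The model also assumes,
conservatively, that device context is lost in both D3 states, which the
list above shows is not true for every device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the device power states described in this document. */
enum model_dstate { MODEL_D0, MODEL_D1, MODEL_D2, MODEL_D3HOT, MODEL_D3COLD };

struct model_dev {
    enum model_dstate state;  /* tracked by the driver, like current_state */
    bool hw_ready;            /* true while the device holds programmed context */
    unsigned int ring_size;   /* example parameter saved across suspend */
    unsigned int init_count;  /* how many times the hardware was initialized */
};

/* Initialization shared by probe and resume: programs the saved parameters. */
static void model_hw_init(struct model_dev *dev)
{
    assert(dev != NULL);
    dev->hw_ready = true;
    dev->init_count++;
}

/* Probe: discover the parameters once, then run the shared initialization. */
static void model_probe(struct model_dev *dev, unsigned int ring_size)
{
    dev->state = MODEL_D0;
    dev->ring_size = ring_size;
    dev->init_count = 0;
    model_hw_init(dev);
}

/* Suspend: assumption for this model only, any D3 state discards context. */
static void model_suspend(struct model_dev *dev, enum model_dstate target)
{
    if (target >= MODEL_D3HOT)
        dev->hw_ready = false;
    dev->state = target;
}

/* Resume: always return to D0, re-initializing only when leaving D3. */
static void model_resume(struct model_dev *dev)
{
    enum model_dstate prev = dev->state;

    dev->state = MODEL_D0;
    if (prev >= MODEL_D3HOT)
        model_hw_init(dev);   /* reuses ring_size saved at probe time */
}
```

Because model_resume() records the state it is resuming from, it can skip the
re-initialization after D1 or D2 and repeat it only after D3, without probing
the device again.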



A reference implementation
--------------------------
.suspend()
{
    /* driver specific operations */

    /* Disable IRQ */
    free_irq();
    /* If using MSI */
    pci_disable_msi();

    pci_save_state();
    pci_enable_wake();
    /* Disable IO/bus master/irq router */
    pci_disable_device();
    pci_set_power_state(pci_choose_state());
}

.resume()
{
    pci_set_power_state(PCI_D0);
    pci_restore_state();
    /* the device's IRQ may have changed; the driver should take care */
    pci_enable_device();
    pci_set_master();

    /* if using MSI, the device's vector may have changed */
    pci_enable_msi();

    request_irq();
    /* driver specific operations */
}

This is a typical implementation. Drivers can slightly change the order
of the operations, omit some of them or add more driver specific operations,
but on the whole drivers should do something like this.

5. Resources
~~~~~~~~~~~~

PCI Local Bus Specification
PCI Bus Power Management Interface Specification

  http://www.pcisig.com