Based on kernel version 4.16.1. Page generated on 2018-04-09 11:53 EST.
ORC unwinder
============

Overview
--------

The kernel CONFIG_UNWINDER_ORC option enables the ORC unwinder, which is
similar in concept to a DWARF unwinder. The difference is that the
format of the ORC data is much simpler than DWARF, which in turn allows
the ORC unwinder to be much simpler and faster.

The ORC data consists of unwind tables which are generated by objtool.
They contain out-of-band data which is used by the in-kernel ORC
unwinder. Objtool generates the ORC data by first doing compile-time
stack metadata validation (CONFIG_STACK_VALIDATION). After analyzing
all the code paths of a .o file, it determines information about the
stack state at each instruction address in the file and outputs that
information to the .orc_unwind and .orc_unwind_ip sections.

The per-object ORC sections are combined at link time and are sorted and
post-processed at boot time. The unwinder uses the resulting data to
correlate instruction addresses with their stack states at run time.


ORC vs frame pointers
---------------------

With frame pointers enabled, GCC adds instrumentation code to every
function in the kernel. The kernel's .text size increases by about
3.2%, resulting in a broad kernel-wide slowdown. Measurements by Mel
Gorman [1] have shown a slowdown of 5-10% for some workloads.

In contrast, the ORC unwinder has no effect on text size or runtime
performance, because the debuginfo is out of band. So if you disable
frame pointers and enable the ORC unwinder, you get a nice performance
improvement across the board, and still have reliable stack traces.

Ingo Molnar says:

  "Note that it's not just a performance improvement, but also an
  instruction cache locality improvement: 3.2% .text savings almost
  directly transform into a similarly sized reduction in cache
  footprint.
  That can transform to even higher speedups for workloads whose
  cache locality is borderline."

Another benefit of ORC compared to frame pointers is that it can
reliably unwind across interrupts and exceptions. Frame pointer based
unwinds can sometimes skip the caller of the interrupted function, if it
was a leaf function or if the interrupt hit before the frame pointer was
saved.

The main disadvantage of the ORC unwinder compared to frame pointers is
that it needs more memory to store the ORC unwind tables: roughly 2-4MB
depending on the kernel config.


ORC vs DWARF
------------

ORC debuginfo's advantage over DWARF itself is that it's much simpler.
It gets rid of the complex DWARF CFI state machine and also gets rid of
the tracking of unnecessary registers. This allows the unwinder to be
much simpler, meaning fewer bugs, which is especially important for
mission critical oops code.

The simpler debuginfo format also enables the unwinder to be much faster
than DWARF, which is important for perf and lockdep. In a basic
performance test by Jiri Slaby [2], the ORC unwinder was about 20x
faster than an out-of-tree DWARF unwinder. (Note: That measurement was
taken before some performance tweaks were added, which doubled
performance, so the speedup over DWARF may be closer to 40x.)

The ORC data format does have a few downsides compared to DWARF. ORC
unwind tables take up ~50% more RAM (+1.3MB on an x86 defconfig kernel)
than DWARF-based eh_frame tables.

Another potential downside is that, as GCC evolves, it's conceivable
that the ORC data may end up being *too* simple to describe the state of
the stack for certain optimizations. But IMO this is unlikely because
GCC saves the frame pointer for any unusual stack adjustments it does,
so I suspect we'll really only ever need to keep track of the stack
pointer and the frame pointer between call frames.
But even if we do end up having to track all the registers DWARF
tracks, at least we will still be able to control the format, e.g. no
complex state machines.


ORC unwind table generation
---------------------------

The ORC data is generated by objtool. With the existing compile-time
stack metadata validation feature, objtool already follows all code
paths, and so it already has all the information it needs to be able to
generate ORC data from scratch. So it's an easy step to go from stack
validation to ORC data generation.

It should be possible to instead generate the ORC data with a simple
tool which converts DWARF to ORC data. However, such a solution would
be incomplete due to the kernel's extensive use of asm, inline asm, and
special sections like exception tables.

That could be rectified by manually annotating those special code paths
using GNU assembler .cfi annotations in .S files, and homegrown
annotations for inline asm in .c files. But asm annotations were tried
in the past and were found to be unmaintainable. They were often
incorrect/incomplete and made the code harder to read and keep updated.
And based on looking at glibc code, annotating inline asm in .c files
might be even worse.

Objtool still needs a few annotations, but only in code which does
unusual things to the stack like entry code. And even then, far fewer
annotations are needed than what DWARF would need, so they're much more
maintainable than DWARF CFI annotations.

So the advantages of using objtool to generate ORC data are that it
gives more accurate debuginfo, with very few annotations. It also
insulates the kernel from toolchain bugs which can be very painful to
deal with in the kernel since we often have to work around issues in
older versions of the toolchain for years.
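
As a rough illustration of the kind of per-address stack state such
generated data can carry, the sketch below models one unwind record and
uses it to step from a function to its caller. It is not the kernel's
struct orc_entry: the type names, the fields, the toy_stack helper and
the x86-64-style layout (return address stored in the word just below
the caller's stack pointer) are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified unwind record (NOT the kernel's struct orc_entry). */
enum base_reg { BASE_SP, BASE_BP };

struct sketch_entry {
	enum base_reg cfa_base; /* register the caller's SP is derived from */
	int16_t cfa_offset;     /* caller's SP = base register + cfa_offset */
	int16_t bp_offset;      /* saved BP is at caller's SP + bp_offset;
				 * 0 means BP was not saved */
};

struct regs {
	uint64_t sp, bp, ip;
};

/* Toy stand-in for stack memory: byte address 0 maps to words[0]. */
struct toy_stack {
	const uint64_t *words;
	size_t nwords;
};

uint64_t read_word(const struct toy_stack *stk, uint64_t addr)
{
	assert(addr % 8 == 0 && addr / 8 < stk->nwords);
	return stk->words[addr / 8];
}

/* Rewrite *r in place so that it describes the calling function. */
void unwind_one_frame(const struct toy_stack *stk,
		      const struct sketch_entry *e, struct regs *r)
{
	uint64_t base = (e->cfa_base == BASE_SP) ? r->sp : r->bp;
	uint64_t caller_sp = base + (uint64_t)(int64_t)e->cfa_offset;

	/* Assumed layout: return address is the word below the caller's SP. */
	r->ip = read_word(stk, caller_sp - 8);
	if (e->bp_offset != 0)
		r->bp = read_word(stk,
				  caller_sp + (uint64_t)(int64_t)e->bp_offset);
	r->sp = caller_sp;
}
```

Under these assumptions, a function that has only executed "push %rbp"
so far would be described by the record { BASE_SP, 16, -16 }: the
caller's stack pointer is 16 bytes above the current one, and the saved
frame pointer sits 16 bytes below that.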

The downside is that the unwinder now becomes dependent on objtool's
ability to reverse engineer GCC code flow. If GCC optimizations become
too complicated for objtool to follow, the ORC data generation might
stop working or become incomplete. (It's worth noting that livepatch
already has such a dependency on objtool's ability to follow GCC code
flow.)

If newer versions of GCC come up with some optimizations which break
objtool, we may need to revisit the current implementation. Some
possible solutions would be asking GCC to make the optimizations more
palatable, or having objtool use DWARF as an additional input, or
creating a GCC plugin to assist objtool with its analysis. But for now,
objtool follows GCC code quite well.


Unwinder implementation details
-------------------------------

Objtool generates the ORC data by integrating with the compile-time
stack metadata validation feature, which is described in detail in
tools/objtool/Documentation/stack-validation.txt. After analyzing all
the code paths of a .o file, it creates an array of orc_entry structs,
and a parallel array of instruction addresses associated with those
structs, and writes them to the .orc_unwind and .orc_unwind_ip sections
respectively.

The ORC data is split into the two arrays for performance reasons, to
make the searchable part of the data (.orc_unwind_ip) more compact. The
arrays are sorted in parallel at boot time.

Performance is further improved by the use of a fast lookup table which
is created at runtime. The fast lookup table associates a given address
with a range of indices for the .orc_unwind table, so that only a small
subset of the table needs to be searched.


Etymology
---------

Orcs, fearsome creatures of medieval folklore, are the Dwarves' natural
enemies.
Similarly, the ORC unwinder was created in opposition to the complexity
and slowness of DWARF.

"Although Orcs rarely consider multiple solutions to a problem, they do
excel at getting things done because they are creatures of action, not
thought." [3] Similarly, unlike the esoteric DWARF unwinder, the
veracious ORC unwinder wastes no time or siliconic effort decoding
variable-length zero-extended unsigned-integer byte-coded
state-machine-based debug information entries.

Similar to how Orcs frequently unravel the well-intentioned plans of
their adversaries, the ORC unwinder frequently unravels stacks with
brutal, unyielding efficiency.

ORC stands for Oops Rewind Capability.


[1] https://lkml.kernel.org/r/20170602104048.jkkzssljsompjdwy@suse.de
[2] https://lkml.kernel.org/r/d2ca5435-6386-29b8-db87-7f227c2b713a@suse.cz
[3] http://dustin.wikidot.com/half-orcs-and-orcs
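
The lookup scheme described under "Unwinder implementation details"
(two parallel arrays sorted together, plus a fast lookup table that
narrows the search to a small index range) can be sketched in user-space
C as follows. This is only an illustration, not the kernel's code: the
function names, the sketch_entry payload, the BLOCK_SIZE granularity and
the requirement that the first sorted address is at or below the start
of the covered range are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 256u /* hypothetical address granularity of the fast table */

struct sketch_entry {
	int16_t sp_offset; /* placeholder payload for one unwind record */
};

/* Sort the address array and keep the entry array in step (insertion sort). */
void sort_parallel(uint64_t *ips, struct sketch_entry *ents, size_t n)
{
	for (size_t i = 1; i < n; i++) {
		uint64_t ip = ips[i];
		struct sketch_entry e = ents[i];
		size_t j = i;

		while (j > 0 && ips[j - 1] > ip) {
			ips[j] = ips[j - 1];
			ents[j] = ents[j - 1];
			j--;
		}
		ips[j] = ip;
		ents[j] = e;
	}
}

/*
 * fast[b] = index of the last entry whose address is <= the start of
 * block b. Needs nblocks + 1 slots and assumes ips[0] <= start.
 */
void build_fast_table(const uint64_t *ips, size_t n, uint64_t start,
		      size_t nblocks, uint32_t *fast)
{
	size_t i = 0;

	for (size_t b = 0; b <= nblocks; b++) {
		uint64_t addr = start + (uint64_t)b * BLOCK_SIZE;

		while (i + 1 < n && ips[i + 1] <= addr)
			i++;
		fast[b] = (uint32_t)i;
	}
}

/* Return the index of the entry covering ip: the last i with ips[i] <= ip. */
size_t lookup(const uint64_t *ips, const uint32_t *fast, uint64_t start,
	      size_t nblocks, uint64_t ip)
{
	assert(ip >= start && ip < start + (uint64_t)nblocks * BLOCK_SIZE);

	size_t b = (size_t)((ip - start) / BLOCK_SIZE);
	size_t lo = fast[b], hi = fast[b + 1];

	/* Binary search restricted to the small range [lo, hi]. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo + 1) / 2;

		if (ips[mid] <= ip)
			lo = mid;
		else
			hi = mid - 1;
	}
	return lo;
}
```

Keeping the addresses in their own array means the binary search only
touches the compact, searchable data; the index it returns is then used
once to fetch the matching record from the parallel entry array.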