
Documentation / devicetree / usage-model.txt





Based on kernel version 3.13. Page generated on 2014-01-20 22:02 EST.

Linux and the Device Tree
-------------------------
The Linux usage model for device tree data

Author: Grant Likely <grant.likely@secretlab.ca>

This article describes how Linux uses the device tree.  An overview of
the device tree data format can be found on the device tree usage page
at devicetree.org[1].

[1] http://devicetree.org/Device_Tree_Usage

The "Open Firmware Device Tree", or simply Device Tree (DT), is a data
structure and language for describing hardware.  More specifically, it
is a description of hardware that is readable by an operating system
so that the operating system doesn't need to hard code details of the
machine.

Structurally, the DT is a tree, or acyclic graph with named nodes, and
nodes may have an arbitrary number of named properties encapsulating
arbitrary data.  A mechanism also exists to create arbitrary
links from one node to another outside of the natural tree structure.
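
To make the structure concrete, here is a minimal userspace sketch of
such a tree in C.  The struct and function names (dt_prop, dt_node,
dt_find_prop, dt_find_by_phandle) are made up for illustration and are
not the kernel's own types; the numeric 'phandle' field stands in for
the link mechanism mentioned above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a named property holding arbitrary bytes. */
struct dt_prop {
	const char *name;
	const void *value;
	size_t len;
};

/* Hypothetical model of a named node; not the kernel's struct device_node. */
struct dt_node {
	const char *name;
	uint32_t phandle;		/* 0 means "no phandle assigned" */
	const struct dt_prop *props;
	size_t nprops;
	const struct dt_node *children;
	size_t nchildren;
};

/* Look up a property by name; returns NULL if the node does not have it. */
static const struct dt_prop *dt_find_prop(const struct dt_node *np,
					  const char *name)
{
	for (size_t i = 0; i < np->nprops; i++)
		if (strcmp(np->props[i].name, name) == 0)
			return &np->props[i];
	return NULL;
}

/* Depth-first search for the node that carries a given phandle. */
static const struct dt_node *dt_find_by_phandle(const struct dt_node *np,
						uint32_t ph)
{
	if (ph != 0 && np->phandle == ph)
		return np;
	for (size_t i = 0; i < np->nchildren; i++) {
		const struct dt_node *hit =
			dt_find_by_phandle(&np->children[i], ph);

		if (hit)
			return hit;
	}
	return NULL;
}
```

A property whose value is a phandle number can then be resolved with
dt_find_by_phandle() regardless of where the target sits in the tree,
which is what makes the links independent of the parent/child layout.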

Conceptually, a common set of usage conventions, called 'bindings',
is defined for how data should appear in the tree to describe typical
hardware characteristics including data busses, interrupt lines, GPIO
connections, and peripheral devices.

As much as possible, hardware is described using existing bindings to
maximize use of existing support code, but since property and node
names are simply text strings, it is easy to extend existing bindings
or create new ones by defining new nodes and properties.  Be wary,
however, of creating a new binding without first doing some homework
about what already exists.  There are currently two different,
incompatible, bindings for i2c busses that came about because the new
binding was created without first investigating how i2c devices were
already being enumerated in existing systems.

1. History
----------
The DT was originally created by Open Firmware as part of the
communication method for passing data from Open Firmware to a client
program (such as an operating system).  An operating system used the
Device Tree to discover the topology of the hardware at runtime, and
thereby support a majority of available hardware without hard coded
information (assuming drivers were available for all devices).

Since Open Firmware is commonly used on PowerPC and SPARC platforms,
the Linux support for those architectures has for a long time used the
Device Tree.

In 2005, when PowerPC Linux began a major cleanup and started to merge
32-bit and 64-bit support, the decision was made to require DT support
on all powerpc platforms, regardless of whether or not they used Open
Firmware.  To do this, a DT representation called the Flattened Device
Tree (FDT) was created which could be passed to the kernel as a binary
blob without requiring a real Open Firmware implementation.  U-Boot,
kexec, and other bootloaders were modified to support both passing a
Device Tree Binary (dtb) and modifying a dtb at boot time.  DT was
also added to the PowerPC boot wrapper (arch/powerpc/boot/*) so that
a dtb could be wrapped up with the kernel image to support booting
existing non-DT aware firmware.

Some time later, FDT infrastructure was generalized to be usable by
all architectures.  At the time of this writing, 6 mainlined
architectures (arm, microblaze, mips, powerpc, sparc, and x86) and 1
out of mainline (nios) have some level of DT support.

2. Data Model
-------------
If you haven't already read the Device Tree Usage[1] page,
then go read it now.  It's okay, I'll wait....

2.1 High Level View
-------------------
The most important thing to understand is that the DT is simply a data
structure that describes the hardware.  There is nothing magical about
it, and it doesn't magically make all hardware configuration problems
go away.  What it does do is provide a language for decoupling the
hardware configuration from the board and device driver support in the
Linux kernel (or any other operating system for that matter).  Using
it allows board and device support to become data driven; to make
setup decisions based on data passed into the kernel instead of on
per-machine hard coded selections.

Ideally, data driven platform setup should result in less code
duplication and make it easier to support a wide range of hardware
with a single kernel image.

Linux uses DT data for three major purposes:
1) platform identification,
2) runtime configuration, and
3) device population.

2.2 Platform Identification
---------------------------
First and foremost, the kernel will use data in the DT to identify the
specific machine.  In a perfect world, the specific platform shouldn't
matter to the kernel because all platform details would be described
perfectly by the device tree in a consistent and reliable manner.
Hardware is not perfect though, and so the kernel must identify the
machine during early boot so that it has the opportunity to run
machine-specific fixups.

In the majority of cases, the machine identity is irrelevant, and the
kernel will instead select setup code based on the machine's core
CPU or SoC.  On ARM for example, setup_arch() in
arch/arm/kernel/setup.c will call setup_machine_fdt() in
arch/arm/kernel/devtree.c which searches through the machine_desc
table and selects the machine_desc which best matches the device tree
data.  It determines the best match by looking at the 'compatible'
property in the root device tree node, and comparing it with the
dt_compat list in struct machine_desc (which is defined in
arch/arm/include/asm/mach/arch.h if you're curious).

The 'compatible' property contains a sorted list of strings starting
with the exact name of the machine, followed by an optional list of
boards it is compatible with sorted from most compatible to least.  For
example, the root compatible properties for the TI BeagleBoard and its
successor, the BeagleBoard xM board might look like, respectively:

	compatible = "ti,omap3-beagleboard", "ti,omap3450", "ti,omap3";
	compatible = "ti,omap3-beagleboard-xm", "ti,omap3450", "ti,omap3";

Here "ti,omap3-beagleboard-xm" specifies the exact model, and the
remaining entries claim that the board is compatible with the OMAP 3450
SoC, and with the omap3 family of SoCs in general.  You'll notice that
the list is sorted from most specific (exact board) to least specific
(SoC family).

Astute readers might point out that the Beagle xM could also claim
compatibility with the original Beagle board.  However, one should be
cautioned about doing so at the board level since there is typically a
high level of change from one board to another, even within the same
product line, and it is hard to nail down exactly what is meant when one
board claims to be compatible with another.  For the top level, it is
better to err on the side of caution and not claim one board is
compatible with another.  The notable exception would be when one
board is a carrier for another, such as a CPU module attached to a
carrier board.

One more note on compatible values.  Any string used in a compatible
property must be documented as to what it indicates.  Add
documentation for compatible strings in Documentation/devicetree/bindings.

Again on ARM, for each machine_desc, the kernel looks to see if
any of the dt_compat list entries appear in the compatible property.
If one does, then that machine_desc is a candidate for driving the
machine.  After searching the entire table of machine_descs,
setup_machine_fdt() returns the 'most compatible' machine_desc based
on which entry in the compatible property each machine_desc matches
against.  If no matching machine_desc is found, then it returns NULL.
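
The selection rule can be sketched in plain userspace C.  This is only
a model of the idea, not the kernel's implementation: struct mdesc,
compat_score() and select_mdesc() are hypothetical names, and the
score is simply the position in the root 'compatible' list at which a
descriptor first matches (lower means more specific, 0 means no match).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct machine_desc. */
struct mdesc {
	const char *name;
	const char *const *dt_compat;	/* NULL-terminated list */
};

/*
 * Return the position (1 = most specific) of the first root
 * 'compatible' entry that also appears in dt_compat, or 0 if
 * nothing matches.
 */
static size_t compat_score(const char *const *root_compat,
			   const char *const *dt_compat)
{
	for (size_t i = 0; root_compat[i]; i++)
		for (size_t j = 0; dt_compat[j]; j++)
			if (strcmp(root_compat[i], dt_compat[j]) == 0)
				return i + 1;
	return 0;
}

/* Pick the descriptor with the lowest non-zero score; NULL if none match. */
static const struct mdesc *select_mdesc(const struct mdesc *table, size_t n,
					const char *const *root_compat)
{
	const struct mdesc *best = NULL;
	size_t best_score = 0;

	for (size_t i = 0; i < n; i++) {
		size_t score = compat_score(root_compat, table[i].dt_compat);

		if (score && (!best || score < best_score)) {
			best = &table[i];
			best_score = score;
		}
	}
	return best;
}
```

With the BeagleBoard xM list shown earlier, a descriptor that only
lists "ti,omap3" scores 3, while one that lists the exact board name
would score 1 and win.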

The reasoning behind this scheme is the observation that in the majority
of cases, a single machine_desc can support a large number of boards
if they all use the same SoC, or same family of SoCs.  However,
invariably there will be some exceptions where a specific board will
require special setup code that is not useful in the generic case.
Special cases could be handled by explicitly checking for the
troublesome board(s) in generic setup code, but doing so very quickly
becomes ugly and/or unmaintainable if it is more than just a couple of
cases.

Instead, the compatible list allows a generic machine_desc to provide
support for a wide common set of boards by specifying "less
compatible" values in the dt_compat list.  In the example above,
generic board support can claim compatibility with "ti,omap3" or
"ti,omap3450".  If a bug was discovered on the original beagleboard
that required special workaround code during early boot, then a new
machine_desc could be added which implements the workarounds and only
matches on "ti,omap3-beagleboard".

PowerPC uses a slightly different scheme where it calls the .probe()
hook from each machine_desc, and the first one returning TRUE is used.
However, this approach does not take into account the priority of the
compatible list, and probably should be avoided for new architecture
support.
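
To see why the first-match approach loses the priority information,
here is a small userspace sketch.  Everything in it is hypothetical
(struct probe_desc, the two probe functions, first_probe()), and it
reuses the BeagleBoard strings purely so the outcome can be compared
with the ARM scheme; it is not PowerPC code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor with a probe hook, in the spirit of .probe(). */
struct probe_desc {
	const char *name;
	bool (*probe)(const char *const *root_compat);
};

/* True if 'want' appears anywhere in the NULL-terminated list. */
static bool has_compat(const char *const *root_compat, const char *want)
{
	for (size_t i = 0; root_compat[i]; i++)
		if (strcmp(root_compat[i], want) == 0)
			return true;
	return false;
}

static bool probe_generic_omap3(const char *const *root_compat)
{
	return has_compat(root_compat, "ti,omap3");
}

static bool probe_beagle(const char *const *root_compat)
{
	return has_compat(root_compat, "ti,omap3-beagleboard");
}

/* Return the first descriptor whose probe hook accepts the machine. */
static const struct probe_desc *first_probe(const struct probe_desc *table,
					    size_t n,
					    const char *const *root_compat)
{
	for (size_t i = 0; i < n; i++)
		if (table[i].probe(root_compat))
			return &table[i];
	return NULL;
}
```

If the generic descriptor happens to come first in the table, it is
chosen for the original BeagleBoard even though a board-specific
descriptor also matches; the result depends on table order rather than
on how specific the match is.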

2.3 Runtime configuration
-------------------------
In most cases, a DT will be the sole method of communicating data from
firmware to the kernel, so it also gets used to pass in runtime and
configuration data like the kernel parameters string and the location
of an initrd image.

Most of this data is contained in the /chosen node, and when booting
Linux it will look something like this:

	chosen {
		bootargs = "console=ttyS0,115200 loglevel=8";
		linux,initrd-start = <0xc8000000>;
		linux,initrd-end = <0xc8200000>;
	};

The bootargs property contains the kernel arguments, and the
linux,initrd-* properties define the address and size of an initrd blob.
Note that linux,initrd-end is the first address after the initrd image,
so this doesn't match the usual semantics of struct resource, where the
end address is inclusive.  The chosen node may also optionally contain
an arbitrary number of additional properties for platform-specific
configuration data.
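
The off-by-one between the two conventions is easy to get wrong, so
here is a tiny sketch of the arithmetic.  The helper names are made up
for illustration; the only assumption is the one stated above, that
the end value from /chosen is the first address after the image.

```c
#include <stdint.h>

/* Size of the initrd when 'end' is exclusive (first address after it). */
static uint64_t initrd_size(uint64_t start, uint64_t end)
{
	return end - start;
}

/* Last byte occupied by the image, i.e. an inclusive end address. */
static uint64_t initrd_last_byte(uint64_t end)
{
	return end - 1;
}
```

For the example node above, initrd_size(0xc8000000, 0xc8200000) is
0x200000 (2 MiB) and the last byte of the image sits at 0xc81fffff.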

During early boot, the architecture setup code calls of_scan_flat_dt()
several times with different helper callbacks to parse device tree
data before paging is set up.  The of_scan_flat_dt() code scans through
the device tree and uses the helpers to extract information required
during early boot.  Typically the early_init_dt_scan_chosen() helper
is used to parse the chosen node including kernel parameters,
early_init_dt_scan_root() to initialize the DT address space model,
and early_init_dt_scan_memory() to determine the size and
location of usable RAM.
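
The scan-with-callback pattern can be modelled in a few lines of
userspace C.  This is a simplified sketch, not the kernel code: the
flat tree is reduced to an array of hypothetical struct flat_node
entries, and scan_flat() and scan_chosen() are illustrative names that
only mimic the shape of of_scan_flat_dt() and its helpers.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for one node in the flattened tree. */
struct flat_node {
	const char *uname;	/* unit name, e.g. "chosen" or "memory" */
	int depth;		/* 0 = root, 1 = child of root, ... */
	const char *bootargs;	/* toy stand-in for a property; may be NULL */
};

typedef int (*scan_cb)(const struct flat_node *node, void *data);

/* Visit nodes in order until a callback returns non-zero; return that value. */
static int scan_flat(const struct flat_node *nodes, size_t n,
		     scan_cb cb, void *data)
{
	int rc = 0;

	for (size_t i = 0; i < n && !rc; i++)
		rc = cb(&nodes[i], data);
	return rc;
}

/* Helper in the spirit of early_init_dt_scan_chosen(): copy out bootargs. */
static int scan_chosen(const struct flat_node *node, void *data)
{
	const char **out = data;

	if (node->depth != 1 || strcmp(node->uname, "chosen") != 0)
		return 0;	/* not the node we want, keep scanning */
	*out = node->bootargs;
	return 1;		/* found it, stop the scan */
}
```

Each helper looks only at the nodes it cares about and returns
non-zero to end the walk, which is why the same scanning routine can
be reused for the chosen node, the root node, and the memory nodes.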

On ARM, the function setup_machine_fdt() is responsible for early
scanning of the device tree after selecting the correct machine_desc
that supports the board.

2.4 Device population
---------------------
After the board has been identified, and after the early configuration data
has been parsed, then kernel initialization can proceed in the normal
way.  At some point in this process, unflatten_device_tree() is called
to convert the data into a more efficient runtime representation.
This is also when machine-specific setup hooks will get called, like
the machine_desc .init_early(), .init_irq() and .init_machine() hooks
on ARM.  The remainder of this section uses examples from the ARM
implementation, but all architectures will do pretty much the same
thing when using a DT.

As can be guessed by the names, .init_early() is used for any
machine-specific setup that needs to be executed early in the boot
process, and .init_irq() is used to set up interrupt handling.  Using a DT
doesn't materially change the behaviour of either of these functions.
If a DT is provided, then both .init_early() and .init_irq() are able
to call any of the DT query functions (of_* in include/linux/of*.h) to
get additional data about the platform.

The most interesting hook in the DT context is .init_machine() which
is primarily responsible for populating the Linux device model with
data about the platform.  Historically this has been implemented on
embedded platforms by defining a set of static clock structures,
platform_devices, and other data in the board support .c file, and
registering it en-masse in .init_machine().  When DT is used, then
instead of hard coding static devices for each platform, the list of
devices can be obtained by parsing the DT, and allocating device
structures dynamically.

The simplest case is when .init_machine() is only responsible for
registering a block of platform_devices.  A platform_device is a concept
used by Linux for memory or I/O mapped devices which cannot be detected
by hardware, and for 'composite' or 'virtual' devices (more on those
later).  While there is no 'platform device' terminology for the DT,
platform devices roughly correspond to device nodes at the root of the
tree and children of simple memory mapped bus nodes.

About now is a good time to lay out an example.  Here is part of the
device tree for the NVIDIA Tegra board.

/{
	compatible = "nvidia,harmony", "nvidia,tegra20";
	#address-cells = <1>;
	#size-cells = <1>;
	interrupt-parent = <&intc>;

	chosen { };
	aliases { };

	memory {
		device_type = "memory";
		reg = <0x00000000 0x40000000>;
	};

	soc {
		compatible = "nvidia,tegra20-soc", "simple-bus";
		#address-cells = <1>;
		#size-cells = <1>;
		ranges;

		intc: interrupt-controller@50041000 {
			compatible = "nvidia,tegra20-gic";
			interrupt-controller;
			#interrupt-cells = <1>;
			reg = <0x50041000 0x1000>, < 0x50040100 0x0100 >;
		};

		serial@70006300 {
			compatible = "nvidia,tegra20-uart";
			reg = <0x70006300 0x100>;
			interrupts = <122>;
		};

		i2s1: i2s@70002800 {
			compatible = "nvidia,tegra20-i2s";
			reg = <0x70002800 0x100>;
			interrupts = <77>;
			codec = <&wm8903>;
		};

		i2c@7000c000 {
			compatible = "nvidia,tegra20-i2c";
			#address-cells = <1>;
			#size-cells = <0>;
			reg = <0x7000c000 0x100>;
			interrupts = <70>;

			wm8903: codec@1a {
				compatible = "wlf,wm8903";
				reg = <0x1a>;
				interrupts = <347>;
			};
		};
	};

	sound {
		compatible = "nvidia,harmony-sound";
		i2s-controller = <&i2s1>;
		i2s-codec = <&wm8903>;
	};
};

At .init_machine() time, Tegra board support code will need to look at
this DT and decide which nodes to create platform_devices for.
However, looking at the tree, it is not immediately obvious what kind
of device each node represents, or even if a node represents a device
at all.  The /chosen, /aliases, and /memory nodes are informational
nodes that don't describe devices (although arguably memory could be
considered a device).  The children of the /soc node are memory mapped
devices, but the codec@1a is an i2c device, and the sound node
represents not a device, but rather how other devices are connected
together to create the audio subsystem.  I know what each device is
because I'm familiar with the board design, but how does the kernel
know what to do with each node?

The trick is that the kernel starts at the root of the tree and looks
for nodes that have a 'compatible' property.  First, it is generally
assumed that any node with a 'compatible' property represents a device
of some kind, and second, it can be assumed that any node at the root
of the tree is either directly attached to the processor bus, or is a
miscellaneous system device that cannot be described any other way.
For each of these nodes, Linux allocates and registers a
platform_device, which in turn may get bound to a platform_driver.
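
As a rough model of that rule, the sketch below walks only the direct
children of the root and picks out the ones that carry a 'compatible'
property.  struct node and root_devices() are hypothetical names for
this illustration, and a node's compatible list is reduced to a single
string for brevity; real registration obviously does far more.

```c
#include <stddef.h>

/* Toy node: only the fields needed for this illustration. */
struct node {
	const char *name;
	const char *compatible;		/* first compatible string, or NULL */
	const struct node *children;
	size_t nchildren;
};

/*
 * Collect the names of root children that would get a platform_device:
 * those with a 'compatible' property.  Returns how many were stored.
 */
static size_t root_devices(const struct node *root,
			   const char **out, size_t max)
{
	size_t count = 0;

	for (size_t i = 0; i < root->nchildren && count < max; i++)
		if (root->children[i].compatible)
			out[count++] = root->children[i].name;
	return count;
}
```

Applied to the Tegra tree above, the chosen, aliases and memory nodes
are skipped because they have no 'compatible' property, and only the
soc and sound nodes are picked up.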

Why is using a platform_device for these nodes a safe assumption?
Well, for the way that Linux models devices, just about all bus_types
assume that their devices are children of a bus controller.  For
example, each i2c_client is a child of an i2c_master.  Each spi_device
is a child of an SPI bus.  Similarly for USB, PCI, MDIO, etc.  The
same hierarchy is also found in the DT, where I2C device nodes only
ever appear as children of an I2C bus node.  Ditto for SPI, MDIO, USB,
etc.  The only devices which do not require a specific type of parent
device are platform_devices (and amba_devices, but more on that
later), which will happily live at the base of the Linux /sys/devices
tree.  Therefore, if a DT node is at the root of the tree, then it
is probably best registered as a platform_device.

Linux board support code calls of_platform_populate(NULL, NULL, NULL, NULL)
to kick off discovery of devices at the root of the tree.  The
parameters are all NULL because when starting from the root of the
tree, there is no need to provide a starting node (the first NULL), a
parent struct device (the last NULL), and we're not using a match
table (yet).  For a board that only needs to register devices,
.init_machine() can be completely empty except for the
of_platform_populate() call.

In the Tegra example, this accounts for the /soc and /sound nodes, but
what about the children of the SoC node?  Shouldn't they be registered
as platform devices too?  For Linux DT support, the generic behaviour
is for child devices to be registered by the parent's device driver at
driver .probe() time.  So, an i2c bus device driver will register an
i2c_client for each child node, an SPI bus driver will register
its spi_device children, and similarly for other bus_types.
According to that model, a driver could be written that binds to the
SoC node and simply registers platform_devices for each of its
children.  The board support code would allocate and register an SoC
device, a (theoretical) SoC device driver could bind to the SoC device,
and register platform_devices for /soc/interrupt-controller, /soc/serial,
/soc/i2s, and /soc/i2c in its .probe() hook.  Easy, right?

Actually, it turns out that registering children of some
platform_devices as more platform_devices is a common pattern, and the
device tree support code reflects that and makes the above example
simpler.  The second argument to of_platform_populate() is an
of_device_id table, and any node that matches an entry in that table
will also get its child nodes registered.  In the Tegra case, the code
can look something like this:

static void __init harmony_init_machine(void)
{
	/* ... */
	of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL);
}

"simple-bus" is defined in the ePAPR 1.0 specification as a 'compatible'
value meaning a simple memory mapped bus, so the of_platform_populate()
code could be written to just assume simple-bus compatible nodes will
always be traversed.  However, we pass the match table in as an argument
so that board support code can always override the default behaviour.
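
The effect of the match table can be modelled with a short recursive
walk.  As before this is a userspace sketch under stated assumptions:
struct node, node_matches() and populate() are hypothetical names, the
match table is reduced to a NULL-terminated list of compatible
strings, and "registering" a device is reduced to counting it.

```c
#include <stddef.h>
#include <string.h>

/* Toy node with a NULL-terminated compatible list (or NULL if absent). */
struct node {
	const char *name;
	const char *const *compatible;
	const struct node *children;
	size_t nchildren;
};

/* True if any of the node's compatible strings appears in 'matches'. */
static int node_matches(const struct node *np, const char *const *matches)
{
	if (!np->compatible || !matches)
		return 0;
	for (size_t i = 0; np->compatible[i]; i++)
		for (size_t j = 0; matches[j]; j++)
			if (strcmp(np->compatible[i], matches[j]) == 0)
				return 1;
	return 0;
}

/*
 * Count the devices created below 'parent': every child with a
 * 'compatible' property, descending only into children that match
 * the bus table.
 */
static size_t populate(const struct node *parent, const char *const *matches)
{
	size_t count = 0;

	for (size_t i = 0; i < parent->nchildren; i++) {
		const struct node *np = &parent->children[i];

		if (!np->compatible)
			continue;
		count++;
		if (node_matches(np, matches))
			count += populate(np, matches);
	}
	return count;
}
```

With a table containing "simple-bus", the walk descends into the soc
node and creates devices for its children, but it stops at the i2c
controller: the codec below it is left for the i2c bus driver to
register.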

[Need to add discussion of adding i2c/spi/etc child devices]

Appendix A: AMBA devices
------------------------

ARM Primecells are a certain kind of device attached to the ARM AMBA
bus which include some support for hardware detection and power
management.  In Linux, struct amba_device and the amba_bus_type are
used to represent Primecell devices.  However, the fiddly bit is that
not all devices on an AMBA bus are Primecells, and for Linux it is
typical for both amba_device and platform_device instances to be
siblings of the same bus segment.

When using the DT, this creates problems for of_platform_populate()
because it must decide whether to register each node as either a
platform_device or an amba_device.  This unfortunately complicates the
device creation model a little bit, but the solution turns out not to
be too invasive.  If a node is compatible with "arm,primecell", then
of_platform_populate() will register it as an amba_device instead of a
platform_device.
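
The decision itself is a simple membership test on the node's
compatible list, sketched below.  The enum and function names are
hypothetical, and the Primecell marker string is passed in as a
parameter so the sketch makes no assumption about its exact spelling.

```c
#include <stddef.h>
#include <string.h>

enum dev_kind { DEV_PLATFORM, DEV_AMBA };

/*
 * Decide which kind of device to create for a node, given its
 * NULL-terminated compatible list (may be NULL) and the compatible
 * string that marks a Primecell.
 */
static enum dev_kind classify(const char *const *compatible,
			      const char *primecell_compat)
{
	for (size_t i = 0; compatible && compatible[i]; i++)
		if (strcmp(compatible[i], primecell_compat) == 0)
			return DEV_AMBA;
	return DEV_PLATFORM;
}
```

Because the test looks at every entry in the list, a Primecell node
can still lead with its own specific compatible string and carry the
Primecell marker as a later, more generic entry.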

Information is copyright its respective author. All material is available from the Linux Kernel Source distributed under a GPL License. This page is provided as a free service by mjmwired.net.