Based on kernel version 3.16. Page generated on 2014-08-06 21:39 EST.

                 MEMORY ATTRIBUTE ALIASING ON IA-64

                           Bjorn Helgaas
                       <bjorn.helgaas@hp.com>
                            May 4, 2006


MEMORY ATTRIBUTES

    Itanium supports several attributes for virtual memory references.
    The attribute is part of the virtual translation, i.e., it is
    contained in the TLB entry.  The ones of most interest to the Linux
    kernel are:

        WB              Write-back (cacheable)
        UC              Uncacheable
        WC              Write-coalescing

    System memory typically uses the WB attribute.  The UC attribute is
    used for memory-mapped I/O devices.  The WC attribute is uncacheable
    like UC is, but writes may be delayed and combined to increase
    performance for things like frame buffers.

    The Itanium architecture requires that we avoid accessing the same
    page with both a cacheable mapping and an uncacheable mapping[1].

    The design of the chipset determines which attributes are supported
    on which regions of the address space.  For example, some chipsets
    support either WB or UC access to main memory, while others support
    only WB access.

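    As a rough illustration of the aliasing rule, the sketch below
    encodes the three attributes and tests whether two mappings of the
    same page would mix cacheable and uncacheable access.  The bit
    values follow the EFI memory descriptor attribute encoding.  The
    helper names attr_is_cacheable() and attrs_would_alias() are made
    up for this example and are not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Attribute bits as used in EFI memory descriptors. */
#define EFI_MEMORY_UC 0x1ULL /* uncacheable */
#define EFI_MEMORY_WC 0x2ULL /* write-coalescing (also uncacheable) */
#define EFI_MEMORY_WB 0x8ULL /* write-back (cacheable) */

/* Hypothetical helper: true if the attribute is the cacheable one. */
static bool attr_is_cacheable(uint64_t attr)
{
	return (attr & EFI_MEMORY_WB) != 0;
}

/*
 * Hypothetical helper: true if mapping one page with attributes 'a'
 * and 'b' would combine a cacheable and an uncacheable mapping.
 * Each argument is expected to be exactly one attribute bit.
 */
static bool attrs_would_alias(uint64_t a, uint64_t b)
{
	return attr_is_cacheable(a) != attr_is_cacheable(b);
}
```

    Under this rule WB paired with UC or WC is a conflict, while UC
    paired with WC is not, since both are uncacheable.
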
MEMORY MAP

    Platform firmware describes the physical memory map and the
    supported attributes for each region.  At boot-time, the kernel uses
    the EFI GetMemoryMap() interface.  ACPI can also describe memory
    devices and the attributes they support, but Linux/ia64 currently
    doesn't use this information.

    The kernel uses the efi_memmap table returned from GetMemoryMap() to
    learn the attributes supported by each region of physical address
    space.  Unfortunately, this table does not completely describe the
    address space because some machines omit some or all of the MMIO
    regions from the map.

    The kernel maintains another table, kern_memmap, which describes the
    memory Linux is actually using and the attribute for each region.
    This contains only system memory; it does not contain MMIO space.

    The kern_memmap table typically contains only a subset of the system
    memory described by the efi_memmap.  Linux/ia64 can't use all memory
    in the system because of constraints imposed by the identity mapping
    scheme.

    The efi_memmap table is preserved unmodified because the original
    boot-time information is required for kexec.

KERNEL IDENTITY MAPPINGS

    Linux/ia64 identity mappings are done with large pages, currently
    either 16MB or 64MB, referred to as "granules."  Cacheable mappings
    are speculative[2], so the processor can read any location in the
    page at any time, independent of the programmer's intentions.  This
    means that to avoid attribute aliasing, Linux can create a cacheable
    identity mapping only when the entire granule supports cacheable
    access.

    Therefore, kern_memmap contains only full granule-sized regions that
    can be referenced safely by an identity mapping.

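    The whole-granule restriction can be pictured as trimming each
    WB-capable range to granule boundaries.  The following is only a
    sketch under stated assumptions: it uses a 16MB granule, treats the
    input as one contiguous WB range, and the name trim_to_granules()
    is hypothetical rather than taken from the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed granule size for this sketch: 16MB (64MB is the other option). */
#define GRANULE_SIZE (16ULL << 20)

/*
 * Hypothetical helper: shrink the WB-capable range [start, end) to the
 * largest sub-range made only of whole granules.  Returns false if no
 * complete granule fits.  Assumes start + GRANULE_SIZE does not
 * overflow 64 bits.
 */
static bool trim_to_granules(uint64_t start, uint64_t end,
			     uint64_t *gstart, uint64_t *gend)
{
	/* Round the start up and the end down to a granule boundary. */
	uint64_t s = (start + GRANULE_SIZE - 1) & ~(GRANULE_SIZE - 1);
	uint64_t e = end & ~(GRANULE_SIZE - 1);

	if (s >= e)
		return false;
	*gstart = s;
	*gend = e;
	return true;
}
```

    For example, a WB range from 1MB to 48MB trims to 16MB-48MB, while
    a WB range from 1MB to 24MB contains no complete granule and would
    contribute nothing.
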
    Uncacheable mappings are not speculative, so the processor will
    generate UC accesses only to locations explicitly referenced by
    software.  This allows UC identity mappings to cover granules that
    are only partially populated, or populated with a combination of UC
    and WB regions.

USER MAPPINGS

    User mappings are typically done with 16K or 64K pages.  The smaller
    page size allows more flexibility because only 16K or 64K has to be
    homogeneous with respect to memory attributes.

POTENTIAL ATTRIBUTE ALIASING CASES

    There are several ways the kernel creates new mappings:

    mmap of /dev/mem

        This uses remap_pfn_range(), which creates user mappings.  These
        mappings may be either WB or UC.  If the region being mapped
        happens to be in kern_memmap, meaning that it may also be mapped
        by a kernel identity mapping, the user mapping must use the same
        attribute as the kernel mapping.

        If the region is not in kern_memmap, the user mapping should use
        an attribute reported as being supported in the EFI memory map.

        Since the EFI memory map does not describe MMIO on some
        machines, this should use an uncacheable mapping as a fallback.

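        The three rules for /dev/mem mmap can be read as a short
        decision chain.  The sketch below is one way to write it down,
        not the kernel's implementation.  devmem_mmap_attr() and its
        parameters are hypothetical, and preferring WB over UC when EFI
        reports both is an assumption of this example.

```c
#include <stdint.h>

#define EFI_MEMORY_UC 0x1ULL /* uncacheable */
#define EFI_MEMORY_WB 0x8ULL /* write-back (cacheable) */

/*
 * Hypothetical helper.
 *   kern_attr: attribute recorded in kern_memmap for the region,
 *              or 0 if the region is not in kern_memmap.
 *   efi_attrs: mask of attributes the EFI memory map reports for the
 *              whole region, or 0 if EFI does not describe it.
 */
static uint64_t devmem_mmap_attr(uint64_t kern_attr, uint64_t efi_attrs)
{
	if (kern_attr)
		return kern_attr;      /* must match the kernel mapping */
	if (efi_attrs & EFI_MEMORY_WB)
		return EFI_MEMORY_WB;  /* assumption: prefer WB if supported */
	return EFI_MEMORY_UC;          /* UC-only or undescribed: use UC */
}
```
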
    mmap of /sys/class/pci_bus/.../legacy_mem

        This is very similar to mmap of /dev/mem, except that legacy_mem
        only allows mmap of the one megabyte "legacy MMIO" area for a
        specific PCI bus.  Typically this is the first megabyte of
        physical address space, but it may be different on machines with
        several VGA devices.

        "X" uses this to access VGA frame buffers.  Using legacy_mem
        rather than /dev/mem allows multiple instances of X to talk to
        different VGA cards.

        The /dev/mem mmap constraints apply.

    mmap of /proc/bus/pci/.../??.?

        This is an MMIO mmap of PCI functions, which additionally may or
        may not be requested as using the WC attribute.

        If WC is requested, and the region in kern_memmap is either WC
        or UC, and the EFI memory map designates the region as WC, then
        the WC mapping is allowed.

        Otherwise, the user mapping must use the same attribute as the
        kernel mapping.

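        The WC condition for PCI mmap combines three tests, which may
        be easier to follow as code.  This is a sketch only:
        pci_mmap_attr() and its parameters are hypothetical, and how
        the kernel mapping attribute is obtained is left outside the
        example.

```c
#include <stdbool.h>
#include <stdint.h>

#define EFI_MEMORY_UC 0x1ULL /* uncacheable */
#define EFI_MEMORY_WC 0x2ULL /* write-coalescing */
#define EFI_MEMORY_WB 0x8ULL /* write-back (cacheable) */

/*
 * Hypothetical helper.
 *   want_wc:   the caller requested a WC mapping.
 *   kern_attr: attribute of the kernel mapping for the region.
 *   efi_attrs: mask of attributes EFI reports for the region.
 */
static uint64_t pci_mmap_attr(bool want_wc, uint64_t kern_attr,
			      uint64_t efi_attrs)
{
	if (want_wc &&
	    (kern_attr == EFI_MEMORY_WC || kern_attr == EFI_MEMORY_UC) &&
	    (efi_attrs & EFI_MEMORY_WC))
		return EFI_MEMORY_WC;  /* all three conditions hold */

	return kern_attr;              /* otherwise match the kernel mapping */
}
```
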
    read/write of /dev/mem

        This uses copy_to_user() for reads and copy_from_user() for
        writes, both of which implicitly use a kernel identity mapping.
        This is obviously safe for things in kern_memmap.

        There may be corner cases of things that are not in kern_memmap,
        but could be accessed this way.  For example, registers in MMIO
        space are not in kern_memmap, but could be accessed with a UC
        mapping.  This would not cause attribute aliasing.  But
        registers typically can be accessed only with four-byte or
        eight-byte accesses, and the user-copy routines don't allow
        any control over the access size, so this would be dangerous.

    ioremap()

        This returns a mapping for use inside the kernel.

        If the region is in kern_memmap, we should use the attribute
        specified there.

        If the EFI memory map reports that the entire granule supports
        WB, we should use that (granules that are partially reserved
        or occupied by firmware do not appear in kern_memmap).

        If the granule contains non-WB memory, but we can cover the
        region safely with kernel page table mappings, we can use
        ioremap_page_range() as most other architectures do.

        Failing all of the above, we have to fall back to a UC mapping.

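        The ioremap() rules form an ordered fallback chain, with each
        rule tried only if the previous ones do not apply.  A minimal
        sketch of that ordering follows.  The enum, the function name
        ioremap_choose() and the three boolean inputs are all
        hypothetical stand-ins for checks the kernel would perform
        itself.

```c
#include <stdbool.h>

/* Hypothetical labels for the four outcomes described above. */
enum ioremap_kind {
	IOREMAP_KERN_ATTR,   /* reuse the attribute from kern_memmap */
	IOREMAP_WB_GRANULE,  /* WB identity mapping of the whole granule */
	IOREMAP_PAGE_TABLE,  /* kernel page table mapping of the region */
	IOREMAP_UC           /* uncacheable fallback */
};

/*
 * Hypothetical helper: pick a mapping strategy in priority order.
 *   in_kern_memmap: the region appears in kern_memmap.
 *   granule_all_wb: EFI reports the entire granule as WB-capable.
 *   page_table_ok:  the region can be covered safely by kernel page
 *                   table mappings.
 */
static enum ioremap_kind ioremap_choose(bool in_kern_memmap,
					bool granule_all_wb,
					bool page_table_ok)
{
	if (in_kern_memmap)
		return IOREMAP_KERN_ATTR;
	if (granule_all_wb)
		return IOREMAP_WB_GRANULE;
	if (page_table_ok)
		return IOREMAP_PAGE_TABLE;
	return IOREMAP_UC;
}
```
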
PAST PROBLEM CASES

    mmap of various MMIO regions from /dev/mem by "X" on Intel platforms

      The EFI memory map may not report these MMIO regions.

      These must be allowed so that X will work.  This means that
      when the EFI memory map is incomplete, every /dev/mem mmap must
      succeed.  It may create either WB or UC user mappings, depending
      on whether the region is in kern_memmap or the EFI memory map.

    mmap of 0x0-0x9FFFF /dev/mem by "hwinfo" on HP sx1000 with VGA enabled

      The EFI memory map reports the following attributes:
        0x00000-0x9FFFF WB only
        0xA0000-0xBFFFF UC only (VGA frame buffer)
        0xC0000-0xFFFFF WB only

      This mmap is done with user pages, not kernel identity mappings,
      so it is safe to use WB mappings.

      The kernel VGA driver may ioremap the VGA frame buffer at 0xA0000,
      which uses a granule-sized UC mapping.  This granule will cover some
      WB-only memory, but since UC is non-speculative, the processor will
      never generate an uncacheable reference to the WB-only areas unless
      the driver explicitly touches them.

    mmap of 0x0-0xFFFFF legacy_mem by "X"

      If the EFI memory map reports that the entire range supports the
      same attributes, we can allow the mmap (and we will prefer WB if
      supported, as is the case with HP sx[12]000 machines with VGA
      disabled).

      If EFI reports the range as partly WB and partly UC (as on sx[12]000
      machines with VGA enabled), we must fail the mmap because there's no
      safe attribute to use.

      If EFI reports some of the range but not all (as on Intel firmware
      that doesn't report the VGA frame buffer at all), we should fail the
      mmap and force the user to map just the specific region of interest.

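      The three outcomes for a legacy_mem mmap can be reproduced by
      intersecting the attributes of every map entry that the requested
      range touches.  The sketch below does this for the sx1000 table
      with VGA enabled.  struct region and range_attrs() are simplified,
      hypothetical stand-ins and do not reflect the real EFI descriptor
      layout or any kernel function.

```c
#include <stddef.h>
#include <stdint.h>

#define EFI_MEMORY_UC 0x1ULL /* uncacheable */
#define EFI_MEMORY_WB 0x8ULL /* write-back (cacheable) */

/* Hypothetical, simplified stand-in for one EFI memory map entry. */
struct region {
	uint64_t start;  /* first byte */
	uint64_t end;    /* one past the last byte */
	uint64_t attrs;  /* mask of supported attributes */
};

/*
 * Return the attributes supported by every byte of [start, end), or 0
 * if the range is only partly described or has no common attribute.
 * Assumes start < end, and that 'map' is sorted by address with no
 * overlapping entries.
 */
static uint64_t range_attrs(const struct region *map, size_t n,
			    uint64_t start, uint64_t end)
{
	uint64_t common = ~0ULL;
	uint64_t pos = start;
	size_t i;

	for (i = 0; i < n && pos < end; i++) {
		if (map[i].end <= pos)
			continue;        /* entry lies below the range */
		if (map[i].start > pos)
			return 0;        /* gap: 'pos' is not described */
		common &= map[i].attrs;  /* keep only shared attributes */
		pos = map[i].end;
	}
	return pos >= end ? common : 0;
}

/* Attributes reported on an HP sx1000 with VGA enabled. */
static const struct region sx1000_vga[] = {
	{ 0x00000, 0xA0000,  EFI_MEMORY_WB },
	{ 0xA0000, 0xC0000,  EFI_MEMORY_UC },
	{ 0xC0000, 0x100000, EFI_MEMORY_WB },
};
```

      With this table, a request for 0x0-0xFFFFF yields no common
      attribute and would be refused, while 0x0-0x9FFFF alone yields WB
      and 0xA0000-0xBFFFF alone yields UC.  A map that omits the frame
      buffer entry produces a gap, which is also refused.
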
    mmap of 0xA0000-0xBFFFF legacy_mem by "X" on HP sx1000 with VGA disabled

      The EFI memory map reports the following attributes:
        0x00000-0xFFFFF WB only (no VGA MMIO hole)

      This is a special case of the previous case, and the mmap should
      fail for the same reason as above.

    read of /sys/devices/.../rom

      For VGA devices, this may cause an ioremap() of 0xC0000.  This
      used to be done with a UC mapping, because the VGA frame buffer
      at 0xA0000 prevents use of a WB granule.  The UC mapping causes
      an MCA on HP sx[12]000 chipsets.

      We should use WB page table mappings to avoid covering the VGA
      frame buffer.

NOTES

    [1] SDM rev 2.2, vol 2, sec 4.4.1.
    [2] SDM rev 2.2, vol 2, sec 4.4.6.