Documentation/x86/protection-keys.txt


Based on kernel version 4.16.1. Page generated on 2018-04-09 11:53 EST.

Memory Protection Keys for Userspace (PKU aka PKEYs) is a feature
which is found on Intel's Skylake "Scalable Processor" Server CPUs.
It will be available in future non-server parts.

For anyone wishing to test or use this feature, it is available in
Amazon's EC2 C5 instances and is known to work there using an Ubuntu
17.04 image.

Memory Protection Keys provides a mechanism for enforcing page-based
protections, but without requiring modification of the page tables
when an application changes protection domains.  It works by
dedicating 4 previously ignored bits in each page table entry to a
"protection key", giving 16 possible keys.
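
As a rough illustration of the page-table side, the sketch below pulls
the key out of a raw 64-bit page table entry.  It assumes the x86-64
layout in which the key occupies bits 62:59 of the entry.  The macro
and function names are invented for this example and are not kernel
identifiers.

```c
#include <stdint.h>

/* Assumed x86-64 layout: the protection key sits in PTE bits 62:59. */
#define PTE_PKEY_SHIFT 59
#define PTE_PKEY_MASK  (UINT64_C(0xf) << PTE_PKEY_SHIFT)

/* Return the 4-bit protection key (0..15) stored in a raw PTE value. */
static inline unsigned int pte_pkey(uint64_t pte)
{
	return (unsigned int)((pte & PTE_PKEY_MASK) >> PTE_PKEY_SHIFT);
}

/* Return a copy of the PTE with its key replaced; other bits are kept. */
static inline uint64_t pte_with_pkey(uint64_t pte, unsigned int pkey)
{
	return (pte & ~PTE_PKEY_MASK) |
	       ((uint64_t)(pkey & 0xf) << PTE_PKEY_SHIFT);
}
```

Four bits give the 16 key values mentioned above; changing which keys
are accessible never touches these bits, only the per-thread register.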

There is also a new user-accessible register (PKRU) with two separate
bits (Access Disable and Write Disable) for each key.  Being a CPU
register, PKRU is inherently thread-local, potentially giving each
thread a different set of protections from every other thread.
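
The register layout can be sketched with plain bit arithmetic.  The
helpers below assume that key N owns PKRU bits 2N (Access Disable) and
2N+1 (Write Disable).  The DEMO_ constants mirror the values of the
kernel's PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE, and the function
names are hypothetical.  Nothing here touches the real register.

```c
#include <stdint.h>

/* Same values as PKEY_DISABLE_ACCESS / PKEY_DISABLE_WRITE; redefined
 * under other names so the sketch stands alone. */
#define DEMO_PKEY_DISABLE_ACCESS 0x1u
#define DEMO_PKEY_DISABLE_WRITE  0x2u

/* Extract the two rights bits for one key from a PKRU value. */
static inline unsigned int pkru_get_rights(uint32_t pkru, int pkey)
{
	return (pkru >> (2 * pkey)) & 0x3u;
}

/* Return a PKRU value with the rights bits for one key replaced. */
static inline uint32_t pkru_set_rights(uint32_t pkru, int pkey,
				       unsigned int rights)
{
	uint32_t mask = UINT32_C(0x3) << (2 * pkey);

	return (pkru & ~mask) | ((uint32_t)(rights & 0x3u) << (2 * pkey));
}
```

For example, pkru_set_rights(0, 1, DEMO_PKEY_DISABLE_WRITE) yields 0x8:
only the Write Disable bit of key 1 is set.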

There are two new instructions (RDPKRU/WRPKRU) for reading and writing
to the new register.  The feature is only available in 64-bit mode,
even though there is theoretically space in the PAE PTEs.  These
permissions are enforced on data access only and have no effect on
instruction fetches.
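
Because RDPKRU and WRPKRU raise an invalid-opcode fault when the OS has
not enabled the feature, a program may want to probe for it first.
The sketch below assumes a GCC or Clang toolchain whose <cpuid.h>
provides __get_cpuid_count(), and checks the OSPKE flag
(CPUID leaf 7, subleaf 0, ECX bit 4).  The name cpu_has_ospke() is made
up for this example.

```c
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <cpuid.h>

/* Return 1 if the OS has enabled protection keys for userspace, so
 * that RDPKRU/WRPKRU can be executed; return 0 otherwise. */
static int cpu_has_ospke(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* Returns 0 when CPUID leaf 7 is not supported at all. */
	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return 0;
	return (int)((ecx >> 4) & 1);	/* ECX bit 4: OSPKE */
}
#else
/* Not an x86-64 build: report the feature as absent. */
static int cpu_has_ospke(void)
{
	return 0;
}
#endif
```

A caller would typically skip any direct PKRU manipulation when this
returns 0 and fall back to plain mprotect().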

=========================== Syscalls ===========================

There are 3 system calls which directly interact with pkeys:

	int pkey_alloc(unsigned long flags, unsigned long init_access_rights);
	int pkey_free(int pkey);
	int pkey_mprotect(unsigned long start, size_t len,
			  unsigned long prot, int pkey);
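
On a C library that does not ship wrappers for these calls, they can be
reached through syscall(2).  The sketch below assumes Linux headers
that define SYS_pkey_alloc, SYS_pkey_free and SYS_pkey_mprotect.  The
sys_ prefix is only a naming choice for this example.  (glibc 2.27 and
later provide pkey_alloc(), pkey_free() and pkey_mprotect() directly.)

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrappers: like syscall(2), each returns -1 and sets errno on
 * failure. */
static int sys_pkey_alloc(unsigned long flags,
			  unsigned long init_access_rights)
{
	return (int)syscall(SYS_pkey_alloc, flags, init_access_rights);
}

static int sys_pkey_free(int pkey)
{
	return (int)syscall(SYS_pkey_free, pkey);
}

static int sys_pkey_mprotect(void *start, size_t len,
			     unsigned long prot, int pkey)
{
	return (int)syscall(SYS_pkey_mprotect, start, len, prot, pkey);
}
```

Callers should check for a negative return from sys_pkey_alloc() before
using the key, since allocation fails on systems without pkey support.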

Before a pkey can be used, it must first be allocated with
pkey_alloc().  An application calls the WRPKRU instruction
directly in order to change access permissions to memory covered
with a key.  In this example WRPKRU is wrapped by a C function
called pkey_set().

	int real_prot = PROT_READ|PROT_WRITE;
	pkey = pkey_alloc(0, PKEY_DISABLE_WRITE);
	ptr = mmap(NULL, PAGE_SIZE, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
	ret = pkey_mprotect(ptr, PAGE_SIZE, real_prot, pkey);
	... application runs here

Now, if the application needs to update the data at 'ptr', it can
gain access, do the update, then remove its write access:

	pkey_set(pkey, 0); // clear PKEY_DISABLE_WRITE
	*ptr = foo; // assign something
	pkey_set(pkey, PKEY_DISABLE_WRITE); // set PKEY_DISABLE_WRITE again

Now when it frees the memory, it will also free the pkey since it
is no longer in use:

	munmap(ptr, PAGE_SIZE);
	pkey_free(pkey);

(Note: pkey_set() is a wrapper for the RDPKRU and WRPKRU instructions.
 An example implementation can be found in
 tools/testing/selftests/x86/protection_keys.c)
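
For readers without the selftest source at hand, here is one way such a
wrapper might look.  It is a sketch and not the selftest code.  The
demo_ names are hypothetical, the instructions are emitted as raw
opcode bytes so that older assemblers accept them, and the function
reports ENOSYS on non-x86-64 builds.  It does not check whether the CPU
and OS support the feature, so the caller is assumed to have done that.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_NR_PKEYS 16	/* 4 key bits per page table entry */

#if defined(__x86_64__)
/* RDPKRU: requires ECX = 0; returns PKRU in EAX and clears EDX. */
static inline uint32_t demo_rdpkru(void)
{
	uint32_t eax, edx;

	__asm__ volatile(".byte 0x0f,0x01,0xee"
			 : "=a"(eax), "=d"(edx)
			 : "c"(0));
	return eax;
}

/* WRPKRU: requires ECX = 0 and EDX = 0; loads EAX into PKRU. */
static inline void demo_wrpkru(uint32_t pkru)
{
	__asm__ volatile(".byte 0x0f,0x01,0xef"
			 :
			 : "a"(pkru), "c"(0), "d"(0)
			 : "memory");
}
#endif

/* Replace the two rights bits for 'pkey' in the calling thread's PKRU.
 * Returns 0 on success, or -1 with errno set. */
static int demo_pkey_set(int pkey, unsigned int rights)
{
	if (pkey < 0 || pkey >= DEMO_NR_PKEYS || rights > 0x3u) {
		errno = EINVAL;
		return -1;
	}
#if defined(__x86_64__)
	uint32_t mask = UINT32_C(0x3) << (2 * pkey);
	uint32_t pkru = demo_rdpkru();

	pkru = (pkru & ~mask) | ((uint32_t)rights << (2 * pkey));
	demo_wrpkru(pkru);
	return 0;
#else
	errno = ENOSYS;
	return -1;
#endif
}
```

The "memory" clobber on the write keeps the compiler from moving loads
and stores across the permission change.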

=========================== Behavior ===========================

The kernel attempts to make protection keys consistent with the
behavior of a plain mprotect().  For instance if you do this:

	mprotect(ptr, size, PROT_NONE);
	something(ptr);

you can expect the same effects with protection keys when doing this:

	pkey = pkey_alloc(0, PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE);
	pkey_mprotect(ptr, size, PROT_READ|PROT_WRITE, pkey);
	something(ptr);

That should be true whether something() is a direct access to 'ptr'
like:

	*ptr = foo;

or when the kernel does the access on the application's behalf like
with a read():

	read(fd, ptr, 1);

The kernel will send a SIGSEGV in both cases, but si_code will be set
to SEGV_PKERR when violating protection keys versus SEGV_ACCERR when
the plain mprotect() permissions are violated.
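
A handler installed with SA_SIGINFO can tell the two cases apart by
looking at si_code.  The sketch below only exercises the plain
mprotect() path, so that it also runs on machines without protection
keys; a key violation would report SEGV_PKERR instead.  All function
names are made up for this example, and the fallback definition of
SEGV_PKERR assumes the kernel's value of 4.

```c
#define _GNU_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef SEGV_PKERR
#define SEGV_PKERR 4	/* assumed: value from the kernel's siginfo ABI */
#endif

static sigjmp_buf fault_jmp;
static volatile sig_atomic_t fault_code;

/* Record why the access failed, then unwind past the faulting store. */
static void segv_handler(int sig, siginfo_t *info, void *ctx)
{
	(void)sig;
	(void)ctx;
	fault_code = info->si_code;
	siglongjmp(fault_jmp, 1);
}

/* Write one byte to addr.  Returns 0 if the write succeeded, the
 * SIGSEGV si_code if it faulted, or -1 if no handler could be set. */
static int probe_write(volatile char *addr)
{
	struct sigaction sa, old;
	int code = 0;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = segv_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGSEGV, &sa, &old) != 0)
		return -1;

	if (sigsetjmp(fault_jmp, 1) == 0)
		*addr = 1;		/* may fault */
	else
		code = fault_code;	/* reached via siglongjmp() */

	sigaction(SIGSEGV, &old, NULL);
	return code;
}

/* Probe a PROT_NONE page: a plain mprotect()-style denial. */
static int probe_prot_none_page(void)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	char *p = mmap(NULL, len, PROT_NONE,
		       MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	int code;

	if (p == MAP_FAILED)
		return -1;
	code = probe_write(p);
	munmap(p, len);
	return code;
}
```

An application would compare the returned code against SEGV_PKERR and
SEGV_ACCERR to decide whether a key or the page permissions denied the
access.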