Documentation / virtual / kvm / api.txt

Based on kernel version 3.9. Page generated on 2013-05-02 23:16 EST.

1	The Definitive KVM (Kernel-based Virtual Machine) API Documentation
2	===================================================================
3	
4	1. General description
5	----------------------
6	
7	The kvm API is a set of ioctls that are issued to control various aspects
8	of a virtual machine.  The ioctls belong to three classes:
9	
10	 - System ioctls: These query and set global attributes which affect the
11	   whole kvm subsystem.  In addition a system ioctl is used to create
12	   virtual machines.
13	
14	 - VM ioctls: These query and set attributes that affect an entire virtual
15	   machine, for example memory layout.  In addition a VM ioctl is used to
16	   create virtual cpus (vcpus).
17	
18	   Only run VM ioctls from the same process (address space) that was used
19	   to create the VM.
20	
21	 - vcpu ioctls: These query and set attributes that control the operation
22	   of a single virtual cpu.
23	
24	   Only run vcpu ioctls from the same thread that was used to create the
25	   vcpu.
26	
27	
28	2. File descriptors
29	-------------------
30	
31	The kvm API is centered around file descriptors.  An initial
32	open("/dev/kvm") obtains a handle to the kvm subsystem; this handle
33	can be used to issue system ioctls.  A KVM_CREATE_VM ioctl on this
34	handle will create a VM file descriptor which can be used to issue VM
35	ioctls.  A KVM_CREATE_VCPU ioctl on a VM fd will create a virtual cpu
36	and return a file descriptor pointing to it.  Finally, ioctls on a vcpu
37	fd can be used to control the vcpu, including the important task of
38	actually running guest code.
39	
40	In general file descriptors can be migrated among processes by means
41	of fork() and the SCM_RIGHTS facility of unix domain sockets.  These
42	kinds of tricks are explicitly not supported by kvm.  While they will
43	not cause harm to the host, their actual behavior is not guaranteed by
44	the API.  The only supported use is one virtual machine per process,
45	and one vcpu per thread.
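
The descriptor hierarchy described above can be sketched roughly as follows.
This is a minimal illustration, not a reference implementation: the helper
name kvm_open_chain, the struct kvm_fds container and the error handling are
assumptions made for this example; only the device path and the ioctl names
come from this document.

```c
#include <fcntl.h>
#include <linux/kvm.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical container for the three levels of KVM file descriptors. */
struct kvm_fds {
	int kvm;   /* system fd, from open() */
	int vm;    /* VM fd, from KVM_CREATE_VM */
	int vcpu;  /* vcpu fd, from KVM_CREATE_VCPU */
};

/*
 * Open the device node at 'path' (normally "/dev/kvm") and create one VM
 * with one vcpu (id 0).  Returns 0 on success, -1 on failure; on failure
 * every descriptor opened so far is closed again and all fields are -1.
 */
static int kvm_open_chain(const char *path, struct kvm_fds *fds)
{
	fds->kvm = fds->vm = fds->vcpu = -1;

	fds->kvm = open(path, O_RDWR);
	if (fds->kvm < 0)
		return -1;

	/* System ioctl on the subsystem handle; machine type 0. */
	fds->vm = ioctl(fds->kvm, KVM_CREATE_VM, 0UL);
	if (fds->vm < 0)
		goto fail;

	/* VM ioctl on the VM fd; the argument is the vcpu id. */
	fds->vcpu = ioctl(fds->vm, KVM_CREATE_VCPU, 0UL);
	if (fds->vcpu < 0)
		goto fail;
	return 0;

fail:
	if (fds->vm >= 0)
		close(fds->vm);
	close(fds->kvm);
	fds->kvm = fds->vm = fds->vcpu = -1;
	return -1;
}
```

A caller following the supported model would invoke this once per process
and issue later vcpu ioctls only from the thread that created the vcpu.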
46	
47	
48	3. Extensions
49	-------------
50	
51	As of Linux 2.6.22, the KVM ABI has been stabilized: no
52	backward-incompatible changes are allowed.  However, there is an extension
53	facility that allows backward-compatible extensions to the API to be
54	queried and used.
55	
56	The extension mechanism is not based on the Linux version number.
57	Instead, kvm defines extension identifiers and a facility to query
58	whether a particular extension identifier is available.  If it is, a
59	set of ioctls is available for application use.
60	
61	
62	4. API description
63	------------------
64	
65	This section describes ioctls that can be used to control kvm guests.
66	For each ioctl, the following information is provided along with a
67	description:
68	
69	  Capability: which KVM extension provides this ioctl.  Can be 'basic',
70	      which means that it will be provided by any kernel that supports
71	      API version 12 (see section 4.1), or a KVM_CAP_xyz constant, which
72	      means availability needs to be checked with KVM_CHECK_EXTENSION
73	      (see section 4.4).
74	
75	  Architectures: which instruction set architectures provide this ioctl.
76	      x86 includes both i386 and x86_64.
77	
78	  Type: system, vm, or vcpu.
79	
80	  Parameters: what parameters are accepted by the ioctl.
81	
82	  Returns: the return value.  General error numbers (EBADF, ENOMEM, EINVAL)
83	      are not detailed, but errors with specific meanings are.
84	
85	
86	4.1 KVM_GET_API_VERSION
87	
88	Capability: basic
89	Architectures: all
90	Type: system ioctl
91	Parameters: none
92	Returns: the constant KVM_API_VERSION (=12)
93	
94	This identifies the API version as the stable kvm API. It is not
95	expected that this number will change.  However, Linux 2.6.20 and
96	2.6.21 report earlier versions; these are not documented and not
97	supported.  Applications should refuse to run if KVM_GET_API_VERSION
98	returns a value other than 12.  If this check passes, all ioctls
99	described as 'basic' will be available.
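
A minimal sketch of the version check recommended above.  The helper name
kvm_require_stable_api is an assumption of this example; kvm_fd is assumed
to be a descriptor obtained from open("/dev/kvm").

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>

/*
 * Returns 0 if the kernel behind 'kvm_fd' reports the stable API version
 * (KVM_API_VERSION, i.e. 12), -1 otherwise.  A failed ioctl returns -1,
 * which is not 12, so it is rejected as well.
 */
static int kvm_require_stable_api(int kvm_fd)
{
	int version = ioctl(kvm_fd, KVM_GET_API_VERSION, 0UL);

	return version == KVM_API_VERSION ? 0 : -1;
}
```

An application would typically call this right after opening the device
and refuse to continue on a nonzero result.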
100	
101	
102	4.2 KVM_CREATE_VM
103	
104	Capability: basic
105	Architectures: all
106	Type: system ioctl
107	Parameters: machine type identifier (KVM_VM_*)
108	Returns: a VM fd that can be used to control the new virtual machine.
109	
110	The new VM has no virtual cpus and no memory.  An mmap() of a VM fd
111	will access the virtual machine's physical address space; offset zero
112	corresponds to guest physical address zero.  Use of mmap() on a VM fd
113	is discouraged if userspace memory allocation (KVM_CAP_USER_MEMORY) is
114	available.
115	You almost certainly want to use 0 as the machine type.
116	
117	In order to create user controlled virtual machines on S390, check
118	KVM_CAP_S390_UCONTROL and use the flag KVM_VM_S390_UCONTROL as a
119	privileged user (CAP_SYS_ADMIN).
120	
121	
122	4.3 KVM_GET_MSR_INDEX_LIST
123	
124	Capability: basic
125	Architectures: x86
126	Type: system
127	Parameters: struct kvm_msr_list (in/out)
128	Returns: 0 on success; -1 on error
129	Errors:
130	  E2BIG:     the msr index list is too big to fit in the array specified by
131	             the user.
132	
133	struct kvm_msr_list {
134		__u32 nmsrs; /* number of msrs in entries */
135		__u32 indices[0];
136	};
137	
138	This ioctl returns the guest msrs that are supported.  The list varies
139	by kvm version and host processor, but does not change otherwise.  The
140	user fills in the size of the indices array in nmsrs, and in return
141	kvm adjusts nmsrs to reflect the actual number of msrs and fills in
142	the indices array with their numbers.
143	
144	Note: if kvm indicates that it supports MCE (KVM_CAP_MCE), the MCE bank MSRs
145	are not returned in the MSR list, as different vcpus can have a different
146	number of banks, as set via the KVM_X86_SETUP_MCE ioctl.
147	
148	
149	4.4 KVM_CHECK_EXTENSION
150	
151	Capability: basic
152	Architectures: all
153	Type: system ioctl
154	Parameters: extension identifier (KVM_CAP_*)
155	Returns: 0 if unsupported; 1 (or some other positive integer) if supported
156	
157	The API allows the application to query about extensions to the core
158	kvm API.  Userspace passes an extension identifier (an integer) and
159	receives an integer that describes the extension availability.
160	Generally 0 means no and 1 means yes, but some extensions may report
161	additional information in the integer return value.
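
A small wrapper around the query described above might look like this.  The
function name kvm_extension_value and the choice to fold errors into "not
available" are assumptions of this sketch.

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>

/*
 * Query one extension identifier.  Returns 0 if the extension is not
 * available (or the query itself failed), otherwise the positive value
 * reported by the kernel, which for some extensions carries additional
 * information.
 */
static int kvm_extension_value(int kvm_fd, long cap)
{
	int ret = ioctl(kvm_fd, KVM_CHECK_EXTENSION, cap);

	return ret > 0 ? ret : 0;
}
```

For example, a caller could test kvm_extension_value(kvm_fd,
KVM_CAP_USER_MEMORY) before relying on userspace memory allocation.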
162	
163	
164	4.5 KVM_GET_VCPU_MMAP_SIZE
165	
166	Capability: basic
167	Architectures: all
168	Type: system ioctl
169	Parameters: none
170	Returns: size of vcpu mmap area, in bytes
171	
172	The KVM_RUN ioctl (see section 4.10) communicates with userspace via a
173	shared memory region.  This ioctl returns the size of that region.  See
174	the KVM_RUN documentation for details.
175	
176	
177	4.6 KVM_SET_MEMORY_REGION
178	
179	Capability: basic
180	Architectures: all
181	Type: vm ioctl
182	Parameters: struct kvm_memory_region (in)
183	Returns: 0 on success, -1 on error
184	
185	This ioctl is obsolete and has been removed.
186	
187	
188	4.7 KVM_CREATE_VCPU
189	
190	Capability: basic
191	Architectures: all
192	Type: vm ioctl
193	Parameters: vcpu id (apic id on x86)
194	Returns: vcpu fd on success, -1 on error
195	
196	This API adds a vcpu to a virtual machine.  The vcpu id is a small integer
197	in the range [0, max_vcpus).
198	
199	The recommended max_vcpus value can be retrieved using the KVM_CAP_NR_VCPUS of
200	the KVM_CHECK_EXTENSION ioctl() at run-time.
201	The maximum possible value for max_vcpus can be retrieved using the
202	KVM_CAP_MAX_VCPUS of the KVM_CHECK_EXTENSION ioctl() at run-time.
203	
204	If KVM_CAP_NR_VCPUS does not exist, you should assume that max_vcpus is at
205	most 4.
206	If KVM_CAP_MAX_VCPUS does not exist, you should assume that max_vcpus is
207	the same as the value returned from KVM_CAP_NR_VCPUS.
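
The fallback rules above can be expressed as a small pure function.  This is
a sketch: struct vcpu_limits and resolve_vcpu_limits are names invented for
this example, and a KVM_CHECK_EXTENSION result of 0 or less is taken to mean
"capability does not exist".

```c
/* Hypothetical pair of limits derived from KVM_CHECK_EXTENSION results. */
struct vcpu_limits {
	int recommended;  /* from KVM_CAP_NR_VCPUS */
	int maximum;      /* from KVM_CAP_MAX_VCPUS */
};

/*
 * 'nr_vcpus' and 'max_vcpus' are the raw KVM_CHECK_EXTENSION return values
 * for KVM_CAP_NR_VCPUS and KVM_CAP_MAX_VCPUS respectively.
 */
static struct vcpu_limits resolve_vcpu_limits(int nr_vcpus, int max_vcpus)
{
	struct vcpu_limits l;

	/* No KVM_CAP_NR_VCPUS: assume at most 4 vcpus. */
	l.recommended = nr_vcpus > 0 ? nr_vcpus : 4;
	/* No KVM_CAP_MAX_VCPUS: assume the same as the recommended value. */
	l.maximum = max_vcpus > 0 ? max_vcpus : l.recommended;
	return l;
}
```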
208	
209	On powerpc using book3s_hv mode, the vcpus are mapped onto virtual
210	threads in one or more virtual CPU cores.  (This is because the
211	hardware requires all the hardware threads in a CPU core to be in the
212	same partition.)  The KVM_CAP_PPC_SMT capability indicates the number
213	of vcpus per virtual core (vcore).  The vcore id is obtained by
214	dividing the vcpu id by the number of vcpus per vcore.  The vcpus in a
215	given vcore will always be in the same physical core as each other
216	(though that might be a different physical core from time to time).
217	Userspace can control the threading (SMT) mode of the guest by its
218	allocation of vcpu ids.  For example, if userspace wants
219	single-threaded guest vcpus, it should make all vcpu ids be a multiple
220	of the number of vcpus per vcore.
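
The vcpu id arithmetic for book3s_hv can be sketched as follows.  Both
function names are invented for this example, and threads_per_vcore stands
for the nonzero value reported by KVM_CAP_PPC_SMT.

```c
/* vcore id of a vcpu: the vcpu id divided by the vcpus-per-vcore count. */
static unsigned int vcore_id(unsigned int vcpu_id,
			     unsigned int threads_per_vcore)
{
	return vcpu_id / threads_per_vcore;
}

/*
 * vcpu id to use for the n-th vcpu of a single-threaded guest: every id is
 * a multiple of the vcpus-per-vcore count, so each vcpu lands in its own
 * vcore.
 */
static unsigned int single_threaded_vcpu_id(unsigned int n,
					    unsigned int threads_per_vcore)
{
	return n * threads_per_vcore;
}
```

With 4 vcpus per vcore, a single-threaded guest would therefore use vcpu ids
0, 4, 8 and so on.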
221	
222	For virtual cpus that have been created with S390 user controlled virtual
223	machines, the resulting vcpu fd can be memory mapped at page offset
224	KVM_S390_SIE_PAGE_OFFSET in order to obtain a memory map of the virtual
225	cpu's hardware control block.
226	
227	
228	4.8 KVM_GET_DIRTY_LOG
229	
230	Capability: basic
231	Architectures: x86
232	Type: vm ioctl
233	Parameters: struct kvm_dirty_log (in/out)
234	Returns: 0 on success, -1 on error
235	
236	/* for KVM_GET_DIRTY_LOG */
237	struct kvm_dirty_log {
238		__u32 slot;
239		__u32 padding1;
240		union {
241			void __user *dirty_bitmap; /* one bit per page */
242			__u64 padding2;
243		};
244	};
245	
246	Given a memory slot, return a bitmap containing any pages dirtied
247	since the last call to this ioctl.  Bit 0 is the first page in the
248	memory slot.  Ensure the entire structure is cleared to avoid padding
249	issues.
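
Once the bitmap has been filled in, userspace has to decode it.  The helpers
below are a sketch under two assumptions that are not stated in this
document: bits are numbered from the least significant bit of the first
byte, and the buffer is sized in multiples of 64 bits.  All function names
are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Number of bytes to allocate for a slot's dirty bitmap: one bit per page,
 * rounded up to a multiple of 64 bits (an assumption of this sketch).
 */
static size_t dirty_bitmap_bytes(uint64_t npages)
{
	return (size_t)((npages + 63) / 64 * 8);
}

/* Returns 1 if page 'page' of the slot is dirty (bit 0 = first page). */
static int page_is_dirty(const uint8_t *bitmap, uint64_t page)
{
	return (bitmap[page / 8] >> (page % 8)) & 1;
}

/* Count the dirty pages among the first 'npages' pages of the slot. */
static uint64_t count_dirty_pages(const uint8_t *bitmap, uint64_t npages)
{
	uint64_t page, count = 0;

	for (page = 0; page < npages; page++)
		count += page_is_dirty(bitmap, page);
	return count;
}
```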
250	
251	
252	4.9 KVM_SET_MEMORY_ALIAS
253	
254	Capability: basic
255	Architectures: x86
256	Type: vm ioctl
257	Parameters: struct kvm_memory_alias (in)
258	Returns: 0 on success, -1 on error
259	
260	This ioctl is obsolete and has been removed.
261	
262	
263	4.10 KVM_RUN
264	
265	Capability: basic
266	Architectures: all
267	Type: vcpu ioctl
268	Parameters: none
269	Returns: 0 on success, -1 on error
270	Errors:
271	  EINTR:     an unmasked signal is pending
272	
273	This ioctl is used to run a guest virtual cpu.  While there are no
274	explicit parameters, there is an implicit parameter block that can be
275	obtained by mmap()ing the vcpu fd at offset 0, with the size given by
276	KVM_GET_VCPU_MMAP_SIZE.  The parameter block is formatted as a 'struct
277	kvm_run' (see below).
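
Obtaining the implicit parameter block might look like the sketch below.
The helper name map_kvm_run and its NULL-on-failure convention are
assumptions of this example; kvm_fd is the system fd and vcpu_fd the vcpu
fd.

```c
#include <linux/kvm.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

/*
 * Map the KVM_RUN parameter block of 'vcpu_fd' at offset 0.  'kvm_fd' is
 * needed because KVM_GET_VCPU_MMAP_SIZE is a system ioctl.  Returns NULL
 * on failure; on success the mapping size is stored in '*size_out'.
 */
static struct kvm_run *map_kvm_run(int kvm_fd, int vcpu_fd, size_t *size_out)
{
	int size = ioctl(kvm_fd, KVM_GET_VCPU_MMAP_SIZE, 0UL);
	void *p;

	if (size <= 0)
		return NULL;

	p = mmap(NULL, (size_t)size, PROT_READ | PROT_WRITE, MAP_SHARED,
		 vcpu_fd, 0);
	if (p == MAP_FAILED)
		return NULL;

	*size_out = (size_t)size;
	return p;
}
```

After each KVM_RUN on vcpu_fd, the caller would inspect the mapped structure
to find out why the guest stopped.  If the ioctl fails with EINTR, it can
simply be issued again once the signal has been handled.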
278	
279	
280	4.11 KVM_GET_REGS
281	
282	Capability: basic
283	Architectures: all except ARM
284	Type: vcpu ioctl
285	Parameters: struct kvm_regs (out)
286	Returns: 0 on success, -1 on error
287	
288	Reads the general purpose registers from the vcpu.
289	
290	/* x86 */
291	struct kvm_regs {
292		/* out (KVM_GET_REGS) / in (KVM_SET_REGS) */
293		__u64 rax, rbx, rcx, rdx;
294		__u64 rsi, rdi, rsp, rbp;
295		__u64 r8,  r9,  r10, r11;
296		__u64 r12, r13, r14, r15;
297		__u64 rip, rflags;
298	};
299	
300	
301	4.12 KVM_SET_REGS
302	
303	Capability: basic
304	Architectures: all except ARM
305	Type: vcpu ioctl
306	Parameters: struct kvm_regs (in)
307	Returns: 0 on success, -1 on error
308	
309	Writes the general purpose registers into the vcpu.
310	
311	See KVM_GET_REGS for the data structure.
312	
313	
314	4.13 KVM_GET_SREGS
315	
316	Capability: basic
317	Architectures: x86, ppc
318	Type: vcpu ioctl
319	Parameters: struct kvm_sregs (out)
320	Returns: 0 on success, -1 on error
321	
322	Reads special registers from the vcpu.
323	
324	/* x86 */
325	struct kvm_sregs {
326		struct kvm_segment cs, ds, es, fs, gs, ss;
327		struct kvm_segment tr, ldt;
328		struct kvm_dtable gdt, idt;
329		__u64 cr0, cr2, cr3, cr4, cr8;
330		__u64 efer;
331		__u64 apic_base;
332		__u64 interrupt_bitmap[(KVM_NR_INTERRUPTS + 63) / 64];
333	};
334	
335	/* ppc -- see arch/powerpc/include/uapi/asm/kvm.h */
336	
337	interrupt_bitmap is a bitmap of pending external interrupts.  At most
338	one bit may be set.  This interrupt has been acknowledged by the APIC
339	but not yet injected into the cpu core.
340	
341	
342	4.14 KVM_SET_SREGS
343	
344	Capability: basic
345	Architectures: x86, ppc
346	Type: vcpu ioctl
347	Parameters: struct kvm_sregs (in)
348	Returns: 0 on success, -1 on error
349	
350	Writes special registers into the vcpu.  See KVM_GET_SREGS for the
351	data structures.
352	
353	
354	4.15 KVM_TRANSLATE
355	
356	Capability: basic
357	Architectures: x86
358	Type: vcpu ioctl
359	Parameters: struct kvm_translation (in/out)
360	Returns: 0 on success, -1 on error
361	
362	Translates a virtual address according to the vcpu's current address
363	translation mode.
364	
365	struct kvm_translation {
366		/* in */
367		__u64 linear_address;
368	
369		/* out */
370		__u64 physical_address;
371		__u8  valid;
372		__u8  writeable;
373		__u8  usermode;
374		__u8  pad[5];
375	};
376	
377	
378	4.16 KVM_INTERRUPT
379	
380	Capability: basic
381	Architectures: x86, ppc
382	Type: vcpu ioctl
383	Parameters: struct kvm_interrupt (in)
384	Returns: 0 on success, -1 on error
385	
386	Queues a hardware interrupt vector to be injected.  This is only
387	useful if in-kernel local APIC or equivalent is not used.
388	
389	/* for KVM_INTERRUPT */
390	struct kvm_interrupt {
391		/* in */
392		__u32 irq;
393	};
394	
395	X86:
396	
397	Note 'irq' is an interrupt vector, not an interrupt pin or line.
398	
399	PPC:
400	
401	Queues an external interrupt to be injected. This ioctl is overloaded
402	with 3 different irq values:
403	
404	a) KVM_INTERRUPT_SET
405	
406	  This injects an edge type external interrupt into the guest once it's ready
407	  to receive interrupts. When injected, the interrupt is done.
408	
409	b) KVM_INTERRUPT_UNSET
410	
411	  This unsets any pending interrupt.
412	
413	  Only available with KVM_CAP_PPC_UNSET_IRQ.
414	
415	c) KVM_INTERRUPT_SET_LEVEL
416	
417	  This injects a level type external interrupt into the guest context. The
418	  interrupt stays pending until a specific ioctl with KVM_INTERRUPT_UNSET
419	  is triggered.
420	
421	  Only available with KVM_CAP_PPC_IRQ_LEVEL.
422	
423	Note that any value for 'irq' other than the ones stated above is invalid
424	and incurs unexpected behavior.
425	
426	
427	4.17 KVM_DEBUG_GUEST
428	
429	Capability: basic
430	Architectures: none
431	Type: vcpu ioctl
432	Parameters: none
433	Returns: -1 on error
434	
435	Support for this has been removed.  Use KVM_SET_GUEST_DEBUG instead.
436	
437	
438	4.18 KVM_GET_MSRS
439	
440	Capability: basic
441	Architectures: x86
442	Type: vcpu ioctl
443	Parameters: struct kvm_msrs (in/out)
444	Returns: 0 on success, -1 on error
445	
446	Reads model-specific registers from the vcpu.  Supported msr indices can
447	be obtained using KVM_GET_MSR_INDEX_LIST.
448	
449	struct kvm_msrs {
450		__u32 nmsrs; /* number of msrs in entries */
451		__u32 pad;
452	
453		struct kvm_msr_entry entries[0];
454	};
455	
456	struct kvm_msr_entry {
457		__u32 index;
458		__u32 reserved;
459		__u64 data;
460	};
461	
462	Application code should set the 'nmsrs' member (which indicates the
463	size of the entries array) and the 'index' member of each array entry.
464	kvm will fill in the 'data' member.
465	
466	
467	4.19 KVM_SET_MSRS
468	
469	Capability: basic
470	Architectures: x86
471	Type: vcpu ioctl
472	Parameters: struct kvm_msrs (in)
473	Returns: 0 on success, -1 on error
474	
475	Writes model-specific registers to the vcpu.  See KVM_GET_MSRS for the
476	data structures.
477	
478	Application code should set the 'nmsrs' member (which indicates the
479	size of the entries array), and the 'index' and 'data' members of each
480	array entry.
481	
482	
483	4.20 KVM_SET_CPUID
484	
485	Capability: basic
486	Architectures: x86
487	Type: vcpu ioctl
488	Parameters: struct kvm_cpuid (in)
489	Returns: 0 on success, -1 on error
490	
491	Defines the vcpu responses to the cpuid instruction.  Applications
492	should use the KVM_SET_CPUID2 ioctl if available.
493	
494	
495	struct kvm_cpuid_entry {
496		__u32 function;
497		__u32 eax;
498		__u32 ebx;
499		__u32 ecx;
500		__u32 edx;
501		__u32 padding;
502	};
503	
504	/* for KVM_SET_CPUID */
505	struct kvm_cpuid {
506		__u32 nent;
507		__u32 padding;
508		struct kvm_cpuid_entry entries[0];
509	};
510	
511	
512	4.21 KVM_SET_SIGNAL_MASK
513	
514	Capability: basic
515	Architectures: x86
516	Type: vcpu ioctl
517	Parameters: struct kvm_signal_mask (in)
518	Returns: 0 on success, -1 on error
519	
520	Defines which signals are blocked during execution of KVM_RUN.  This
521	signal mask temporarily overrides the thread's signal mask.  Any
522	unblocked signal received (except SIGKILL and SIGSTOP, which retain
523	their traditional behaviour) will cause KVM_RUN to return with -EINTR.
524	
525	Note the signal will only be delivered if not blocked by the original
526	signal mask.
527	
528	/* for KVM_SET_SIGNAL_MASK */
529	struct kvm_signal_mask {
530		__u32 len;
531		__u8  sigset[0];
532	};
533	
534	
535	4.22 KVM_GET_FPU
536	
537	Capability: basic
538	Architectures: x86
539	Type: vcpu ioctl
540	Parameters: struct kvm_fpu (out)
541	Returns: 0 on success, -1 on error
542	
543	Reads the floating point state from the vcpu.
544	
545	/* for KVM_GET_FPU and KVM_SET_FPU */
546	struct kvm_fpu {
547		__u8  fpr[8][16];
548		__u16 fcw;
549		__u16 fsw;
550		__u8  ftwx;  /* in fxsave format */
551		__u8  pad1;
552		__u16 last_opcode;
553		__u64 last_ip;
554		__u64 last_dp;
555		__u8  xmm[16][16];
556		__u32 mxcsr;
557		__u32 pad2;
558	};
559	
560	
561	4.23 KVM_SET_FPU
562	
563	Capability: basic
564	Architectures: x86
565	Type: vcpu ioctl
566	Parameters: struct kvm_fpu (in)
567	Returns: 0 on success, -1 on error
568	
569	Writes the floating point state to the vcpu.
570	
571	/* for KVM_GET_FPU and KVM_SET_FPU */
572	struct kvm_fpu {
573		__u8  fpr[8][16];
574		__u16 fcw;
575		__u16 fsw;
576		__u8  ftwx;  /* in fxsave format */
577		__u8  pad1;
578		__u16 last_opcode;
579		__u64 last_ip;
580		__u64 last_dp;
581		__u8  xmm[16][16];
582		__u32 mxcsr;
583		__u32 pad2;
584	};
585	
586	
587	4.24 KVM_CREATE_IRQCHIP
588	
589	Capability: KVM_CAP_IRQCHIP
590	Architectures: x86, ia64, ARM
591	Type: vm ioctl
592	Parameters: none
593	Returns: 0 on success, -1 on error
594	
595	Creates an interrupt controller model in the kernel.  On x86, creates a virtual
596	ioapic, a virtual PIC (two PICs, nested), and sets up future vcpus to have a
597	local APIC.  IRQ routing for GSIs 0-15 is set to both PIC and IOAPIC; GSI 16-23
598	only go to the IOAPIC.  On ia64, an IOSAPIC is created. On ARM, a GIC is
599	created.
600	
601	
602	4.25 KVM_IRQ_LINE
603	
604	Capability: KVM_CAP_IRQCHIP
605	Architectures: x86, ia64, ARM
606	Type: vm ioctl
607	Parameters: struct kvm_irq_level
608	Returns: 0 on success, -1 on error
609	
610	Sets the level of a GSI input to the interrupt controller model in the kernel.
611	On some architectures it is required that an interrupt controller model has
612	been previously created with KVM_CREATE_IRQCHIP.  Note that edge-triggered
613	interrupts require the level to be set to 1 and then back to 0.
614	
615	ARM can signal an interrupt either at the CPU level, or at the in-kernel irqchip
616	(GIC), and for in-kernel irqchip can tell the GIC to use PPIs designated for
617	specific cpus.  The irq field is interpreted like this:
618	
619	  bits:  | 31 ... 24 | 23  ... 16 | 15    ...    0 |
620	  field: | irq_type  | vcpu_index |     irq_id     |
621	
622	The irq_type field has the following values:
623	- irq_type[0]: out-of-kernel GIC: irq_id 0 is IRQ, irq_id 1 is FIQ
624	- irq_type[1]: in-kernel GIC: SPI, irq_id between 32 and 1019 (incl.)
625	               (the vcpu_index field is ignored)
626	- irq_type[2]: in-kernel GIC: PPI, irq_id between 16 and 31 (incl.)
627	
628	(The irq_id field thus corresponds nicely to the IRQ ID in the ARM GIC specs)
629	
630	In both cases, level is used to raise/lower the line.
631	
632	struct kvm_irq_level {
633		union {
634			__u32 irq;     /* GSI */
635			__s32 status;  /* not used for KVM_IRQ_LEVEL */
636		};
637		__u32 level;           /* 0 or 1 */
638	};
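
For ARM, the bit layout of the irq field given above can be packed with a
helper like the following.  The enum and function names are invented for
this sketch; only the field positions and irq_type values come from the
table above.

```c
#include <stdint.h>

/* irq_type values from the list above; the enum names are made up here. */
enum arm_irq_type {
	ARM_IRQ_TYPE_CPU = 0,  /* out-of-kernel GIC: irq_id 0 = IRQ, 1 = FIQ */
	ARM_IRQ_TYPE_SPI = 1,  /* in-kernel GIC: SPI, vcpu_index ignored */
	ARM_IRQ_TYPE_PPI = 2   /* in-kernel GIC: PPI */
};

/* Pack irq_type (bits 31-24), vcpu_index (23-16) and irq_id (15-0). */
static uint32_t arm_irq_encode(enum arm_irq_type type, uint8_t vcpu_index,
			       uint16_t irq_id)
{
	return ((uint32_t)type << 24) | ((uint32_t)vcpu_index << 16) | irq_id;
}
```

The result would be stored in the irq member of struct kvm_irq_level, with
level set to 1 or 0 to raise or lower the line.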
639	
640	
641	4.26 KVM_GET_IRQCHIP
642	
643	Capability: KVM_CAP_IRQCHIP
644	Architectures: x86, ia64
645	Type: vm ioctl
646	Parameters: struct kvm_irqchip (in/out)
647	Returns: 0 on success, -1 on error
648	
649	Reads the state of a kernel interrupt controller created with
650	KVM_CREATE_IRQCHIP into a buffer provided by the caller.
651	
652	struct kvm_irqchip {
653		__u32 chip_id;  /* 0 = PIC1, 1 = PIC2, 2 = IOAPIC */
654		__u32 pad;
655	        union {
656			char dummy[512];  /* reserving space */
657			struct kvm_pic_state pic;
658			struct kvm_ioapic_state ioapic;
659		} chip;
660	};
661	
662	
663	4.27 KVM_SET_IRQCHIP
664	
665	Capability: KVM_CAP_IRQCHIP
666	Architectures: x86, ia64
667	Type: vm ioctl
668	Parameters: struct kvm_irqchip (in)
669	Returns: 0 on success, -1 on error
670	
671	Sets the state of a kernel interrupt controller created with
672	KVM_CREATE_IRQCHIP from a buffer provided by the caller.
673	
674	struct kvm_irqchip {
675		__u32 chip_id;  /* 0 = PIC1, 1 = PIC2, 2 = IOAPIC */
676		__u32 pad;
677	        union {
678			char dummy[512];  /* reserving space */
679			struct kvm_pic_state pic;
680			struct kvm_ioapic_state ioapic;
681		} chip;
682	};
683	
684	
685	4.28 KVM_XEN_HVM_CONFIG
686	
687	Capability: KVM_CAP_XEN_HVM
688	Architectures: x86
689	Type: vm ioctl
690	Parameters: struct kvm_xen_hvm_config (in)
691	Returns: 0 on success, -1 on error
692	
693	Sets the MSR that the Xen HVM guest uses to initialize its hypercall
694	page, and provides the starting address and size of the hypercall
695	blobs in userspace.  When the guest writes the MSR, kvm copies one
696	page of a blob (32- or 64-bit, depending on the vcpu mode) to guest
697	memory.
698	
699	struct kvm_xen_hvm_config {
700		__u32 flags;
701		__u32 msr;
702		__u64 blob_addr_32;
703		__u64 blob_addr_64;
704		__u8 blob_size_32;
705		__u8 blob_size_64;
706		__u8 pad2[30];
707	};
708	
709	
710	4.29 KVM_GET_CLOCK
711	
712	Capability: KVM_CAP_ADJUST_CLOCK
713	Architectures: x86
714	Type: vm ioctl
715	Parameters: struct kvm_clock_data (out)
716	Returns: 0 on success, -1 on error
717	
718	Gets the current timestamp of kvmclock as seen by the current guest. In
719	conjunction with KVM_SET_CLOCK, it is used to ensure monotonicity in scenarios
720	such as migration.
721	
722	struct kvm_clock_data {
723		__u64 clock;  /* kvmclock current value */
724		__u32 flags;
725		__u32 pad[9];
726	};
727	
728	
729	4.30 KVM_SET_CLOCK
730	
731	Capability: KVM_CAP_ADJUST_CLOCK
732	Architectures: x86
733	Type: vm ioctl
734	Parameters: struct kvm_clock_data (in)
735	Returns: 0 on success, -1 on error
736	
737	Sets the current timestamp of kvmclock to the value specified in its parameter.
738	In conjunction with KVM_GET_CLOCK, it is used to ensure monotonicity in scenarios
739	such as migration.
740	
741	struct kvm_clock_data {
742		__u64 clock;  /* kvmclock current value */
743		__u32 flags;
744		__u32 pad[9];
745	};
746	
747	
748	4.31 KVM_GET_VCPU_EVENTS
749	
750	Capability: KVM_CAP_VCPU_EVENTS
751	Extended by: KVM_CAP_INTR_SHADOW
752	Architectures: x86
753	Type: vcpu ioctl
754	Parameters: struct kvm_vcpu_events (out)
755	Returns: 0 on success, -1 on error
756	
757	Gets currently pending exceptions, interrupts, and NMIs as well as related
758	states of the vcpu.
759	
760	struct kvm_vcpu_events {
761		struct {
762			__u8 injected;
763			__u8 nr;
764			__u8 has_error_code;
765			__u8 pad;
766			__u32 error_code;
767		} exception;
768		struct {
769			__u8 injected;
770			__u8 nr;
771			__u8 soft;
772			__u8 shadow;
773		} interrupt;
774		struct {
775			__u8 injected;
776			__u8 pending;
777			__u8 masked;
778			__u8 pad;
779		} nmi;
780		__u32 sipi_vector;
781		__u32 flags;
782	};
783	
784	KVM_VCPUEVENT_VALID_SHADOW may be set in the flags field to signal that
785	interrupt.shadow contains a valid state. Otherwise, this field is undefined.
786	
787	
788	4.32 KVM_SET_VCPU_EVENTS
789	
790	Capability: KVM_CAP_VCPU_EVENTS
791	Extended by: KVM_CAP_INTR_SHADOW
792	Architectures: x86
793	Type: vcpu ioctl
794	Parameters: struct kvm_vcpu_events (in)
795	Returns: 0 on success, -1 on error
796	
797	Set pending exceptions, interrupts, and NMIs as well as related states of the
798	vcpu.
799	
800	See KVM_GET_VCPU_EVENTS for the data structure.
801	
802	Fields that may be modified asynchronously by running VCPUs can be excluded
803	from the update. These fields are nmi.pending and sipi_vector. Keep the
804	corresponding bits in the flags field cleared to suppress overwriting the
805	current in-kernel state. The bits are:
806	
807	KVM_VCPUEVENT_VALID_NMI_PENDING - transfer nmi.pending to the kernel
808	KVM_VCPUEVENT_VALID_SIPI_VECTOR - transfer sipi_vector
809	
810	If KVM_CAP_INTR_SHADOW is available, KVM_VCPUEVENT_VALID_SHADOW can be set in
811	the flags field to signal that interrupt.shadow contains a valid state and
812	shall be written into the VCPU.
813	
814	
815	4.33 KVM_GET_DEBUGREGS
816	
817	Capability: KVM_CAP_DEBUGREGS
818	Architectures: x86
819	Type: vcpu ioctl
820	Parameters: struct kvm_debugregs (out)
821	Returns: 0 on success, -1 on error
822	
823	Reads debug registers from the vcpu.
824	
825	struct kvm_debugregs {
826		__u64 db[4];
827		__u64 dr6;
828		__u64 dr7;
829		__u64 flags;
830		__u64 reserved[9];
831	};
832	
833	
834	4.34 KVM_SET_DEBUGREGS
835	
836	Capability: KVM_CAP_DEBUGREGS
837	Architectures: x86
838	Type: vcpu ioctl
839	Parameters: struct kvm_debugregs (in)
840	Returns: 0 on success, -1 on error
841	
842	Writes debug registers into the vcpu.
843	
844	See KVM_GET_DEBUGREGS for the data structure. The flags field is not yet
845	used and must be cleared on entry.
846	
847	
848	4.35 KVM_SET_USER_MEMORY_REGION
849	
850	Capability: KVM_CAP_USER_MEMORY
851	Architectures: all
852	Type: vm ioctl
853	Parameters: struct kvm_userspace_memory_region (in)
854	Returns: 0 on success, -1 on error
855	
856	struct kvm_userspace_memory_region {
857		__u32 slot;
858		__u32 flags;
859		__u64 guest_phys_addr;
860		__u64 memory_size; /* bytes */
861		__u64 userspace_addr; /* start of the userspace allocated memory */
862	};
863	
864	/* for kvm_memory_region::flags */
865	#define KVM_MEM_LOG_DIRTY_PAGES	(1UL << 0)
866	#define KVM_MEM_READONLY	(1UL << 1)
867	
868	This ioctl allows the user to create or modify a guest physical memory
869	slot.  When changing an existing slot, it may be moved in the guest
870	physical memory space, or its flags may be modified.  It may not be
871	resized.  Slots may not overlap in guest physical address space.
872	
873	Memory for the region is taken starting at the address denoted by the
874	field userspace_addr, which must point at user addressable memory for
875	the entire memory slot size.  Any object may back this memory, including
876	anonymous memory, ordinary files, and hugetlbfs.
877	
878	It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
879	be identical.  This allows large pages in the guest to be backed by large
880	pages in the host.
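
The alignment recommendation above amounts to a simple bit test.  A sketch,
with the macro and function names invented for this example:

```c
#include <stdint.h>

#define LARGE_PAGE_MASK ((UINT64_C(1) << 21) - 1)  /* lower 21 bits: 2 MiB */

/*
 * Returns 1 if guest_phys_addr and userspace_addr agree in their lower
 * 21 bits, i.e. both have the same offset within a 2 MiB large page.
 */
static int large_page_friendly(uint64_t guest_phys_addr,
			       uint64_t userspace_addr)
{
	return ((guest_phys_addr ^ userspace_addr) & LARGE_PAGE_MASK) == 0;
}
```

For instance, guest physical address 0x100000 paired with a host mapping at
0x7f0000300000 passes the test, since both are 1 MiB into a 2 MiB page.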
881	
882	The flags field supports two flags: KVM_MEM_LOG_DIRTY_PAGES and
883	KVM_MEM_READONLY.  The former can be set to instruct KVM to keep track of
884	writes to memory within the slot.  See KVM_GET_DIRTY_LOG ioctl to know how to
885	use it.  The latter can be set, if KVM_CAP_READONLY_MEM capability allows it,
886	to make a new slot read-only.  In this case, writes to this memory will be
887	posted to userspace as KVM_EXIT_MMIO exits.
888	
889	When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
890	the memory region are automatically reflected into the guest.  For example, an
891	mmap() that affects the region will be made visible immediately.  Another
892	example is madvise(MADV_DONTNEED).
893	
894	It is recommended to use this API instead of the KVM_SET_MEMORY_REGION ioctl.
895	KVM_SET_MEMORY_REGION did not allow fine-grained control over memory
896	allocation and has been removed (see section 4.6).
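
Registering a slot could be sketched as below.  The helper name
set_guest_ram is an assumption of this example, and host_mem is assumed to
point at page-aligned, user-addressable memory of at least 'size' bytes
that the caller obtained beforehand (for example with mmap()).

```c
#include <linux/kvm.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/*
 * Create or modify memory slot 'slot' of 'vm_fd' so that guest physical
 * range [guest_phys_addr, guest_phys_addr + size) is backed by 'host_mem'.
 * Returns 0 on success, -1 on error.
 */
static int set_guest_ram(int vm_fd, uint32_t slot, uint64_t guest_phys_addr,
			 void *host_mem, uint64_t size)
{
	struct kvm_userspace_memory_region region;

	memset(&region, 0, sizeof(region));
	region.slot = slot;
	region.flags = 0;  /* or KVM_MEM_LOG_DIRTY_PAGES to track writes */
	region.guest_phys_addr = guest_phys_addr;
	region.memory_size = size;
	region.userspace_addr = (uint64_t)(uintptr_t)host_mem;

	return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
}
```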
897	
898	
899	4.36 KVM_SET_TSS_ADDR
900	
901	Capability: KVM_CAP_SET_TSS_ADDR
902	Architectures: x86
903	Type: vm ioctl
904	Parameters: unsigned long tss_address (in)
905	Returns: 0 on success, -1 on error
906	
907	This ioctl defines the physical address of a three-page region in the guest
908	physical address space.  The region must be within the first 4GB of the
909	guest physical address space and must not conflict with any memory slot
910	or any mmio address.  The guest may malfunction if it accesses this memory
911	region.
912	
913	This ioctl is required on Intel-based hosts.  This is needed on Intel hardware
914	because of a quirk in the virtualization implementation (see the internals
915	documentation when it pops into existence).
916	
917	
918	4.37 KVM_ENABLE_CAP
919	
920	Capability: KVM_CAP_ENABLE_CAP
921	Architectures: ppc, s390
922	Type: vcpu ioctl
923	Parameters: struct kvm_enable_cap (in)
924	Returns: 0 on success; -1 on error
925	
926	Not all extensions are enabled by default. Using this ioctl the application
927	can enable an extension, making it available to the guest.
928	
929	On systems that do not support this ioctl, it always fails. On systems that
930	do support it, it only works for extensions that are supported for enablement.
931	
932	To check if a capability can be enabled, the KVM_CHECK_EXTENSION ioctl should
933	be used.
934	
935	struct kvm_enable_cap {
936	       /* in */
937	       __u32 cap;
938	
939	The capability that is supposed to get enabled.
940	
941	       __u32 flags;
942	
943	A bitfield indicating future enhancements. Has to be 0 for now.
944	
945	       __u64 args[4];
946	
947	Arguments for enabling a feature. If a feature needs initial values to
948	function properly, this is the place to put them.
949	
950	       __u8  pad[64];
951	};
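
A minimal sketch of filling in the structure above for an extension that
needs no arguments.  The helper name vcpu_enable_cap is an assumption of
this example; whether a given capability can be enabled at all should be
checked with KVM_CHECK_EXTENSION first.

```c
#include <linux/kvm.h>
#include <string.h>
#include <sys/ioctl.h>

/*
 * Enable extension 'cap' on 'vcpu_fd' with no arguments.  The structure is
 * zeroed first so that flags, args and pad are all 0.
 * Returns 0 on success, -1 on error.
 */
static int vcpu_enable_cap(int vcpu_fd, unsigned int cap)
{
	struct kvm_enable_cap req;

	memset(&req, 0, sizeof(req));
	req.cap = cap;
	return ioctl(vcpu_fd, KVM_ENABLE_CAP, &req);
}
```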
952	
953	
954	4.38 KVM_GET_MP_STATE
955	
956	Capability: KVM_CAP_MP_STATE
957	Architectures: x86, ia64
958	Type: vcpu ioctl
959	Parameters: struct kvm_mp_state (out)
960	Returns: 0 on success; -1 on error
961	
962	struct kvm_mp_state {
963		__u32 mp_state;
964	};
965	
966	Returns the vcpu's current "multiprocessing state" (though also valid on
967	uniprocessor guests).
968	
969	Possible values are:
970	
971	 - KVM_MP_STATE_RUNNABLE:        the vcpu is currently running
972	 - KVM_MP_STATE_UNINITIALIZED:   the vcpu is an application processor (AP)
973	                                 which has not yet received an INIT signal
974	 - KVM_MP_STATE_INIT_RECEIVED:   the vcpu has received an INIT signal, and is
975	                                 now ready for a SIPI
976	 - KVM_MP_STATE_HALTED:          the vcpu has executed a HLT instruction and
977	                                 is waiting for an interrupt
978	 - KVM_MP_STATE_SIPI_RECEIVED:   the vcpu has just received a SIPI (vector
979	                                 accessible via KVM_GET_VCPU_EVENTS)
980	
981	This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
982	irqchip, the multiprocessing state must be maintained by userspace.
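
For logging or debugging, the state values listed above can be turned into
text.  This is only a sketch: both function names and the label strings are
invented for this example.

```c
#include <linux/kvm.h>
#include <stddef.h>
#include <sys/ioctl.h>

/* Map an mp_state value to a short label; the labels are arbitrary. */
static const char *mp_state_name(unsigned int mp_state)
{
	switch (mp_state) {
	case KVM_MP_STATE_RUNNABLE:       return "runnable";
	case KVM_MP_STATE_UNINITIALIZED:  return "uninitialized";
	case KVM_MP_STATE_INIT_RECEIVED:  return "init-received";
	case KVM_MP_STATE_HALTED:         return "halted";
	case KVM_MP_STATE_SIPI_RECEIVED:  return "sipi-received";
	default:                          return "unknown";
	}
}

/* Query 'vcpu_fd' and return the label of its state, or NULL on error. */
static const char *vcpu_mp_state_name(int vcpu_fd)
{
	struct kvm_mp_state state;

	if (ioctl(vcpu_fd, KVM_GET_MP_STATE, &state) < 0)
		return NULL;
	return mp_state_name(state.mp_state);
}
```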
983	
984	
985	4.39 KVM_SET_MP_STATE
986	
987	Capability: KVM_CAP_MP_STATE
988	Architectures: x86, ia64
989	Type: vcpu ioctl
990	Parameters: struct kvm_mp_state (in)
991	Returns: 0 on success; -1 on error
992	
993	Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for
994	arguments.
995	
996	This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
997	irqchip, the multiprocessing state must be maintained by userspace.
998	
999	
1000	4.40 KVM_SET_IDENTITY_MAP_ADDR
1001	
1002	Capability: KVM_CAP_SET_IDENTITY_MAP_ADDR
1003	Architectures: x86
1004	Type: vm ioctl
1005	Parameters: unsigned long identity (in)
1006	Returns: 0 on success, -1 on error
1007	
1008	This ioctl defines the physical address of a one-page region in the guest
1009	physical address space.  The region must be within the first 4GB of the
1010	guest physical address space and must not conflict with any memory slot
1011	or any mmio address.  The guest may malfunction if it accesses this memory
1012	region.
1013	
1014	This ioctl is required on Intel-based hosts because of a quirk in the
1015	hardware virtualization implementation (see the internals documentation when
1016	it pops into existence).
1017	
1018	
1019	4.41 KVM_SET_BOOT_CPU_ID
1020	
1021	Capability: KVM_CAP_SET_BOOT_CPU_ID
1022	Architectures: x86, ia64
1023	Type: vm ioctl
1024	Parameters: unsigned long vcpu_id
1025	Returns: 0 on success, -1 on error
1026	
1027	Define which vcpu is the Bootstrap Processor (BSP).  Values are the same
1028	as the vcpu id in KVM_CREATE_VCPU.  If this ioctl is not called, the default
1029	is vcpu 0.
1030	
1031	
1032	4.42 KVM_GET_XSAVE
1033	
1034	Capability: KVM_CAP_XSAVE
1035	Architectures: x86
1036	Type: vcpu ioctl
1037	Parameters: struct kvm_xsave (out)
1038	Returns: 0 on success, -1 on error
1039	
1040	struct kvm_xsave {
1041		__u32 region[1024];
1042	};
1043	
1044	This ioctl copies the current vcpu's xsave struct to userspace.
1045	
1046	
1047	4.43 KVM_SET_XSAVE
1048	
1049	Capability: KVM_CAP_XSAVE
1050	Architectures: x86
1051	Type: vcpu ioctl
1052	Parameters: struct kvm_xsave (in)
1053	Returns: 0 on success, -1 on error
1054	
1055	struct kvm_xsave {
1056		__u32 region[1024];
1057	};
1058	
1059	This ioctl copies userspace's xsave struct to the kernel.
1060	
1061	
1062	4.44 KVM_GET_XCRS
1063	
1064	Capability: KVM_CAP_XCRS
1065	Architectures: x86
1066	Type: vcpu ioctl
1067	Parameters: struct kvm_xcrs (out)
1068	Returns: 0 on success, -1 on error
1069	
1070	struct kvm_xcr {
1071		__u32 xcr;
1072		__u32 reserved;
1073		__u64 value;
1074	};
1075	
1076	struct kvm_xcrs {
1077		__u32 nr_xcrs;
1078		__u32 flags;
1079		struct kvm_xcr xcrs[KVM_MAX_XCRS];
1080		__u64 padding[16];
1081	};
1082	
1083	This ioctl copies the current vcpu's xcrs to userspace.
1084	
1085	
1086	4.45 KVM_SET_XCRS
1087	
1088	Capability: KVM_CAP_XCRS
1089	Architectures: x86
1090	Type: vcpu ioctl
1091	Parameters: struct kvm_xcrs (in)
1092	Returns: 0 on success, -1 on error
1093	
1094	struct kvm_xcr {
1095		__u32 xcr;
1096		__u32 reserved;
1097		__u64 value;
1098	};
1099	
1100	struct kvm_xcrs {
1101		__u32 nr_xcrs;
1102		__u32 flags;
1103		struct kvm_xcr xcrs[KVM_MAX_XCRS];
1104		__u64 padding[16];
1105	};
1106	
1107	This ioctl sets the vcpu's xcrs to the values specified by userspace.
1108	
1109	
1110	4.46 KVM_GET_SUPPORTED_CPUID
1111	
1112	Capability: KVM_CAP_EXT_CPUID
1113	Architectures: x86
1114	Type: system ioctl
1115	Parameters: struct kvm_cpuid2 (in/out)
1116	Returns: 0 on success, -1 on error
1117	
1118	struct kvm_cpuid2 {
1119		__u32 nent;
1120		__u32 padding;
1121		struct kvm_cpuid_entry2 entries[0];
1122	};
1123	
1124	#define KVM_CPUID_FLAG_SIGNIFCANT_INDEX 1
1125	#define KVM_CPUID_FLAG_STATEFUL_FUNC    2
1126	#define KVM_CPUID_FLAG_STATE_READ_NEXT  4
1127	
1128	struct kvm_cpuid_entry2 {
1129		__u32 function;
1130		__u32 index;
1131		__u32 flags;
1132		__u32 eax;
1133		__u32 ebx;
1134		__u32 ecx;
1135		__u32 edx;
1136		__u32 padding[3];
1137	};
1138	
1139	This ioctl returns x86 cpuid features which are supported by both the hardware
1140	and kvm.  Userspace can use the information returned by this ioctl to
1141	construct cpuid information (for KVM_SET_CPUID2) that is consistent with
1142	hardware, kernel, and userspace capabilities, and with user requirements (for
1143	example, the user may wish to constrain cpuid to emulate older hardware,
1144	or for feature consistency across a cluster).
1145	
1146	Userspace invokes KVM_GET_SUPPORTED_CPUID by passing a kvm_cpuid2 structure
1147	with the 'nent' field indicating the number of entries in the variable-size
1148	array 'entries'.  If the number of entries is too low to describe the cpu
1149	capabilities, an error (E2BIG) is returned.  If the number is too high,
1150	the 'nent' field is adjusted and an error (ENOMEM) is returned.  If the
1151	number is just right, the 'nent' field is adjusted to the number of valid
1152	entries in the 'entries' array, which is then filled.
1153	
1154	The entries returned are the host cpuid as returned by the cpuid instruction,
1155	with unknown or unsupported features masked out.  Some features (for example,
1156	x2apic) may not be present in the host cpu, but are exposed by kvm if it can
1157	emulate them efficiently. The fields in each entry are defined as follows:
1158	
1159	  function: the eax value used to obtain the entry
1160	  index: the ecx value used to obtain the entry (for entries that are
1161	         affected by ecx)
1162	  flags: an OR of zero or more of the following:
1163	        KVM_CPUID_FLAG_SIGNIFCANT_INDEX:
1164	           if the index field is valid
1165	        KVM_CPUID_FLAG_STATEFUL_FUNC:
1166	           if cpuid for this function returns different values for successive
1167	           invocations; there will be several entries with the same function,
1168	           all with this flag set
1169	        KVM_CPUID_FLAG_STATE_READ_NEXT:
1170	           for KVM_CPUID_FLAG_STATEFUL_FUNC entries, set if this entry is
1171	           the first entry to be read by a cpu
1172	  eax, ebx, ecx, edx: the values returned by the cpuid instruction for
1173	         this function/index combination
1174	
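As an illustration of the field definitions above, the following sketch looks
up one function/index pair in an entries array that KVM_GET_SUPPORTED_CPUID
has already filled.  The struct and flag are local mirrors of the definitions
in this section (real code would use <linux/kvm.h>), and find_cpuid_entry is
a hypothetical helper name, not part of the kvm API.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct kvm_cpuid_entry2 as listed in this section. */
struct cpuid_entry2 {
	uint32_t function;
	uint32_t index;
	uint32_t flags;
	uint32_t eax;
	uint32_t ebx;
	uint32_t ecx;
	uint32_t edx;
	uint32_t padding[3];
};

/* Mirrors KVM_CPUID_FLAG_SIGNIFCANT_INDEX (spelling as in the header). */
#define CPUID_FLAG_SIGNIFCANT_INDEX 1

/*
 * Hypothetical helper: return the entry for (function, index), or NULL.
 * The index is compared only when the entry marks it as significant.
 */
static const struct cpuid_entry2 *
find_cpuid_entry(const struct cpuid_entry2 *entries, uint32_t nent,
		 uint32_t function, uint32_t index)
{
	for (uint32_t i = 0; i < nent; i++) {
		const struct cpuid_entry2 *e = &entries[i];

		if (e->function != function)
			continue;
		if ((e->flags & CPUID_FLAG_SIGNIFCANT_INDEX) &&
		    e->index != index)
			continue;
		return e;
	}
	return NULL;
}
```
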
1175	The TSC deadline timer feature (CPUID leaf 1, ecx[24]) is always returned
1176	as false, since the feature depends on KVM_CREATE_IRQCHIP for local APIC
1177	support.  Instead it is reported via
1178	
1179	  ioctl(KVM_CHECK_EXTENSION, KVM_CAP_TSC_DEADLINE_TIMER)
1180	
1181	If that returns true and you use KVM_CREATE_IRQCHIP, or if you emulate the
1182	feature in userspace, then you can enable the feature for KVM_SET_CPUID2.
1183	
1184	
1185	4.47 KVM_PPC_GET_PVINFO
1186	
1187	Capability: KVM_CAP_PPC_GET_PVINFO
1188	Architectures: ppc
1189	Type: vm ioctl
1190	Parameters: struct kvm_ppc_pvinfo (out)
1191	Returns: 0 on success, !0 on error
1192	
1193	struct kvm_ppc_pvinfo {
1194		__u32 flags;
1195		__u32 hcall[4];
1196		__u8  pad[108];
1197	};
1198	
1199	This ioctl fetches PV specific information that needs to be passed to the guest
1200	using the device tree or other means from vm context.
1201	
1202	The hcall array defines 4 instructions that make up a hypercall.
1203	
1204	If any additional field gets added to this structure later on, a bit for that
1205	additional piece of information will be set in the flags bitmap.
1206	
1207	The flags bitmap is defined as:
1208	
1209	   /* the host supports the ePAPR idle hcall */
1210	   #define KVM_PPC_PVINFO_FLAGS_EV_IDLE   (1<<0)
1211	
1212	4.48 KVM_ASSIGN_PCI_DEVICE
1213	
1214	Capability: KVM_CAP_DEVICE_ASSIGNMENT
1215	Architectures: x86, ia64
1216	Type: vm ioctl
1217	Parameters: struct kvm_assigned_pci_dev (in)
1218	Returns: 0 on success, -1 on error
1219	
1220	Assigns a host PCI device to the VM.
1221	
1222	struct kvm_assigned_pci_dev {
1223		__u32 assigned_dev_id;
1224		__u32 busnr;
1225		__u32 devfn;
1226		__u32 flags;
1227		__u32 segnr;
1228		union {
1229			__u32 reserved[11];
1230		};
1231	};
1232	
1233	The PCI device is specified by the triple segnr, busnr, and devfn.
1234	Identification in succeeding service requests is done via assigned_dev_id. The
1235	following flags are specified:
1236	
1237	/* Depends on KVM_CAP_IOMMU */
1238	#define KVM_DEV_ASSIGN_ENABLE_IOMMU	(1 << 0)
1239	/* The following two depend on KVM_CAP_PCI_2_3 */
1240	#define KVM_DEV_ASSIGN_PCI_2_3		(1 << 1)
1241	#define KVM_DEV_ASSIGN_MASK_INTX	(1 << 2)
1242	
1243	If KVM_DEV_ASSIGN_PCI_2_3 is set, the kernel will manage legacy INTx interrupts
1244	via the PCI-2.3-compliant device-level mask, thus enabling IRQ sharing with other
1245	assigned devices or host devices. KVM_DEV_ASSIGN_MASK_INTX specifies the
1246	guest's view of the INTx mask; see KVM_ASSIGN_SET_INTX_MASK for details.
1247	
1248	The KVM_DEV_ASSIGN_ENABLE_IOMMU flag is a mandatory option to ensure
1249	isolation of the device.  Usages not specifying this flag are deprecated.
1250	
1251	Only PCI header type 0 devices with PCI BAR resources are supported by
1252	device assignment.  The user requesting this ioctl must have read/write
1253	access to the PCI sysfs resource files associated with the device.
1254	
1255	
1256	4.49 KVM_DEASSIGN_PCI_DEVICE
1257	
1258	Capability: KVM_CAP_DEVICE_DEASSIGNMENT
1259	Architectures: x86, ia64
1260	Type: vm ioctl
1261	Parameters: struct kvm_assigned_pci_dev (in)
1262	Returns: 0 on success, -1 on error
1263	
1264	Ends PCI device assignment, releasing all associated resources.
1265	
1266	See KVM_ASSIGN_PCI_DEVICE for the data structure. Only assigned_dev_id is
1267	used in kvm_assigned_pci_dev to identify the device.
1268	
1269	
1270	4.50 KVM_ASSIGN_DEV_IRQ
1271	
1272	Capability: KVM_CAP_ASSIGN_DEV_IRQ
1273	Architectures: x86, ia64
1274	Type: vm ioctl
1275	Parameters: struct kvm_assigned_irq (in)
1276	Returns: 0 on success, -1 on error
1277	
1278	Assigns an IRQ to a passed-through device.
1279	
1280	struct kvm_assigned_irq {
1281		__u32 assigned_dev_id;
1282		__u32 host_irq; /* ignored (legacy field) */
1283		__u32 guest_irq;
1284		__u32 flags;
1285		union {
1286			__u32 reserved[12];
1287		};
1288	};
1289	
1290	The following flags are defined:
1291	
1292	#define KVM_DEV_IRQ_HOST_INTX    (1 << 0)
1293	#define KVM_DEV_IRQ_HOST_MSI     (1 << 1)
1294	#define KVM_DEV_IRQ_HOST_MSIX    (1 << 2)
1295	
1296	#define KVM_DEV_IRQ_GUEST_INTX   (1 << 8)
1297	#define KVM_DEV_IRQ_GUEST_MSI    (1 << 9)
1298	#define KVM_DEV_IRQ_GUEST_MSIX   (1 << 10)
1299	
1300	It is not valid to specify multiple types per host or guest IRQ. However, the
1301	IRQ type of host and guest can differ or can even be null.
1302	
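The "one type per side" rule above can be checked in userspace before issuing
the ioctl.  The sketch below is a minimal illustration: the flag values are
copied from this section, while the two mask constants and the helper names
are assumptions made for the example (they treat the low byte as the host
types and the next byte as the guest types).

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as listed in this section. */
#define DEV_IRQ_HOST_INTX   (1u << 0)
#define DEV_IRQ_HOST_MSI    (1u << 1)
#define DEV_IRQ_HOST_MSIX   (1u << 2)
#define DEV_IRQ_GUEST_INTX  (1u << 8)
#define DEV_IRQ_GUEST_MSI   (1u << 9)
#define DEV_IRQ_GUEST_MSIX  (1u << 10)

/* Assumed masks: host types in bits 0-7, guest types in bits 8-15. */
#define DEV_IRQ_HOST_MASK   0x00ffu
#define DEV_IRQ_GUEST_MASK  0xff00u

/* True if zero or exactly one bit is set in v. */
static bool at_most_one_bit(uint32_t v)
{
	return (v & (v - 1)) == 0;
}

/* Hypothetical helper: accept at most one host and one guest IRQ type. */
static bool assigned_irq_flags_valid(uint32_t flags)
{
	return at_most_one_bit(flags & DEV_IRQ_HOST_MASK) &&
	       at_most_one_bit(flags & DEV_IRQ_GUEST_MASK);
}
```

The helper accepts an empty host or guest side, matching the statement that
either type can be null.  It does not reject bits outside the defined flags.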
1303	
1304	4.51 KVM_DEASSIGN_DEV_IRQ
1305	
1306	Capability: KVM_CAP_ASSIGN_DEV_IRQ
1307	Architectures: x86, ia64
1308	Type: vm ioctl
1309	Parameters: struct kvm_assigned_irq (in)
1310	Returns: 0 on success, -1 on error
1311	
1312	Ends an IRQ assignment to a passed-through device.
1313	
1314	See KVM_ASSIGN_DEV_IRQ for the data structure. The target device is specified
1315	by assigned_dev_id; flags must correspond to the IRQ type specified on
1316	KVM_ASSIGN_DEV_IRQ. Partial deassignment of host or guest IRQ is allowed.
1317	
1318	
1319	4.52 KVM_SET_GSI_ROUTING
1320	
1321	Capability: KVM_CAP_IRQ_ROUTING
1322	Architectures: x86, ia64
1323	Type: vm ioctl
1324	Parameters: struct kvm_irq_routing (in)
1325	Returns: 0 on success, -1 on error
1326	
1327	Sets the GSI routing table entries, overwriting any previously set entries.
1328	
1329	struct kvm_irq_routing {
1330		__u32 nr;
1331		__u32 flags;
1332		struct kvm_irq_routing_entry entries[0];
1333	};
1334	
1335	No flags are specified so far; the corresponding field must be set to zero.
1336	
1337	struct kvm_irq_routing_entry {
1338		__u32 gsi;
1339		__u32 type;
1340		__u32 flags;
1341		__u32 pad;
1342		union {
1343			struct kvm_irq_routing_irqchip irqchip;
1344			struct kvm_irq_routing_msi msi;
1345			__u32 pad[8];
1346		} u;
1347	};
1348	
1349	/* gsi routing entry types */
1350	#define KVM_IRQ_ROUTING_IRQCHIP 1
1351	#define KVM_IRQ_ROUTING_MSI 2
1352	
1353	No flags are specified so far; the corresponding field must be set to zero.
1354	
1355	struct kvm_irq_routing_irqchip {
1356		__u32 irqchip;
1357		__u32 pin;
1358	};
1359	
1360	struct kvm_irq_routing_msi {
1361		__u32 address_lo;
1362		__u32 address_hi;
1363		__u32 data;
1364		__u32 pad;
1365	};
1366	
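Because struct kvm_irq_routing ends in a variable-size array, userspace has
to allocate the header and the entries as one block.  The following sketch
shows one way to build such a table.  The structures are local mirrors of the
ones above, and the helper names are hypothetical; real code would include
<linux/kvm.h> and pass the result to the KVM_SET_GSI_ROUTING ioctl.

```c
#include <stdint.h>
#include <stdlib.h>

/* Local mirrors of the structures listed in this section. */
struct routing_irqchip {
	uint32_t irqchip;
	uint32_t pin;
};

struct routing_msi {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
	uint32_t pad;
};

struct routing_entry {
	uint32_t gsi;
	uint32_t type;
	uint32_t flags;
	uint32_t pad;
	union {
		struct routing_irqchip irqchip;
		struct routing_msi msi;
		uint32_t pad[8];
	} u;
};

struct routing_table {
	uint32_t nr;
	uint32_t flags;
	struct routing_entry entries[];
};

#define ROUTING_IRQCHIP 1	/* mirrors KVM_IRQ_ROUTING_IRQCHIP */
#define ROUTING_MSI     2	/* mirrors KVM_IRQ_ROUTING_MSI */

/*
 * Allocate a zeroed table with room for nr entries.  Zeroing also leaves
 * both flags fields at zero, as required.  The caller releases the table
 * with free().
 */
static struct routing_table *routing_table_alloc(uint32_t nr)
{
	struct routing_table *t =
		calloc(1, sizeof(*t) + (size_t)nr * sizeof(t->entries[0]));

	if (t)
		t->nr = nr;
	return t;
}

/* Route a GSI to a pin of an in-kernel irqchip. */
static void route_gsi_to_pin(struct routing_entry *e, uint32_t gsi,
			     uint32_t irqchip, uint32_t pin)
{
	e->gsi = gsi;
	e->type = ROUTING_IRQCHIP;
	e->u.irqchip.irqchip = irqchip;
	e->u.irqchip.pin = pin;
}

/* Route a GSI to an MSI message. */
static void route_gsi_to_msi(struct routing_entry *e, uint32_t gsi,
			     uint64_t address, uint32_t data)
{
	e->gsi = gsi;
	e->type = ROUTING_MSI;
	e->u.msi.address_lo = (uint32_t)address;
	e->u.msi.address_hi = (uint32_t)(address >> 32);
	e->u.msi.data = data;
}
```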
1367	
1368	4.53 KVM_ASSIGN_SET_MSIX_NR
1369	
1370	Capability: KVM_CAP_DEVICE_MSIX
1371	Architectures: x86, ia64
1372	Type: vm ioctl
1373	Parameters: struct kvm_assigned_msix_nr (in)
1374	Returns: 0 on success, -1 on error
1375	
1376	Set the number of MSI-X interrupts for an assigned device. The number is
1377	reset again by terminating the MSI-X assignment of the device via
1378	KVM_DEASSIGN_DEV_IRQ. Calling this service more than once at any earlier
1379	point will fail.
1380	
1381	struct kvm_assigned_msix_nr {
1382		__u32 assigned_dev_id;
1383		__u16 entry_nr;
1384		__u16 padding;
1385	};
1386	
1387	#define KVM_MAX_MSIX_PER_DEV		256
1388	
1389	
1390	4.54 KVM_ASSIGN_SET_MSIX_ENTRY
1391	
1392	Capability: KVM_CAP_DEVICE_MSIX
1393	Architectures: x86, ia64
1394	Type: vm ioctl
1395	Parameters: struct kvm_assigned_msix_entry (in)
1396	Returns: 0 on success, -1 on error
1397	
1398	Specifies the routing of an MSI-X assigned device interrupt to a GSI. Setting
1399	the GSI vector to zero disables the interrupt.
1400	
1401	struct kvm_assigned_msix_entry {
1402		__u32 assigned_dev_id;
1403		__u32 gsi;
1404		__u16 entry; /* The index of entry in the MSI-X table */
1405		__u16 padding[3];
1406	};
1407	
1408	
1409	4.55 KVM_SET_TSC_KHZ
1410	
1411	Capability: KVM_CAP_TSC_CONTROL
1412	Architectures: x86
1413	Type: vcpu ioctl
1414	Parameters: virtual tsc_khz
1415	Returns: 0 on success, -1 on error
1416	
1417	Specifies the tsc frequency for the virtual machine. The unit of the
1418	frequency is KHz.
1419	
1420	
1421	4.56 KVM_GET_TSC_KHZ
1422	
1423	Capability: KVM_CAP_GET_TSC_KHZ
1424	Architectures: x86
1425	Type: vcpu ioctl
1426	Parameters: none
1427	Returns: virtual tsc-khz on success, negative value on error
1428	
1429	Returns the tsc frequency of the guest. The unit of the return value is
1430	KHz. If the host has an unstable tsc, this ioctl returns -EIO as an error
1431	instead.
1432	
1433	
1434	4.57 KVM_GET_LAPIC
1435	
1436	Capability: KVM_CAP_IRQCHIP
1437	Architectures: x86
1438	Type: vcpu ioctl
1439	Parameters: struct kvm_lapic_state (out)
1440	Returns: 0 on success, -1 on error
1441	
1442	#define KVM_APIC_REG_SIZE 0x400
1443	struct kvm_lapic_state {
1444		char regs[KVM_APIC_REG_SIZE];
1445	};
1446	
1447	Reads the Local APIC registers and copies them into the input argument.  The
1448	data format and layout are the same as documented in the architecture manual.
1449	
1450	
1451	4.58 KVM_SET_LAPIC
1452	
1453	Capability: KVM_CAP_IRQCHIP
1454	Architectures: x86
1455	Type: vcpu ioctl
1456	Parameters: struct kvm_lapic_state (in)
1457	Returns: 0 on success, -1 on error
1458	
1459	#define KVM_APIC_REG_SIZE 0x400
1460	struct kvm_lapic_state {
1461		char regs[KVM_APIC_REG_SIZE];
1462	};
1463	
1464	Copies the input argument into the Local APIC registers.  The data format
1465	and layout are the same as documented in the architecture manual.
1466	
1467	
1468	4.59 KVM_IOEVENTFD
1469	
1470	Capability: KVM_CAP_IOEVENTFD
1471	Architectures: all
1472	Type: vm ioctl
1473	Parameters: struct kvm_ioeventfd (in)
1474	Returns: 0 on success, !0 on error
1475	
1476	This ioctl attaches or detaches an ioeventfd to a legal pio/mmio address
1477	within the guest.  A guest write to the registered address will signal the
1478	provided event instead of triggering an exit.
1479	
1480	struct kvm_ioeventfd {
1481		__u64 datamatch;
1482		__u64 addr;        /* legal pio/mmio address */
1483		__u32 len;         /* 1, 2, 4, or 8 bytes    */
1484		__s32 fd;
1485		__u32 flags;
1486		__u8  pad[36];
1487	};
1488	
1489	The following flags are defined:
1490	
1491	#define KVM_IOEVENTFD_FLAG_DATAMATCH (1 << kvm_ioeventfd_flag_nr_datamatch)
1492	#define KVM_IOEVENTFD_FLAG_PIO       (1 << kvm_ioeventfd_flag_nr_pio)
1493	#define KVM_IOEVENTFD_FLAG_DEASSIGN  (1 << kvm_ioeventfd_flag_nr_deassign)
1494	
1495	If the datamatch flag is set, the event will be signaled only if the value
1496	written to the registered address is equal to datamatch in struct kvm_ioeventfd.
1497	
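A request for this ioctl could be prepared as in the sketch below.  The
struct is a local mirror of struct kvm_ioeventfd, the helper name is
hypothetical, and the bit numbers behind the three flags (0, 1 and 2) are an
assumption here because this section only names the kvm_ioeventfd_flag_nr_*
constants; check them against <linux/kvm.h>.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct kvm_ioeventfd as listed in this section. */
struct ioeventfd {
	uint64_t datamatch;
	uint64_t addr;		/* legal pio/mmio address */
	uint32_t len;		/* 1, 2, 4, or 8 bytes    */
	int32_t  fd;
	uint32_t flags;
	uint8_t  pad[36];
};

/* Assumed bit positions for the kvm_ioeventfd_flag_nr_* constants. */
#define IOEVENTFD_FLAG_DATAMATCH (1u << 0)
#define IOEVENTFD_FLAG_PIO       (1u << 1)
#define IOEVENTFD_FLAG_DEASSIGN  (1u << 2)

/*
 * Hypothetical helper: fill an attach request.  Pass datamatch == NULL to
 * signal the eventfd on every write to addr.  Returns false if len is not
 * one of the documented sizes.
 */
static bool ioeventfd_fill(struct ioeventfd *req, uint64_t addr, uint32_t len,
			   int32_t fd, bool pio, const uint64_t *datamatch)
{
	if (len != 1 && len != 2 && len != 4 && len != 8)
		return false;

	memset(req, 0, sizeof(*req));	/* also clears the padding */
	req->addr = addr;
	req->len = len;
	req->fd = fd;
	if (pio)
		req->flags |= IOEVENTFD_FLAG_PIO;
	if (datamatch) {
		req->flags |= IOEVENTFD_FLAG_DATAMATCH;
		req->datamatch = *datamatch;
	}
	return true;
}
```

To detach, the same request would be sent again with the deassign flag set in
addition to the flags used when attaching.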
1498	
1499	4.60 KVM_DIRTY_TLB
1500	
1501	Capability: KVM_CAP_SW_TLB
1502	Architectures: ppc
1503	Type: vcpu ioctl
1504	Parameters: struct kvm_dirty_tlb (in)
1505	Returns: 0 on success, -1 on error
1506	
1507	struct kvm_dirty_tlb {
1508		__u64 bitmap;
1509		__u32 num_dirty;
1510	};
1511	
1512	This must be called whenever userspace has changed an entry in the shared
1513	TLB, prior to calling KVM_RUN on the associated vcpu.
1514	
1515	The "bitmap" field is the userspace address of an array.  This array
1516	consists of a number of bits, equal to the total number of TLB entries as
1517	determined by the last successful call to KVM_CONFIG_TLB, rounded up to the
1518	nearest multiple of 64.
1519	
1520	Each bit corresponds to one TLB entry, ordered the same as in the shared TLB
1521	array.
1522	
1523	The array is little-endian: bit 0 is the least significant bit of the
1524	first byte, bit 8 is the least significant bit of the second byte, etc.
1525	This avoids any complications with differing word sizes.
1526	
1527	The "num_dirty" field is a performance hint for KVM to determine whether it
1528	should skip processing the bitmap and just invalidate everything.  It must
1529	be set to the number of set bits in the bitmap.
1530	
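The bitmap layout described above can be handled with byte-wise helpers such
as the ones sketched here.  The function names are hypothetical; the result
would be passed to the ioctl by storing the array's address in "bitmap" and
the counted value in "num_dirty".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bytes needed for a bitmap covering nr_entries TLB entries, with the bit
 * count rounded up to a multiple of 64.
 */
static size_t dirty_bitmap_bytes(uint32_t nr_entries)
{
	return (((size_t)nr_entries + 63) / 64) * 8;
}

/*
 * Mark TLB entry 'index' dirty.  Addressing the array byte by byte yields
 * the little-endian bit order independent of the host word size.
 */
static void dirty_bitmap_set(uint8_t *bitmap, uint32_t nr_entries,
			     uint32_t index)
{
	assert(index < nr_entries);
	bitmap[index / 8] |= (uint8_t)(1u << (index % 8));
}

/* Count the set bits; this is the value expected in num_dirty. */
static uint32_t dirty_bitmap_count(const uint8_t *bitmap, size_t bytes)
{
	uint32_t n = 0;

	for (size_t i = 0; i < bytes; i++)
		for (uint8_t b = bitmap[i]; b; b &= (uint8_t)(b - 1))
			n++;
	return n;
}
```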
1531	
1532	4.61 KVM_ASSIGN_SET_INTX_MASK
1533	
1534	Capability: KVM_CAP_PCI_2_3
1535	Architectures: x86
1536	Type: vm ioctl
1537	Parameters: struct kvm_assigned_pci_dev (in)
1538	Returns: 0 on success, -1 on error
1539	
1540	Allows userspace to mask PCI INTx interrupts from the assigned device.  The
1541	kernel will not deliver INTx interrupts to the guest between setting and
1542	clearing of KVM_DEV_ASSIGN_MASK_INTX via this interface.  This enables use of
1543	and emulation of PCI 2.3 INTx disable command register behavior.
1544	
1545	This may be used for both PCI 2.3 devices supporting INTx disable natively and
1546	older devices lacking this support. Userspace is responsible for emulating the
1547	read value of the INTx disable bit in the guest visible PCI command register.
1548	When modifying the INTx disable state, userspace should precede updating the
1549	physical device command register by calling this ioctl to inform the kernel of
1550	the new intended INTx mask state.
1551	
1552	Note that the kernel uses the device INTx disable bit to internally manage the
1553	device interrupt state for PCI 2.3 devices.  Reads of this register may
1554	therefore not match the expected value.  Writes should always use the guest
1555	intended INTx disable value rather than attempting to read-copy-update the
1556	current physical device state.  Races between user and kernel updates to the
1557	INTx disable bit are handled lazily in the kernel.  It's possible the device
1558	may generate unintended interrupts, but they will not be injected into the
1559	guest.
1560	
1561	See KVM_ASSIGN_PCI_DEVICE for the data structure.  The target device is specified
1562	by assigned_dev_id.  In the flags field, only KVM_DEV_ASSIGN_MASK_INTX is
1563	evaluated.
1564	
1565	
1566	4.62 KVM_CREATE_SPAPR_TCE
1567	
1568	Capability: KVM_CAP_SPAPR_TCE
1569	Architectures: powerpc
1570	Type: vm ioctl
1571	Parameters: struct kvm_create_spapr_tce (in)
1572	Returns: file descriptor for manipulating the created TCE table
1573	
1574	This creates a virtual TCE (translation control entry) table, which
1575	is an IOMMU for PAPR-style virtual I/O.  It is used to translate
1576	logical addresses used in virtual I/O into guest physical addresses,
1577	and provides a scatter/gather capability for PAPR virtual I/O.
1578	
1579	/* for KVM_CAP_SPAPR_TCE */
1580	struct kvm_create_spapr_tce {
1581		__u64 liobn;
1582		__u32 window_size;
1583	};
1584	
1585	The liobn field gives the logical IO bus number for which to create a
1586	TCE table.  The window_size field specifies the size of the DMA window
1587	which this TCE table will translate - the table will contain one 64-bit
1588	TCE entry for every 4 KiB of the DMA window.
1589	
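The relation between window_size and the table size can be worked out as in
this small sketch.  The helper names are hypothetical, and the calculation
only restates the rule above; it ignores any rounding the kernel may apply
when it allocates or maps the table.

```c
#include <stdint.h>

#define TCE_PAGE_SHIFT 12	/* one entry per 4 KiB of DMA window */

/* Number of TCE entries needed to cover the DMA window. */
static uint64_t tce_table_entries(uint32_t window_size)
{
	return (uint64_t)window_size >> TCE_PAGE_SHIFT;
}

/* Size of the table in bytes: each entry is 64 bits wide. */
static uint64_t tce_table_bytes(uint32_t window_size)
{
	return tce_table_entries(window_size) * sizeof(uint64_t);
}
```

For example, a 1 GiB window needs 262144 entries, which is 2 MiB of table.
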
1590	When the guest issues an H_PUT_TCE hcall on a liobn for which a TCE
1591	table has been created using this ioctl(), the kernel will handle it
1592	in real mode, updating the TCE table.  H_PUT_TCE calls for other
1593	liobns will cause a vm exit and must be handled by userspace.
1594	
1595	The return value is a file descriptor which can be passed to mmap(2)
1596	to map the created TCE table into userspace.  This lets userspace read
1597	the entries written by kernel-handled H_PUT_TCE calls, and also lets
1598	userspace update the TCE table directly which is useful in some
1599	circumstances.
1600	
1601	
1602	4.63 KVM_ALLOCATE_RMA
1603	
1604	Capability: KVM_CAP_PPC_RMA
1605	Architectures: powerpc
1606	Type: vm ioctl
1607	Parameters: struct kvm_allocate_rma (out)
1608	Returns: file descriptor for mapping the allocated RMA
1609	
1610	This allocates a Real Mode Area (RMA) from the pool allocated at boot
1611	time by the kernel.  An RMA is a physically-contiguous, aligned region
1612	of memory used on older POWER processors to provide the memory which
1613	will be accessed by real-mode (MMU off) accesses in a KVM guest.
1614	POWER processors support a set of sizes for the RMA that usually
1615	includes 64MB, 128MB, 256MB and some larger powers of two.
1616	
1617	/* for KVM_ALLOCATE_RMA */
1618	struct kvm_allocate_rma {
1619		__u64 rma_size;
1620	};
1621	
1622	The return value is a file descriptor which can be passed to mmap(2)
1623	to map the allocated RMA into userspace.  The mapped area can then be
1624	passed to the KVM_SET_USER_MEMORY_REGION ioctl to establish it as the
1625	RMA for a virtual machine.  The size of the RMA in bytes (which is
1626	fixed at host kernel boot time) is returned in the rma_size field of
1627	the argument structure.
1628	
1629	The KVM_CAP_PPC_RMA capability is 1 or 2 if the KVM_ALLOCATE_RMA ioctl
1630	is supported; 2 if the processor requires all virtual machines to have
1631	an RMA, or 1 if the processor can use an RMA but doesn't require it,
1632	because it supports the Virtual RMA (VRMA) facility.
1633	
1634	
1635	4.64 KVM_NMI
1636	
1637	Capability: KVM_CAP_USER_NMI
1638	Architectures: x86
1639	Type: vcpu ioctl
1640	Parameters: none
1641	Returns: 0 on success, -1 on error
1642	
1643	Queues an NMI on the thread's vcpu.  Note this is well defined only
1644	when KVM_CREATE_IRQCHIP has not been called, since this is an interface
1645	between the virtual cpu core and virtual local APIC.  After KVM_CREATE_IRQCHIP
1646	has been called, this interface is completely emulated within the kernel.
1647	
1648	To use this to emulate the LINT1 input with KVM_CREATE_IRQCHIP, use the
1649	following algorithm:
1650	
1651	  - pause the vcpu
1652	  - read the local APIC's state (KVM_GET_LAPIC)
1653	  - check whether changing LINT1 will queue an NMI (see the LVT entry for LINT1)
1654	  - if so, issue KVM_NMI
1655	  - resume the vcpu
1656	
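The LVT check in the third step might look like the sketch below, which
operates on the register page returned by KVM_GET_LAPIC.  The register offset
(0x360) and the bit positions are taken from the x86 architecture manual
rather than from this document, the helper names are hypothetical, and the
sketch does not consider whether the local APIC is software-enabled.

```c
#include <stdbool.h>
#include <stdint.h>

#define APIC_REG_SIZE     0x400		/* mirrors KVM_APIC_REG_SIZE */
#define APIC_LVT_LINT1    0x360		/* LVT LINT1 register offset */
#define LVT_MASKED        (1u << 16)	/* entry is masked */
#define LVT_DELIVERY_MASK (7u << 8)	/* delivery mode field */
#define LVT_DELIVERY_NMI  (4u << 8)	/* delivery mode 100b = NMI */

/* Read a 32-bit register from the little-endian register page. */
static uint32_t lapic_read32(const unsigned char *regs, unsigned int off)
{
	return (uint32_t)regs[off] |
	       ((uint32_t)regs[off + 1] << 8) |
	       ((uint32_t)regs[off + 2] << 16) |
	       ((uint32_t)regs[off + 3] << 24);
}

/*
 * True if asserting LINT1 would queue an NMI: the LVT entry must be
 * unmasked and programmed for NMI delivery.
 */
static bool lint1_delivers_nmi(const unsigned char *regs)
{
	uint32_t lvt = lapic_read32(regs, APIC_LVT_LINT1);

	if (lvt & LVT_MASKED)
		return false;
	return (lvt & LVT_DELIVERY_MASK) == LVT_DELIVERY_NMI;
}
```
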
1657	Some guests configure the LINT1 NMI input to cause a panic, aiding in
1658	debugging.
1659	
1660	
1661	4.65 KVM_S390_UCAS_MAP
1662	
1663	Capability: KVM_CAP_S390_UCONTROL
1664	Architectures: s390
1665	Type: vcpu ioctl
1666	Parameters: struct kvm_s390_ucas_mapping (in)
1667	Returns: 0 in case of success
1668	
1669	The parameter is defined like this:
1670		struct kvm_s390_ucas_mapping {
1671			__u64 user_addr;
1672			__u64 vcpu_addr;
1673			__u64 length;
1674		};
1675	
1676	This ioctl maps the memory at "user_addr" with the length "length" to
1677	the vcpu's address space starting at "vcpu_addr". All parameters need to
1678	be aligned to 1 megabyte.
1679	
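Since the ioctl expects 1 megabyte alignment for every field, userspace may
want to validate a mapping before issuing the call.  A minimal sketch, with
hypothetical helper names:

```c
#include <stdbool.h>
#include <stdint.h>

#define UCAS_ALIGN (1ULL << 20)	/* 1 megabyte */

/* True if v is a multiple of 1 megabyte. */
static bool ucas_aligned(uint64_t v)
{
	return (v & (UCAS_ALIGN - 1)) == 0;
}

/* True if all three fields of a mapping request are properly aligned. */
static bool ucas_mapping_aligned(uint64_t user_addr, uint64_t vcpu_addr,
				 uint64_t length)
{
	return ucas_aligned(user_addr) && ucas_aligned(vcpu_addr) &&
	       ucas_aligned(length);
}
```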
1680	
1681	4.66 KVM_S390_UCAS_UNMAP
1682	
1683	Capability: KVM_CAP_S390_UCONTROL
1684	Architectures: s390
1685	Type: vcpu ioctl
1686	Parameters: struct kvm_s390_ucas_mapping (in)
1687	Returns: 0 in case of success
1688	
1689	The parameter is defined like this:
1690		struct kvm_s390_ucas_mapping {
1691			__u64 user_addr;
1692			__u64 vcpu_addr;
1693			__u64 length;
1694		};
1695	
1696	This ioctl unmaps the memory in the vcpu's address space starting at
1697	"vcpu_addr" with the length "length". The field "user_addr" is ignored.
1698	All parameters need to be aligned to 1 megabyte.
1699	
1700	
1701	4.67 KVM_S390_VCPU_FAULT
1702	
1703	Capability: KVM_CAP_S390_UCONTROL
1704	Architectures: s390
1705	Type: vcpu ioctl
1706	Parameters: vcpu absolute address (in)
1707	Returns: 0 in case of success
1708	
1709	This call creates a page table entry on the virtual cpu's address space
1710	(for user controlled virtual machines) or the virtual machine's address
1711	space (for regular virtual machines). This only works for minor faults,
1712	so it is recommended to access the subject memory page via the user page
1713	table up front. This is useful to handle validity intercepts for user
1714	controlled virtual machines to fault in the virtual cpu's lowcore pages
1715	prior to calling the KVM_RUN ioctl.
1716	
1717	
1718	4.68 KVM_SET_ONE_REG
1719	
1720	Capability: KVM_CAP_ONE_REG
1721	Architectures: all
1722	Type: vcpu ioctl
1723	Parameters: struct kvm_one_reg (in)
1724	Returns: 0 on success, negative value on failure
1725	
1726	struct kvm_one_reg {
1727	       __u64 id;
1728	       __u64 addr;
1729	};
1730	
1731	Using this ioctl, a single vcpu register can be set to a specific value
1732	defined by user space with the passed in struct kvm_one_reg, where id
1733	refers to the register identifier as described below and addr is a pointer
1734	to a variable with the respective size. There can be architecture agnostic
1735	and architecture specific registers. Each has its own range of operation
1736	and its own constants and width. The registers implemented so far are
1737	listed below:
1738	
1739	  Arch  |       Register        | Width (bits)
1740	        |                       |
1741	  PPC   | KVM_REG_PPC_HIOR      | 64
1742	  PPC   | KVM_REG_PPC_IAC1      | 64
1743	  PPC   | KVM_REG_PPC_IAC2      | 64
1744	  PPC   | KVM_REG_PPC_IAC3      | 64
1745	  PPC   | KVM_REG_PPC_IAC4      | 64
1746	  PPC   | KVM_REG_PPC_DAC1      | 64
1747	  PPC   | KVM_REG_PPC_DAC2      | 64
1748	  PPC   | KVM_REG_PPC_DABR      | 64
1749	  PPC   | KVM_REG_PPC_DSCR      | 64
1750	  PPC   | KVM_REG_PPC_PURR      | 64
1751	  PPC   | KVM_REG_PPC_SPURR     | 64
1752	  PPC   | KVM_REG_PPC_DAR       | 64
1753	  PPC   | KVM_REG_PPC_DSISR     | 32
1754	  PPC   | KVM_REG_PPC_AMR       | 64
1755	  PPC   | KVM_REG_PPC_UAMOR     | 64
1756	  PPC   | KVM_REG_PPC_MMCR0     | 64
1757	  PPC   | KVM_REG_PPC_MMCR1     | 64
1758	  PPC   | KVM_REG_PPC_MMCRA     | 64
1759	  PPC   | KVM_REG_PPC_PMC1      | 32
1760	  PPC   | KVM_REG_PPC_PMC2      | 32
1761	  PPC   | KVM_REG_PPC_PMC3      | 32
1762	  PPC   | KVM_REG_PPC_PMC4      | 32
1763	  PPC   | KVM_REG_PPC_PMC5      | 32
1764	  PPC   | KVM_REG_PPC_PMC6      | 32
1765	  PPC   | KVM_REG_PPC_PMC7      | 32
1766	  PPC   | KVM_REG_PPC_PMC8      | 32
1767	  PPC   | KVM_REG_PPC_FPR0      | 64
1768	          ...
1769	  PPC   | KVM_REG_PPC_FPR31     | 64
1770	  PPC   | KVM_REG_PPC_VR0       | 128
1771	          ...
1772	  PPC   | KVM_REG_PPC_VR31      | 128
1773	  PPC   | KVM_REG_PPC_VSR0      | 128
1774	          ...
1775	  PPC   | KVM_REG_PPC_VSR31     | 128
1776	  PPC   | KVM_REG_PPC_FPSCR     | 64
1777	  PPC   | KVM_REG_PPC_VSCR      | 32
1778	  PPC   | KVM_REG_PPC_VPA_ADDR  | 64
1779	  PPC   | KVM_REG_PPC_VPA_SLB   | 128
1780	  PPC   | KVM_REG_PPC_VPA_DTL   | 128
1781	  PPC   | KVM_REG_PPC_EPCR      | 32
1782	  PPC   | KVM_REG_PPC_EPR       | 32
1783	
1784	ARM registers are mapped using the lower 32 bits.  The upper 16 of that
1785	is the register group type, or coprocessor number:
1786	
1787	ARM core registers have the following id bit patterns:
1788	  0x4020 0000 0010 <index into the kvm_regs struct:16>
1789	
1790	ARM 32-bit CP15 registers have the following id bit patterns:
1791	  0x4020 0000 000F <zero:1> <crn:4> <crm:4> <opc1:4> <opc2:3>
1792	
1793	ARM 64-bit CP15 registers have the following id bit patterns:
1794	  0x4030 0000 000F <zero:1> <zero:4> <crm:4> <opc1:4> <zero:3>
1795	
1796	ARM CCSIDR registers are demultiplexed by CSSELR value:
1797	  0x4020 0000 0011 00 <csselr:8>
1798	
1799	ARM 32-bit VFP control registers have the following id bit patterns:
1800	  0x4020 0000 0012 1 <regno:12>
1801	
1802	ARM 64-bit FP registers have the following id bit patterns:
1803	  0x4030 0000 0012 0 <regno:12>
1804	
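As a sketch of how a register id is composed, the helper below builds the id
of a 32-bit ARM CP15 register from its crn/crm/opc1/opc2 fields, following
the field widths given above.  The helper name is hypothetical.  The two
top-level constants are assumed to equal KVM_REG_ARM (0x4000000000000000)
and KVM_REG_SIZE_U32 (0x0020000000000000) from the kernel headers; verify
them against the headers you build with.

```c
#include <stdint.h>

/* Assumed to match KVM_REG_ARM and KVM_REG_SIZE_U32 in <linux/kvm.h>. */
#define REG_ARM           0x4000000000000000ULL
#define REG_SIZE_U32      0x0020000000000000ULL
#define REG_COPROC_SHIFT  16	/* coprocessor number sits above the low 16 bits */

/* Field positions within the low 16 bits: <zero:1> <crn:4> <crm:4> <opc1:4> <opc2:3> */
#define CP15_CRN_SHIFT    11
#define CP15_CRM_SHIFT    7
#define CP15_OPC1_SHIFT   3
#define CP15_OPC2_SHIFT   0

/* Hypothetical helper: id of a 32-bit CP15 register. */
static uint64_t arm_cp15_reg32_id(unsigned int crn, unsigned int crm,
				  unsigned int opc1, unsigned int opc2)
{
	return REG_ARM | REG_SIZE_U32 |
	       (15ULL << REG_COPROC_SHIFT) |
	       ((uint64_t)(crn & 0xf) << CP15_CRN_SHIFT) |
	       ((uint64_t)(crm & 0xf) << CP15_CRM_SHIFT) |
	       ((uint64_t)(opc1 & 0xf) << CP15_OPC1_SHIFT) |
	       ((uint64_t)(opc2 & 0x7) << CP15_OPC2_SHIFT);
}
```

The returned value would go into the "id" field of struct kvm_one_reg, with
"addr" pointing at a 32-bit variable.
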
1805	4.69 KVM_GET_ONE_REG
1806	
1807	Capability: KVM_CAP_ONE_REG
1808	Architectures: all
1809	Type: vcpu ioctl
1810	Parameters: struct kvm_one_reg (in and out)
1811	Returns: 0 on success, negative value on failure
1812	
1813	This ioctl retrieves the value of a single register implemented
1814	in a vcpu. The register to read is indicated by the "id" field of the
1815	kvm_one_reg struct passed in. On success, the register value can be found
1816	at the memory location pointed to by "addr".
1817	
1818	The list of registers accessible using this interface is identical to the
1819	list in 4.68.
1820	
1821	
1822	4.70 KVM_KVMCLOCK_CTRL
1823	
1824	Capability: KVM_CAP_KVMCLOCK_CTRL
1825	Architectures: Any that implement pvclocks (currently x86 only)
1826	Type: vcpu ioctl
1827	Parameters: None
1828	Returns: 0 on success, -1 on error
1829	
1830	This signals to the host kernel that the specified guest is being paused by
1831	userspace.  The host will set a flag in the pvclock structure that is checked
1832	from the soft lockup watchdog.  The flag is part of the pvclock structure that
1833	is shared between guest and host, specifically the second bit of the flags
1834	field of the pvclock_vcpu_time_info structure.  It will be set exclusively by
1835	the host and read/cleared exclusively by the guest.  The guest operation of
1836	checking and clearing the flag must be an atomic operation, so
1837	load-link/store-conditional or an equivalent must be used.  There are two cases
1838	where the guest will clear the flag: when the soft lockup watchdog timer resets
1839	itself or when a soft lockup is detected.  This ioctl can be called any time
1840	after pausing the vcpu, but before it is resumed.
1841	
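The guest-side "check and clear" described above could be expressed with C11
atomics as in the following sketch.  It models the flags field as a
standalone atomic byte; in a real guest the byte lives in the shared
pvclock_vcpu_time_info structure.  The macro and function names are
hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Second bit of the pvclock flags field (macro name is hypothetical). */
#define GUEST_STOPPED_BIT (1u << 1)

/*
 * Atomically test and clear the "guest stopped" flag.  Returns true if the
 * host had set the flag since the last check.  atomic_fetch_and performs
 * the read-modify-write as one indivisible operation.
 */
static bool pvclock_check_and_clear_stopped(atomic_uchar *flags)
{
	unsigned char old =
		atomic_fetch_and(flags, (unsigned char)~GUEST_STOPPED_BIT);

	return (old & GUEST_STOPPED_BIT) != 0;
}
```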
1842	
1843	4.71 KVM_SIGNAL_MSI
1844	
1845	Capability: KVM_CAP_SIGNAL_MSI
1846	Architectures: x86
1847	Type: vm ioctl
1848	Parameters: struct kvm_msi (in)
1849	Returns: >0 on delivery, 0 if guest blocked the MSI, and -1 on error
1850	
1851	Directly inject an MSI message. Only valid with an in-kernel irqchip that handles
1852	MSI messages.
1853	
1854	struct kvm_msi {
1855		__u32 address_lo;
1856		__u32 address_hi;
1857		__u32 data;
1858		__u32 flags;
1859		__u8  pad[16];
1860	};
1861	
1862	No flags are defined so far. The corresponding field must be 0.
1863	
1864	
1865	4.71 KVM_CREATE_PIT2
1866	
1867	Capability: KVM_CAP_PIT2
1868	Architectures: x86
1869	Type: vm ioctl
1870	Parameters: struct kvm_pit_config (in)
1871	Returns: 0 on success, -1 on error
1872	
1873	Creates an in-kernel device model for the i8254 PIT. This call is only valid
1874	after enabling in-kernel irqchip support via KVM_CREATE_IRQCHIP. The following
1875	parameters have to be passed:
1876	
1877	struct kvm_pit_config {
1878		__u32 flags;
1879		__u32 pad[15];
1880	};
1881	
1882	Valid flags are:
1883	
1884	#define KVM_PIT_SPEAKER_DUMMY     1 /* emulate speaker port stub */
1885	
1886	PIT timer interrupts may use a per-VM kernel thread for injection. If it
1887	exists, this thread will have a name of the following pattern:
1888	
1889	kvm-pit/<owner-process-pid>
1890	
1891	When running a guest with elevated priorities, the scheduling parameters of
1892	this thread may have to be adjusted accordingly.
1893	
1894	This IOCTL replaces the obsolete KVM_CREATE_PIT.
1895	
1896	
1897	4.72 KVM_GET_PIT2
1898	
1899	Capability: KVM_CAP_PIT_STATE2
1900	Architectures: x86
1901	Type: vm ioctl
1902	Parameters: struct kvm_pit_state2 (out)
1903	Returns: 0 on success, -1 on error
1904	
1905	Retrieves the state of the in-kernel PIT model. Only valid after
1906	KVM_CREATE_PIT2. The state is returned in the following structure:
1907	
1908	struct kvm_pit_state2 {
1909		struct kvm_pit_channel_state channels[3];
1910		__u32 flags;
1911		__u32 reserved[9];
1912	};
1913	
1914	Valid flags are:
1915	
1916	/* disable PIT in HPET legacy mode */
1917	#define KVM_PIT_FLAGS_HPET_LEGACY  0x00000001
1918	
1919	This IOCTL replaces the obsolete KVM_GET_PIT.
1920	
1921	
1922	4.73 KVM_SET_PIT2
1923	
1924	Capability: KVM_CAP_PIT_STATE2
1925	Architectures: x86
1926	Type: vm ioctl
1927	Parameters: struct kvm_pit_state2 (in)
1928	Returns: 0 on success, -1 on error
1929	
1930	Sets the state of the in-kernel PIT model. Only valid after KVM_CREATE_PIT2.
1931	See KVM_GET_PIT2 for details on struct kvm_pit_state2.
1932	
1933	This IOCTL replaces the obsolete KVM_SET_PIT.
1934	
1935	
1936	4.74 KVM_PPC_GET_SMMU_INFO
1937	
1938	Capability: KVM_CAP_PPC_GET_SMMU_INFO
1939	Architectures: powerpc
1940	Type: vm ioctl
1941	Parameters: struct kvm_ppc_smmu_info (out)
1942	Returns: 0 on success, -1 on error
1943	
1944	This populates and returns a structure describing the features of
1945	the "Server" class MMU emulation supported by KVM.
1946	This can in turn be used by userspace to generate the appropriate
1947	device-tree properties for the guest operating system.
1948	
1949	The structure contains some global information, followed by an
1950	array of supported segment page sizes:
1951	
1952	      struct kvm_ppc_smmu_info {
1953		     __u64 flags;
1954		     __u32 slb_size;
1955		     __u32 pad;
1956		     struct kvm_ppc_one_seg_page_size sps[KVM_PPC_PAGE_SIZES_MAX_SZ];
1957	      };
1958	
1959	The supported flags are:
1960	
1961	    - KVM_PPC_PAGE_SIZES_REAL:
1962	        When that flag is set, guest page sizes must "fit" the backing
1963	        store page sizes. When not set, any page size in the list can
1964	        be used regardless of how they are backed by userspace.
1965	
1966	    - KVM_PPC_1T_SEGMENTS
1967	        The emulated MMU supports 1T segments in addition to the
1968	        standard 256M ones.
1969	
1970	The "slb_size" field indicates how many SLB entries are supported.
1971	
1972	The "sps" array contains 8 entries indicating the supported base
1973	page sizes for a segment in increasing order. Each entry is defined
1974	as follows:
1975	
1976	   struct kvm_ppc_one_seg_page_size {
1977		__u32 page_shift;	/* Base page shift of segment (or 0) */
1978		__u32 slb_enc;		/* SLB encoding for BookS */
1979		struct kvm_ppc_one_page_size enc[KVM_PPC_PAGE_SIZES_MAX_SZ];
1980	   };
1981	
1982	An entry with a "page_shift" of 0 is unused. Because the array is
1983	organized in increasing order, a lookup can stop when encountering
1984	such an entry.
1985	
1986	The "slb_enc" field provides the encoding to use in the SLB for the
1987	page size. The bits are in positions such that the value can directly
1988	be OR'ed into the "vsid" argument of the slbmte instruction.
1989	
1990	The "enc" array is a list which, for each of those segment base page
1991	sizes, provides the list of supported actual page sizes (which can
1992	only be larger than or equal to the base page size), along with the
1993	corresponding encoding in the hash PTE. Similarly, the array has
1994	8 entries sorted by increasing size, and an entry with a "0" shift
1995	is an empty entry and a terminator:
1996	
1997	   struct kvm_ppc_one_page_size {
1998		__u32 page_shift;	/* Page shift (or 0) */
1999		__u32 pte_enc;		/* Encoding in the HPTE (>>12) */
2000	   };
2001	
2002	The "pte_enc" field provides a value that can be OR'ed into the hash
2003	PTE's RPN field (i.e., it needs to be shifted left by 12 to OR it
2004	into the hash PTE second double word).
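
The nested, zero-terminated arrays described above can be walked as in
the following sketch.  It is only an illustration: the struct and macro
names are local stand-ins for the definitions shown above (real code
would take them from <linux/kvm.h>), and lookup_page_size() is a
hypothetical helper, not part of the KVM API.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZES_MAX 8	/* stands in for KVM_PPC_PAGE_SIZES_MAX_SZ */

/* Local mirrors of kvm_ppc_one_page_size / kvm_ppc_one_seg_page_size. */
struct one_page_size {
	uint32_t page_shift;
	uint32_t pte_enc;
};

struct one_seg_page_size {
	uint32_t page_shift;
	uint32_t slb_enc;
	struct one_page_size enc[PAGE_SIZES_MAX];
};

/*
 * Hypothetical helper: look up the encodings for a segment base page
 * shift and an actual page shift.  Returns 0 and fills *slb_enc and
 * *pte_enc on success, -1 if the combination is not listed.
 */
static int lookup_page_size(const struct one_seg_page_size *sps,
			    uint32_t base_shift, uint32_t actual_shift,
			    uint32_t *slb_enc, uint32_t *pte_enc)
{
	assert(sps && slb_enc && pte_enc);

	for (int i = 0; i < PAGE_SIZES_MAX; i++) {
		if (sps[i].page_shift == 0)
			break;		/* terminator: no more segment sizes */
		if (sps[i].page_shift != base_shift)
			continue;
		for (int j = 0; j < PAGE_SIZES_MAX; j++) {
			const struct one_page_size *e = &sps[i].enc[j];

			if (e->page_shift == 0)
				break;	/* terminator: no more page sizes */
			if (e->page_shift == actual_shift) {
				*slb_enc = sps[i].slb_enc;
				*pte_enc = e->pte_enc;
				return 0;
			}
		}
		return -1;	/* base size found, actual size not listed */
	}
	return -1;
}
```

A caller would pass the "sps" member of the structure filled in by the
ioctl.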
2005	
2006	4.75 KVM_IRQFD
2007	
2008	Capability: KVM_CAP_IRQFD
2009	Architectures: x86
2010	Type: vm ioctl
2011	Parameters: struct kvm_irqfd (in)
2012	Returns: 0 on success, -1 on error
2013	
2014	Allows setting an eventfd to directly trigger a guest interrupt.
2015	kvm_irqfd.fd specifies the file descriptor to use as the eventfd and
2016	kvm_irqfd.gsi specifies the irqchip pin toggled by this event.  When
2017	an event is triggered on the eventfd, an interrupt is injected into
2018	the guest using the specified gsi pin.  The irqfd is removed using
2019	the KVM_IRQFD_FLAG_DEASSIGN flag, specifying both kvm_irqfd.fd
2020	and kvm_irqfd.gsi.
2021	
2022	With KVM_CAP_IRQFD_RESAMPLE, KVM_IRQFD supports a de-assert and notify
2023	mechanism allowing emulation of level-triggered, irqfd-based
2024	interrupts.  When KVM_IRQFD_FLAG_RESAMPLE is set the user must pass an
2025	additional eventfd in the kvm_irqfd.resamplefd field.  When operating
2026	in resample mode, posting of an interrupt through kvm_irqfd.fd asserts
2027	the specified gsi in the irqchip.  When the irqchip is resampled, such
2028	as from an EOI, the gsi is de-asserted and the user is notified via
2029	kvm_irqfd.resamplefd.  It is the user's responsibility to re-queue
2030	the interrupt if the device making use of it still requires service.
2031	Note that closing the resamplefd is not sufficient to disable the
2032	irqfd.  The KVM_IRQFD_FLAG_RESAMPLE is only necessary on assignment
2033	and need not be specified with KVM_IRQFD_FLAG_DEASSIGN.
2034	
2035	4.76 KVM_PPC_ALLOCATE_HTAB
2036	
2037	Capability: KVM_CAP_PPC_ALLOC_HTAB
2038	Architectures: powerpc
2039	Type: vm ioctl
2040	Parameters: Pointer to u32 containing hash table order (in/out)
2041	Returns: 0 on success, -1 on error
2042	
2043	This requests the host kernel to allocate an MMU hash table for a
2044	guest using the PAPR paravirtualization interface.  This only does
2045	anything if the kernel is configured to use the Book 3S HV style of
2046	virtualization.  Otherwise the capability doesn't exist and the ioctl
2047	returns an ENOTTY error.  The rest of this description assumes Book 3S
2048	HV.
2049	
2050	There must be no vcpus running when this ioctl is called; if there
2051	are, it will do nothing and return an EBUSY error.
2052	
2053	The parameter is a pointer to a 32-bit unsigned integer variable
2054	containing the order (log base 2) of the desired size of the hash
2055	table, which must be between 18 and 46.  On successful return from the
2056	ioctl, it will have been updated with the order of the hash table that
2057	was allocated.
2058	
2059	If no hash table has been allocated when any vcpu is asked to run
2060	(with the KVM_RUN ioctl), the host kernel will allocate a
2061	default-sized hash table (16 MB).
2062	
2063	If this ioctl is called when a hash table has already been allocated,
2064	the kernel will clear out the existing hash table (zero all HPTEs) and
2065	return the hash table order in the parameter.  (If the guest is using
2066	the virtualized real-mode area (VRMA) facility, the kernel will
2067	re-create the VRMA HPTEs on the next KVM_RUN of any vcpu.)
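
The relation between the order parameter and the hash table size can be
sketched as follows.  The bounds are the ones stated above; the macro
and function names are hypothetical and exist only for this example.
Under this reading, the default 16 MB table corresponds to order 24.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HTAB_ORDER_MIN 18u	/* bounds taken from the text above */
#define HTAB_ORDER_MAX 46u

/* True if the order is within the range accepted by the ioctl. */
static bool htab_order_valid(uint32_t order)
{
	return order >= HTAB_ORDER_MIN && order <= HTAB_ORDER_MAX;
}

/* Size in bytes of a hash table of the given order (order = log2(size)). */
static uint64_t htab_size_bytes(uint32_t order)
{
	assert(htab_order_valid(order));
	return UINT64_C(1) << order;
}
```

Because the kernel writes back the order it actually allocated,
userspace should re-read the variable after the ioctl returns rather
than assume the requested size was granted.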
2068	
2069	4.77 KVM_S390_INTERRUPT
2070	
2071	Capability: basic
2072	Architectures: s390
2073	Type: vm ioctl, vcpu ioctl
2074	Parameters: struct kvm_s390_interrupt (in)
2075	Returns: 0 on success, -1 on error
2076	
2077	Injects an interrupt into the guest. Interrupts can be floating
2078	(vm ioctl) or per cpu (vcpu ioctl), depending on the interrupt type.
2079	
2080	Interrupt parameters are passed via kvm_s390_interrupt:
2081	
2082	struct kvm_s390_interrupt {
2083		__u32 type;
2084		__u32 parm;
2085		__u64 parm64;
2086	};
2087	
2088	type can be one of the following:
2089	
2090	KVM_S390_SIGP_STOP (vcpu) - sigp stop
2091	KVM_S390_PROGRAM_INT (vcpu) - program check; code in parm
2092	KVM_S390_SIGP_SET_PREFIX (vcpu) - sigp set prefix; prefix address in parm
2093	KVM_S390_RESTART (vcpu) - restart
2094	KVM_S390_INT_VIRTIO (vm) - virtio external interrupt; external interrupt
2095				   parameters in parm and parm64
2096	KVM_S390_INT_SERVICE (vm) - sclp external interrupt; sclp parameter in parm
2097	KVM_S390_INT_EMERGENCY (vcpu) - sigp emergency; source cpu in parm
2098	KVM_S390_INT_EXTERNAL_CALL (vcpu) - sigp external call; source cpu in parm
2099	KVM_S390_INT_IO(ai,cssid,ssid,schid) (vm) - compound value to indicate an
2100	    I/O interrupt (ai - adapter interrupt; cssid,ssid,schid - subchannel);
2101	    I/O interruption parameters in parm (subchannel) and parm64 (intparm,
2102	    interruption subclass)
2103	KVM_S390_MCHK (vm, vcpu) - machine check interrupt; cr 14 bits in parm,
2104	                           machine check interrupt code in parm64 (note that
2105	                           machine checks needing further payload are not
2106	                           supported by this ioctl)
2107	
2108	Note that the vcpu ioctl is asynchronous to vcpu execution.
2109	
2110	4.78 KVM_PPC_GET_HTAB_FD
2111	
2112	Capability: KVM_CAP_PPC_HTAB_FD
2113	Architectures: powerpc
2114	Type: vm ioctl
2115	Parameters: Pointer to struct kvm_get_htab_fd (in)
2116	Returns: file descriptor number (>= 0) on success, -1 on error
2117	
2118	This returns a file descriptor that can be used either to read out the
2119	entries in the guest's hashed page table (HPT), or to write entries to
2120	initialize the HPT.  The returned fd can only be written to if the
2121	KVM_GET_HTAB_WRITE bit is set in the flags field of the argument, and
2122	can only be read if that bit is clear.  The argument struct looks like
2123	this:
2124	
2125	/* For KVM_PPC_GET_HTAB_FD */
2126	struct kvm_get_htab_fd {
2127		__u64	flags;
2128		__u64	start_index;
2129		__u64	reserved[2];
2130	};
2131	
2132	/* Values for kvm_get_htab_fd.flags */
2133	#define KVM_GET_HTAB_BOLTED_ONLY	((__u64)0x1)
2134	#define KVM_GET_HTAB_WRITE		((__u64)0x2)
2135	
2136	The `start_index' field gives the index in the HPT of the entry at
2137	which to start reading.  It is ignored when writing.
2138	
2139	Reads on the fd will initially supply information about all
2140	"interesting" HPT entries.  Interesting entries are those with the
2141	bolted bit set, if the KVM_GET_HTAB_BOLTED_ONLY bit is set, otherwise
2142	all entries.  When the end of the HPT is reached, the read() will
2143	return.  If read() is called again on the fd, it will start again from
2144	the beginning of the HPT, but will only return HPT entries that have
2145	changed since they were last read.
2146	
2147	Data read or written is structured as a header (8 bytes) followed by a
2148	series of valid HPT entries (16 bytes each).  The header indicates how
2149	many valid HPT entries there are and how many invalid entries follow
2150	the valid entries.  The invalid entries are not represented explicitly
2151	in the stream.  The header format is:
2152	
2153	struct kvm_get_htab_header {
2154		__u32	index;
2155		__u16	n_valid;
2156		__u16	n_invalid;
2157	};
2158	
2159	Writes to the fd create HPT entries starting at the index given in the
2160	header; first `n_valid' valid entries with contents from the data
2161	written, then `n_invalid' invalid entries, invalidating any previously
2162	valid entries found.
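
The stream layout described above (an 8-byte header followed by
n_valid 16-byte entries, repeated) can be parsed along these lines.
This is a sketch under stated assumptions: struct htab_header is a
local mirror of struct kvm_get_htab_header, and htab_scan() is a
hypothetical helper that only counts entries in a buffer already
filled by read().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct kvm_get_htab_header. */
struct htab_header {
	uint32_t index;
	uint16_t n_valid;
	uint16_t n_invalid;
};

#define HPTE_BYTES 16	/* size of one valid HPT entry in the stream */

/*
 * Walk a buffer of HTAB stream data and count the valid and invalid
 * entries it describes.  Returns the number of bytes consumed, or -1
 * if the buffer ends in the middle of a header or its entries.
 */
static long htab_scan(const uint8_t *buf, size_t len,
		      uint64_t *valid, uint64_t *invalid)
{
	size_t pos = 0;

	assert(buf && valid && invalid);
	*valid = *invalid = 0;

	while (len - pos >= sizeof(struct htab_header)) {
		struct htab_header hdr;
		size_t body;

		memcpy(&hdr, buf + pos, sizeof(hdr));	/* no alignment assumed */
		body = (size_t)hdr.n_valid * HPTE_BYTES;
		if (len - pos - sizeof(hdr) < body)
			return -1;	/* truncated entry data */

		*valid += hdr.n_valid;
		*invalid += hdr.n_invalid;	/* not present in the stream */
		pos += sizeof(hdr) + body;
	}
	return pos == len ? (long)pos : -1;
}
```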
2163	
2164	
2165	4.79 KVM_ARM_VCPU_INIT
2166	
2167	Capability: basic
2168	Architectures: arm
2169	Type: vcpu ioctl
2170	Parameters: struct kvm_vcpu_init (in)
2171	Returns: 0 on success; -1 on error
2172	Errors:
2173	  EINVAL:    the target is unknown, or the combination of features is invalid.
2174	  ENOENT:    a features bit specified is unknown.
2175	
2176	This tells KVM what type of CPU to present to the guest, and what
2177	optional features it should have.  This will cause a reset of the cpu
2178	registers to their initial values.  If this is not called, KVM_RUN will
2179	return ENOEXEC for that vcpu.
2180	
2181	Note that because some registers reflect machine topology, all vcpus
2182	should be created before this ioctl is invoked.
2183	
2184	Possible features:
2185		- KVM_ARM_VCPU_POWER_OFF: Starts the CPU in a power-off state.
2186		  Depends on KVM_CAP_ARM_PSCI.
2187	
2188	
2189	4.80 KVM_GET_REG_LIST
2190	
2191	Capability: basic
2192	Architectures: arm
2193	Type: vcpu ioctl
2194	Parameters: struct kvm_reg_list (in/out)
2195	Returns: 0 on success; -1 on error
2196	Errors:
2197	  E2BIG:     the reg index list is too big to fit in the array specified by
2198	             the user (the number required will be written into n).
2199	
2200	struct kvm_reg_list {
2201		__u64 n; /* number of registers in reg[] */
2202		__u64 reg[0];
2203	};
2204	
2205	This ioctl returns the guest registers that are supported for the
2206	KVM_GET_ONE_REG/KVM_SET_ONE_REG calls.
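
Since E2BIG reports the required count in n, one plausible usage
pattern (an assumption, not something this document prescribes) is to
call the ioctl once with n = 0, allocate a list of the reported size,
and call it again.  The allocation step might look like the sketch
below; struct reg_list is a local mirror of struct kvm_reg_list and
reg_list_alloc() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Local mirror of struct kvm_reg_list, using a C99 flexible array member. */
struct reg_list {
	uint64_t n;		/* number of registers in reg[] */
	uint64_t reg[];
};

/* Allocate a zeroed list with room for n register indices; caller frees. */
static struct reg_list *reg_list_alloc(uint64_t n)
{
	struct reg_list *list;

	if (n > (SIZE_MAX - sizeof(*list)) / sizeof(list->reg[0]))
		return NULL;	/* size computation would overflow */

	list = calloc(1, sizeof(*list) + (size_t)n * sizeof(list->reg[0]));
	if (list)
		list->n = n;
	return list;
}
```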
2207	
2208	
2209	4.81 KVM_ARM_SET_DEVICE_ADDR
2210	
2211	Capability: KVM_CAP_ARM_SET_DEVICE_ADDR
2212	Architectures: arm
2213	Type: vm ioctl
2214	Parameters: struct kvm_arm_device_addr (in)
2215	Returns: 0 on success, -1 on error
2216	Errors:
2217	  ENODEV: The device id is unknown
2218	  ENXIO:  Device not supported on current system
2219	  EEXIST: Address already set
2220	  E2BIG:  Address outside guest physical address space
2221	  EBUSY:  Address overlaps with other device range
2222	
2223	struct kvm_arm_device_addr {
2224		__u64 id;
2225		__u64 addr;
2226	};
2227	
2228	Specify a device address in the guest's physical address space where guests
2229	can access emulated or directly exposed devices, which the host kernel needs
2230	to know about. The id field is an architecture specific identifier for a
2231	specific device.
2232	
2233	ARM divides the id field into two parts, a device id and an address type id
2234	specific to the individual device.
2235	
2236	  bits:  | 63        ...       32 | 31    ...    16 | 15    ...    0 |
2237	  field: |        0x00000000      |     device id   |  addr type id  |
2238	
2239	ARM currently only requires this when using the in-kernel GIC support for the
2240	hardware VGIC features, using KVM_ARM_DEVICE_VGIC_V2 as the device id.  When
2241	setting the base address for the guest's mapping of the VGIC virtual CPU
2242	and distributor interface, the ioctl must be called after calling
2243	KVM_CREATE_IRQCHIP, but before calling KVM_RUN on any of the VCPUs.  Calling
2244	this ioctl twice for any of the base addresses will return -EEXIST.
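
The id layout in the diagram above can be composed and decomposed as in
this sketch.  The shift follows the diagram; the macro and function
names are hypothetical, and real code would use the constants from the
kernel headers instead.

```c
#include <assert.h>
#include <stdint.h>

#define DEV_ID_SHIFT 16	/* device id occupies bits 31..16 per the diagram */

/* Build the 64-bit id; bits 63..32 stay zero. */
static uint64_t arm_device_id(uint16_t device_id, uint16_t addr_type)
{
	return ((uint64_t)device_id << DEV_ID_SHIFT) | addr_type;
}

/* Extract the device id (bits 31..16). */
static uint16_t arm_device_id_dev(uint64_t id)
{
	return (uint16_t)((id >> DEV_ID_SHIFT) & 0xffff);
}

/* Extract the address type id (bits 15..0). */
static uint16_t arm_device_id_type(uint64_t id)
{
	return (uint16_t)(id & 0xffff);
}
```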
2245	
2246	
2247	5. The kvm_run structure
2248	------------------------
2249	
2250	Application code obtains a pointer to the kvm_run structure by
2251	mmap()ing a vcpu fd.  From that point, application code can control
2252	execution by changing fields in kvm_run prior to calling the KVM_RUN
2253	ioctl, and obtain information about the reason KVM_RUN returned by
2254	looking up structure members.
2255	
2256	struct kvm_run {
2257		/* in */
2258		__u8 request_interrupt_window;
2259	
2260	Request that KVM_RUN return when it becomes possible to inject external
2261	interrupts into the guest.  Useful in conjunction with KVM_INTERRUPT.
2262	
2263		__u8 padding1[7];
2264	
2265		/* out */
2266		__u32 exit_reason;
2267	
2268	When KVM_RUN has returned successfully (return value 0), this informs
2269	application code why KVM_RUN has returned.  Allowable values for this
2270	field are detailed below.
2271	
2272		__u8 ready_for_interrupt_injection;
2273	
2274	If request_interrupt_window has been specified, this field indicates
2275	an interrupt can be injected now with KVM_INTERRUPT.
2276	
2277		__u8 if_flag;
2278	
2279	The value of the current interrupt flag.  Only valid if in-kernel
2280	local APIC is not used.
2281	
2282		__u8 padding2[2];
2283	
2284		/* in (pre_kvm_run), out (post_kvm_run) */
2285		__u64 cr8;
2286	
2287	The value of the cr8 register.  Only valid if in-kernel local APIC is
2288	not used.  Both input and output.
2289	
2290		__u64 apic_base;
2291	
2292	The value of the APIC BASE msr.  Only valid if in-kernel local
2293	APIC is not used.  Both input and output.
2294	
2295		union {
2296			/* KVM_EXIT_UNKNOWN */
2297			struct {
2298				__u64 hardware_exit_reason;
2299			} hw;
2300	
2301	If exit_reason is KVM_EXIT_UNKNOWN, the vcpu has exited due to unknown
2302	reasons.  Further architecture-specific information is available in
2303	hardware_exit_reason.
2304	
2305			/* KVM_EXIT_FAIL_ENTRY */
2306			struct {
2307				__u64 hardware_entry_failure_reason;
2308			} fail_entry;
2309	
2310	If exit_reason is KVM_EXIT_FAIL_ENTRY, the vcpu could not be run due
2311	to unknown reasons.  Further architecture-specific information is
2312	available in hardware_entry_failure_reason.
2313	
2314			/* KVM_EXIT_EXCEPTION */
2315			struct {
2316				__u32 exception;
2317				__u32 error_code;
2318			} ex;
2319	
2320	Unused.
2321	
2322			/* KVM_EXIT_IO */
2323			struct {
2324	#define KVM_EXIT_IO_IN  0
2325	#define KVM_EXIT_IO_OUT 1
2326				__u8 direction;
2327				__u8 size; /* bytes */
2328				__u16 port;
2329				__u32 count;
2330				__u64 data_offset; /* relative to kvm_run start */
2331			} io;
2332	
2333	If exit_reason is KVM_EXIT_IO, then the vcpu has
2334	executed a port I/O instruction which could not be satisfied by kvm.
2335	data_offset describes where the data is located (KVM_EXIT_IO_OUT) or
2336	where kvm expects application code to place the data for the next
2337	KVM_RUN invocation (KVM_EXIT_IO_IN).  Data format is a packed array.
2338	
2339			struct {
2340				struct kvm_debug_exit_arch arch;
2341			} debug;
2342	
2343	Unused.
2344	
2345			/* KVM_EXIT_MMIO */
2346			struct {
2347				__u64 phys_addr;
2348				__u8  data[8];
2349				__u32 len;
2350				__u8  is_write;
2351			} mmio;
2352	
2353	If exit_reason is KVM_EXIT_MMIO, then the vcpu has
2354	executed a memory-mapped I/O instruction which could not be satisfied
2355	by kvm.  The 'data' member contains the written data if 'is_write' is
2356	true, and should be filled by application code otherwise.
2357	
2358	NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_DCR,
2359	KVM_EXIT_PAPR_HCALL and KVM_EXIT_EPR the corresponding operations are
2360	complete (and guest state is consistent) only after userspace has
2361	re-entered the kernel with KVM_RUN.  The kernel side will first finish
2362	incomplete operations and then check for pending signals.  Userspace
2363	can re-enter the guest with an unmasked signal pending to complete
2364	pending operations.
2365	
2366			/* KVM_EXIT_HYPERCALL */
2367			struct {
2368				__u64 nr;
2369				__u64 args[6];
2370				__u64 ret;
2371				__u32 longmode;
2372				__u32 pad;
2373			} hypercall;
2374	
2375	Unused.  This was once used for 'hypercall to userspace'.  To implement
2376	such functionality, use KVM_EXIT_IO (x86) or KVM_EXIT_MMIO (all except s390).
2377	Note KVM_EXIT_IO is significantly faster than KVM_EXIT_MMIO.
2378	
2379			/* KVM_EXIT_TPR_ACCESS */
2380			struct {
2381				__u64 rip;
2382				__u32 is_write;
2383				__u32 pad;
2384			} tpr_access;
2385	
2386	To be documented (KVM_TPR_ACCESS_REPORTING).
2387	
2388			/* KVM_EXIT_S390_SIEIC */
2389			struct {
2390				__u8 icptcode;
2391				__u64 mask; /* psw upper half */
2392				__u64 addr; /* psw lower half */
2393				__u16 ipa;
2394				__u32 ipb;
2395			} s390_sieic;
2396	
2397	s390 specific.
2398	
2399			/* KVM_EXIT_S390_RESET */
2400	#define KVM_S390_RESET_POR       1
2401	#define KVM_S390_RESET_CLEAR     2
2402	#define KVM_S390_RESET_SUBSYSTEM 4
2403	#define KVM_S390_RESET_CPU_INIT  8
2404	#define KVM_S390_RESET_IPL       16
2405			__u64 s390_reset_flags;
2406	
2407	s390 specific.
2408	
2409			/* KVM_EXIT_S390_UCONTROL */
2410			struct {
2411				__u64 trans_exc_code;
2412				__u32 pgm_code;
2413			} s390_ucontrol;
2414	
2415	s390 specific. A page fault has occurred for a user controlled virtual
2416	machine (KVM_VM_S390_UCONTROL) on its host page table that cannot be
2417	resolved by the kernel.
2418	The program code and the translation exception code that were placed
2419	in the cpu's lowcore are presented here as defined by the z/Architecture
2420	Principles of Operation Book in the Chapter for Dynamic Address Translation
2421	(DAT).
2422	
2423			/* KVM_EXIT_DCR */
2424			struct {
2425				__u32 dcrn;
2426				__u32 data;
2427				__u8  is_write;
2428			} dcr;
2429	
2430	powerpc specific.
2431	
2432			/* KVM_EXIT_OSI */
2433			struct {
2434				__u64 gprs[32];
2435			} osi;
2436	
2437	MOL uses a special hypercall interface it calls 'OSI'. To enable it, we catch
2438	hypercalls and exit with this exit struct that contains all the guest gprs.
2439	
2440	If exit_reason is KVM_EXIT_OSI, then the vcpu has triggered such a hypercall.
2441	Userspace can now handle the hypercall and when it's done modify the gprs as
2442	necessary. Upon guest entry all guest GPRs will then be replaced by the values
2443	in this struct.
2444	
2445			/* KVM_EXIT_PAPR_HCALL */
2446			struct {
2447				__u64 nr;
2448				__u64 ret;
2449				__u64 args[9];
2450			} papr_hcall;
2451	
2452	This is used on 64-bit PowerPC when emulating a pSeries partition,
2453	e.g. with the 'pseries' machine type in qemu.  It occurs when the
2454	guest does a hypercall using the 'sc 1' instruction.  The 'nr' field
2455	contains the hypercall number (from the guest R3), and 'args' contains
2456	the arguments (from the guest R4 - R12).  Userspace should put the
2457	return code in 'ret' and any extra returned values in args[].
2458	The possible hypercalls are defined in the Power Architecture Platform
2459	Requirements (PAPR) document available from www.power.org (free
2460	developer registration required to access it).
2461	
2462			/* KVM_EXIT_S390_TSCH */
2463			struct {
2464				__u16 subchannel_id;
2465				__u16 subchannel_nr;
2466				__u32 io_int_parm;
2467				__u32 io_int_word;
2468				__u32 ipb;
2469				__u8 dequeued;
2470			} s390_tsch;
2471	
2472	s390 specific. This exit occurs when KVM_CAP_S390_CSS_SUPPORT has been enabled
2473	and TEST SUBCHANNEL was intercepted. If dequeued is set, a pending I/O
2474	interrupt for the target subchannel has been dequeued and subchannel_id,
2475	subchannel_nr, io_int_parm and io_int_word contain the parameters for that
2476	interrupt. ipb is needed for instruction parameter decoding.
2477	
2478			/* KVM_EXIT_EPR */
2479			struct {
2480				__u32 epr;
2481			} epr;
2482	
2483	On FSL BookE PowerPC chips, the interrupt controller has a fast
2484	interrupt acknowledge path to the core. When the core successfully
2485	delivers an interrupt, it automatically populates the EPR register with
2486	the interrupt vector number and acknowledges the interrupt inside
2487	the interrupt controller.
2488	
2489	In case the interrupt controller lives in user space, we need to do
2490	the interrupt acknowledge cycle through it to fetch the next to be
2491	delivered interrupt vector using this exit.
2492	
2493	It gets triggered whenever KVM_CAP_PPC_EPR is enabled and an
2494	external interrupt has just been delivered into the guest. User space
2495	should put the acknowledged interrupt vector into the 'epr' field.
2496	
2497			/* Fix the size of the union. */
2498			char padding[256];
2499		};
2500	
2501		/*
2502		 * shared registers between kvm and userspace.
2503		 * kvm_valid_regs specifies the register classes set by the host
2504		 * kvm_dirty_regs specifies the register classes dirtied by userspace
2505		 * struct kvm_sync_regs is architecture specific, as well as the
2506		 * bits for kvm_valid_regs and kvm_dirty_regs
2507		 */
2508		__u64 kvm_valid_regs;
2509		__u64 kvm_dirty_regs;
2510		union {
2511			struct kvm_sync_regs regs;
2512			char padding[1024];
2513		} s;
2514	
2515	If KVM_CAP_SYNC_REGS is defined, these fields allow userspace to access
2516	certain guest registers without having to call SET/GET_*REGS. Thus we can
2517	avoid some system call overhead if userspace has to handle the exit.
2518	Userspace can query the validity of the structure by checking
2519	kvm_valid_regs for specific bits. These bits are architecture specific
2520	and usually define the validity of a group of registers (e.g. one bit
2521	for general purpose registers).
2522	
2523	};
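
As an illustration of the KVM_EXIT_IO description above, the sketch
below locates the packed data array from data_offset and computes its
length.  struct exit_io is a local mirror of the "io" member of
kvm_run, and both helper names are hypothetical; real code would work
on the mmap()ed struct kvm_run directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the KVM_EXIT_IO member of struct kvm_run. */
struct exit_io {
	uint8_t  direction;
	uint8_t  size;		/* bytes per access */
	uint16_t port;
	uint32_t count;		/* number of accesses */
	uint64_t data_offset;	/* relative to kvm_run start */
};

/* Start of the packed data array inside the mmap()ed kvm_run area. */
static uint8_t *io_data(void *run_base, const struct exit_io *io)
{
	assert(run_base && io);
	return (uint8_t *)run_base + io->data_offset;
}

/* Total data length: "count" accesses of "size" bytes each. */
static size_t io_data_len(const struct exit_io *io)
{
	assert(io);
	return (size_t)io->size * io->count;
}
```

For KVM_EXIT_IO_OUT the bytes at that location are read by userspace;
for KVM_EXIT_IO_IN userspace fills them in before calling KVM_RUN
again.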
2524	
2525	
2526	6. Capabilities that can be enabled
2527	-----------------------------------
2528	
2529	There are certain capabilities that change the behavior of the virtual CPU when
2530	enabled. To enable them, please see section 4.37. Below you can find a list of
2531	capabilities and what their effect on the vCPU is when enabling them.
2532	
2533	The following information is provided along with the description:
2534	
2535	  Architectures: which instruction set architectures provide this capability.
2536	      x86 includes both i386 and x86_64.
2537	
2538	  Parameters: what parameters are accepted by the capability.
2539	
2540	  Returns: the return value.  General error numbers (EBADF, ENOMEM, EINVAL)
2541	      are not detailed, but errors with specific meanings are.
2542	
2543	
2544	6.1 KVM_CAP_PPC_OSI
2545	
2546	Architectures: ppc
2547	Parameters: none
2548	Returns: 0 on success; -1 on error
2549	
2550	This capability enables interception of OSI hypercalls that otherwise would
2551	be treated as normal system calls to be injected into the guest. OSI hypercalls
2552	were invented by Mac-on-Linux to have a standardized communication mechanism
2553	between the guest and the host.
2554	
2555	When this capability is enabled, KVM_EXIT_OSI can occur.
2556	
2557	
2558	6.2 KVM_CAP_PPC_PAPR
2559	
2560	Architectures: ppc
2561	Parameters: none
2562	Returns: 0 on success; -1 on error
2563	
2564	This capability enables interception of PAPR hypercalls. PAPR hypercalls are
2565	done using the hypercall instruction "sc 1".
2566	
2567	It also sets the guest privilege level to "supervisor" mode. Usually the guest
2568	runs in "hypervisor" privilege mode with a few missing features.
2569	
2570	In addition to the above, it changes the semantics of SDR1. In this mode, the
2571	HTAB address part of SDR1 contains an HVA instead of a GPA, as PAPR keeps the
2572	HTAB invisible to the guest.
2573	
2574	When this capability is enabled, KVM_EXIT_PAPR_HCALL can occur.
2575	
2576	
2577	6.3 KVM_CAP_SW_TLB
2578	
2579	Architectures: ppc
2580	Parameters: args[0] is the address of a struct kvm_config_tlb
2581	Returns: 0 on success; -1 on error
2582	
2583	struct kvm_config_tlb {
2584		__u64 params;
2585		__u64 array;
2586		__u32 mmu_type;
2587		__u32 array_len;
2588	};
2589	
2590	Configures the virtual CPU's TLB array, establishing a shared memory area
2591	between userspace and KVM.  The "params" and "array" fields are userspace
2592	addresses of mmu-type-specific data structures.  The "array_len" field is a
2593	safety mechanism, and should be set to the size in bytes of the memory that
2594	userspace has reserved for the array.  It must be at least the size dictated
2595	by "mmu_type" and "params".
2596	
2597	While KVM_RUN is active, the shared region is under control of KVM.  Its
2598	contents are undefined, and any modification by userspace results in
2599	boundedly undefined behavior.
2600	
2601	On return from KVM_RUN, the shared region will reflect the current state of
2602	the guest's TLB.  If userspace makes any changes, it must call KVM_DIRTY_TLB
2603	to tell KVM which entries have been changed, prior to calling KVM_RUN again
2604	on this vcpu.
2605	
2606	For mmu types KVM_MMU_FSL_BOOKE_NOHV and KVM_MMU_FSL_BOOKE_HV:
2607	 - The "params" field is of type "struct kvm_book3e_206_tlb_params".
2608	 - The "array" field points to an array of type "struct
2609	   kvm_book3e_206_tlb_entry".
2610	 - The array consists of all entries in the first TLB, followed by all
2611	   entries in the second TLB.
2612	 - Within a TLB, entries are ordered first by increasing set number.  Within a
2613	   set, entries are ordered by way (increasing ESEL).
2614	 - The hash for determining set number in TLB0 is: (MAS2 >> 12) & (num_sets - 1)
2615	   where "num_sets" is the tlb_sizes[] value divided by the tlb_ways[] value.
2616	 - The tsize field of mas1 shall be set to 4K on TLB0, even though the
2617	   hardware ignores this value for TLB0.
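
Taken together, the ordering rules and the set hash above suggest the
following way to compute an entry's position within the TLB0 part of
the shared array.  This is a sketch of one reading of those rules:
tlb0_index() is a hypothetical helper, "entries" and "ways" are assumed
to come from tlb_sizes[0] and tlb_ways[0], and the number of sets is
assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Index of a TLB0 entry in the shared array: sets are laid out in
 * increasing order, and each set holds "ways" consecutive entries.
 */
static uint32_t tlb0_index(uint64_t mas2, uint32_t entries, uint32_t ways,
			   uint32_t way)
{
	uint32_t num_sets, set;

	assert(ways != 0 && way < ways);
	num_sets = entries / ways;
	assert(num_sets != 0 && (num_sets & (num_sets - 1)) == 0);

	/* Set hash from the text: (MAS2 >> 12) & (num_sets - 1). */
	set = (uint32_t)(mas2 >> 12) & (num_sets - 1);
	return set * ways + way;
}
```

Entries of the second TLB would follow after all "entries" slots of the
first one.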
2618	
2619	6.4 KVM_CAP_S390_CSS_SUPPORT
2620	
2621	Architectures: s390
2622	Parameters: none
2623	Returns: 0 on success; -1 on error
2624	
2625	This capability enables support for handling of channel I/O instructions.
2626	
2627	TEST PENDING INTERRUPTION and the interrupt portion of TEST SUBCHANNEL are
2628	handled in-kernel, while the other I/O instructions are passed to userspace.
2629	
2630	When this capability is enabled, KVM_EXIT_S390_TSCH will occur on TEST
2631	SUBCHANNEL intercepts.
2632	
2633	6.5 KVM_CAP_PPC_EPR
2634	
2635	Architectures: ppc
2636	Parameters: args[0] defines whether the proxy facility is active
2637	Returns: 0 on success; -1 on error
2638	
2639	This capability enables or disables the delivery of interrupts through the
2640	external proxy facility.
2641	
2642	When enabled (args[0] != 0), every time the guest gets an external interrupt
2643	delivered, it automatically exits into user space with a KVM_EXIT_EPR exit
2644	to receive the topmost interrupt vector.
2645	
2646	When disabled (args[0] == 0), behavior is as if this facility is unsupported.
2647	
2648	When this capability is enabled, KVM_EXIT_EPR can occur.
Information is copyright its respective author. All material is available from the Linux Kernel Source distributed under a GPL License. This page is provided as a free service by mjmwired.net.