aboutsummaryrefslogtreecommitdiffstats
path: root/arch/x86/kvm
Commit message (Collapse)AuthorAgeFilesLines
* KVM: Convert kvm_lock to raw_spinlockJan Kiszka2011-03-172-4/+4
| | | | | | | | Code under this lock requires non-preemptibility. Ensure this also over -rt by converting it to raw spinlock. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: SVM: check for progress after IRET interceptionAvi Kivity2011-03-171-1/+9
| | | | | | | | | | | | | | | | | | When we enable an NMI window, we ask for an IRET intercept, since the IRET re-enables NMIs. However, the IRET intercept happens before the instruction executes, while the NMI window architecturally opens afterwards. To compensate for this mismatch, we only open the NMI window in the following exit, assuming that the IRET has by then executed; however, this assumption is not always correct; we may exit due to a host interrupt or page fault, without having executed the instruction. Fix by checking for forward progress by recording and comparing the IRET's rip. This is somewhat of a hack, since an unchaging rip does not mean that no forward progress has been made, but is the simplest fix for now. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Fix race between nmi injection and enabling nmi windowAvi Kivity2011-03-171-1/+3
| | | | | | | | | | | | | | | | | | | | | The interrupt injection logic looks something like if an nmi is pending, and nmi injection allowed inject nmi if an nmi is pending request exit on nmi window the problem is that "nmi is pending" can be set asynchronously by the PIT; if it happens to fire between the two if statements, we will request an nmi window even though nmi injection is allowed. On SVM, this has disasterous results, since it causes eflags.TF to be set in random guest code. The fix is simple; make nmi_pending synchronous using the standard vcpu->requests mechanism; this ensures the code above is completely synchronous wrt nmi_pending. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Drop ad-hoc vendor specific instruction restrictionAvi Kivity2011-03-171-28/+5
| | | | | | | Use the new support in the emulator, and drop the ad-hoc code in x86.c. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: x86 emulator: vendor specific instructionsAvi Kivity2011-03-171-3/+9
| | | | | | | | | Mark some instructions as vendor specific, and allow the caller to request emulation only of vendor specific instructions. This is useful in some circumstances (responding to a #UD fault). Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: Drop bogus x86_decode_insn() error checkAvi Kivity2011-03-171-3/+0
| | | | | | | | | | x86_decode_insn() doesn't return X86EMUL_* values, so the check for X86EMUL_PROPOGATE_FAULT will always fail. There is a proper check later on, so there is no need for a replacement for this code. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: x86: Drop obsolete warning about INIT on runnable VCPUJan Kiszka2011-03-171-4/+0
| | | | | | | | | | This warning was once used for debugging QEMU user space. Though uncommon, it is actually possible to send an INIT request to a running VCPU. So better drop this warning before someone misuses it to flood kernel logs this way. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: x86: release kvmclock page on resetGlauber Costa2011-03-171-8/+12
| | | | | | | | | | When a vcpu is reset, kvmclock page keeps being written to this days. This is wrong and inconsistent: a cpu reset should take it to its initial state. Signed-off-by: Glauber Costa <glommer@redhat.com> CC: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: x86: handle guest access to BBL_CR_CTL3 MSRjohn cooper2011-03-171-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A correction to Intel cpu model CPUID data (patch queued) caused winxp to BSOD when booted with a Penryn model. This was traced to the CPUID "model" field correction from 6 -> 23 (as is proper for a Penryn class of cpu). Only in this case does the problem surface. The cause for this failure is winxp accessing the BBL_CR_CTL3 MSR which is unsupported by current kvm, appears to be a legacy MSR not fully characterized yet existing in current silicon, and is apparently carried forward in MSR space to accommodate vintage code as here. It is not yet conclusive whether this MSR implements any of its legacy functionality or is just an ornamental dud for compatibility. While I found no silicon version specific documentation link to this MSR, a general description exists in Intel's developer's reference which agrees with the functional behavior of other bootloader/kernel code I've examined accessing BBL_CR_CTL3. Regrettably winxp appears to be setting bit #19 called out as "reserved" in the above document. So to minimally accommodate this MSR, kvm msr get will provide the equivalent mock data and kvm msr write will simply toss the guest passed data without interpretation. While this treatment of BBL_CR_CTL3 addresses the immediate problem, the approach may be modified pending clarification from Intel. Signed-off-by: john cooper <john.cooper@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: Add "exiting guest mode" stateXiao Guangrong2011-03-171-6/+10
| | | | | | | | | | | | | | | | Currently we keep track of only two states: guest mode and host mode. This patch adds an "exiting guest mode" state that tells us that an IPI will happen soon, so unless we need to wait for the IPI, we can avoid it completely. Also 1: No need atomically to read/write ->mode in vcpu's thread 2: reorganize struct kvm_vcpu to make ->mode and ->requests in the same cache line explicitly Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86: Remove user space triggerable MCE error messageJan Kiszka2011-03-171-3/+0
| | | | | | | | This case is a pure user space error we do not need to record. Moreover, it can be misused to flood the kernel log. Remove it. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: fix rcu usage warning in kvm_arch_vcpu_ioctl_set_sregs()Xiao Guangrong2011-03-171-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | Fix: [ 1001.499596] =================================================== [ 1001.499599] [ INFO: suspicious rcu_dereference_check() usage. ] [ 1001.499601] --------------------------------------------------- [ 1001.499604] include/linux/kvm_host.h:301 invoked rcu_dereference_check() without protection! ...... [ 1001.499636] Pid: 6035, comm: qemu-system-x86 Not tainted 2.6.37-rc6+ #62 [ 1001.499638] Call Trace: [ 1001.499644] [] lockdep_rcu_dereference+0x9d/0xa5 [ 1001.499653] [] gfn_to_memslot+0x8d/0xc8 [kvm] [ 1001.499661] [] gfn_to_hva+0x16/0x3f [kvm] [ 1001.499669] [] kvm_read_guest_page+0x1e/0x5e [kvm] [ 1001.499681] [] kvm_read_guest_page_mmu+0x53/0x5e [kvm] [ 1001.499699] [] load_pdptrs+0x3f/0x9c [kvm] [ 1001.499705] [] ? vmx_set_cr0+0x507/0x517 [kvm_intel] [ 1001.499717] [] kvm_arch_vcpu_ioctl_set_sregs+0x1f3/0x3c0 [kvm] [ 1001.499727] [] kvm_vcpu_ioctl+0x6a5/0xbc5 [kvm] Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: VMX: Avoid atomic operation in vmx_vcpu_runAvi Kivity2011-03-171-2/+5
| | | | | | | | | Instead of exchanging the guest and host rcx, have separate storage for each. This allows us to avoid using the xchg instruction, which is is a little slower than normal operations. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: VMX: Simplify saving guest rcx in vmx_vcpu_runAvi Kivity2011-03-171-2/+2
| | | | | | | | | | | | | | | | | Change push top-of-stack pop guest-rcx pop dummy to pop guest-rcx which is the same thing, only simpler. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: VMX: increase ple_gap default to 128Rik van Riel2011-03-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On some CPUs, a ple_gap of 41 is simply insufficient to ever trigger PLE exits, even with the minimalistic PLE test from kvm-unit-tests. http://git.kernel.org/?p=virt/kvm/kvm-unit-tests.git;a=commitdiff;h=eda71b28fa122203e316483b35f37aaacd42f545 For example, the Xeon X5670 CPU needs a ple_gap of at least 48 in order to get pause loop exits: # modprobe kvm_intel ple_gap=47 # taskset 1 /usr/local/bin/qemu-system-x86_64 \ -device testdev,chardev=log -chardev stdio,id=log \ -kernel x86/vmexit.flat -append ple-round-robin -smp 2 VNC server running on `::1:5900' enabling apic enabling apic ple-round-robin 58298446 # rmmod kvm_intel # modprobe kvm_intel ple_gap=48 # taskset 1 /usr/local/bin/qemu-system-x86_64 \ -device testdev,chardev=log -chardev stdio,id=log \ -kernel x86/vmexit.flat -append ple-round-robin -smp 2 VNC server running on `::1:5900' enabling apic enabling apic ple-round-robin 36616 Increase the ple_gap to 128 to be on the safe side. Signed-off-by: Rik van Riel <riel@redhat.com> Acked-by: Zhai, Edwin <edwin.zhai@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: SVM: Add support for perf-kvmJoerg Roedel2011-03-171-2/+10
| | | | | | | | This patch adds the necessary code to run perf-kvm on AMD machines. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Avoid leaking fake realmode state to userspaceAvi Kivity2011-03-171-7/+36
| | | | | | | | | | | | | When emulating real mode, we fake some state: - tr.base points to a fake vm86 tss - segment registers are made to conform to vm86 restrictions change vmx_get_segment() not to expose this fake state to userspace; instead, return the original state. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: VMX: Save and restore tr selector across mode switchesAvi Kivity2011-03-171-0/+2
| | | | | | | | | | | | | When emulating real mode we play with tr hidden state, but leave tr.selector alone. That works well, except for save/restore, since loading TR writes it to the hidden state in vmx->rmode. Fix by also saving and restoring the tr selector; this makes things more consistent and allows migration to work during the early boot stages of Windows XP. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: MMU: Don't flush shadow when enabling dirty trackingAvi Kivity2011-03-171-4/+8
| | | | | | | Instead, drop large mappings, which were the reason we dropped shadow. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* tracing: Fix event alignment: kvm:kvm_hv_hypercallDavid Sharp2011-03-101-4/+4
| | | | | | | Acked-by: Avi Kivity <avi@redhat.com> Signed-off-by: David Sharp <dhsharp@google.com> LKML-Reference: <1291421609-14665-8-git-send-email-dhsharp@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* KVM: SVM: Advance instruction pointer in dr_interceptJoerg Roedel2011-02-221-0/+2
| | | | | | | | | | | In the dr_intercept function a new cpu-feature called decode-assists is implemented and used when available. This code-path does not advance the guest-rip causing the guest to dead-loop over mov-dr instructions. This is fixed by this patch. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: SVM: Make sure KERNEL_GS_BASE is valid when loading gs_indexJoerg Roedel2011-02-091-1/+1
| | | | | | | | | | | | | | | | | The gs_index loading code uses the swapgs instruction to switch to the user gs_base temporarily. This is unsave in an lightweight exit-path in KVM on AMD because the KERNEL_GS_BASE MSR is switches lazily. An NMI happening in the critical path of load_gs_index may use the wrong GS_BASE value then leading to unpredictable behavior, e.g. a triple-fault. This patch fixes the issue by making sure that load_gs_index is called only with a valid KERNEL_GS_BASE value loaded in KVM. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* thp: mmu_notifier_test_youngAndrea Arcangeli2011-01-131-0/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For GRU and EPT, we need gup-fast to set referenced bit too (this is why it's correct to return 0 when shadow_access_mask is zero, it requires gup-fast to set the referenced bit). qemu-kvm access already sets the young bit in the pte if it isn't zero-copy, if it's zero copy or a shadow paging EPT minor fault we relay on gup-fast to signal the page is in use... We also need to check the young bits on the secondary pagetables for NPT and not nested shadow mmu as the data may never get accessed again by the primary pte. Without this closer accuracy, we'd have to remove the heuristic that avoids collapsing hugepages in hugepage virtual regions that have not even a single subpage in use. ->test_young is full backwards compatible with GRU and other usages that don't have young bits in pagetables set by the hardware and that should nuke the secondary mmu mappings when ->clear_flush_young runs just like EPT does. Removing the heuristic that checks the young bit in khugepaged/collapse_huge_page completely isn't so bad either probably but I thought it was worth it and this makes it reliable. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* thp: kvm mmu transparent hugepage supportAndrea Arcangeli2011-01-132-17/+83
| | | | | | | | | | | This should work for both hugetlbfs and transparent hugepages. [akpm@linux-foundation.org: bring forward PageTransCompound() addition for bisectability] Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Avi Kivity <avi@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'kvm-updates/2.6.38' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2011-01-1312-859/+1486
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'kvm-updates/2.6.38' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (142 commits) KVM: Initialize fpu state in preemptible context KVM: VMX: when entering real mode align segment base to 16 bytes KVM: MMU: handle 'map_writable' in set_spte() function KVM: MMU: audit: allow audit more guests at the same time KVM: Fetch guest cr3 from hardware on demand KVM: Replace reads of vcpu->arch.cr3 by an accessor KVM: MMU: only write protect mappings at pagetable level KVM: VMX: Correct asm constraint in vmcs_load()/vmcs_clear() KVM: MMU: Initialize base_role for tdp mmus KVM: VMX: Optimize atomic EFER load KVM: VMX: Add definitions for more vm entry/exit control bits KVM: SVM: copy instruction bytes from VMCB KVM: SVM: implement enhanced INVLPG intercept KVM: SVM: enhance mov DR intercept handler KVM: SVM: enhance MOV CR intercept handler KVM: SVM: add new SVM feature bit names KVM: cleanup emulate_instruction KVM: move complete_insn_gp() into x86.c KVM: x86: fix CR8 handling KVM guest: Fix kvm clock initialization when it's configured out ...
| * KVM: Initialize fpu state in preemptible contextAvi Kivity2011-01-121-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | init_fpu() (which is indirectly called by the fpu switching code) assumes it is in process context. Rather than makeing init_fpu() use an atomic allocation, which can cause a task to be killed, make sure the fpu is already initialized when we enter the run loop. KVM-Stable-Tag. Reported-and-tested-by: Kirill A. Shutemov <kas@openvz.org> Acked-by: Pekka Enberg <penberg@kernel.org> Reviewed-by: Christoph Lameter <cl@linux.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: VMX: when entering real mode align segment base to 16 bytesGleb Natapov2011-01-121-1/+5
| | | | | | | | | | | | | | | | VMX checks that base is equal segment shifted 4 bits left. Otherwise guest entry fails. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: MMU: handle 'map_writable' in set_spte() functionXiao Guangrong2011-01-122-11/+4
| | | | | | | | | | | | | | | | | | Move the operation of 'writable' to set_spte() to clean up code [avi: remove unneeded booleanification] Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: MMU: audit: allow audit more guests at the same timeXiao Guangrong2011-01-122-30/+35
| | | | | | | | | | | | | | | | | | | | | | | | It only allows to audit one guest in the system since: - 'audit_point' is a glob variable - mmu_audit_disable() is called in kvm_mmu_destroy(), so audit is disabled after a guest exited this patch fix those issues then allow to audit more guests at the same time Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Fetch guest cr3 from hardware on demandAvi Kivity2011-01-124-6/+21
| | | | | | | | | | | | | | | | | | | | Instead of syncing the guest cr3 every exit, which is expensince on vmx with ept enabled, sync it only on demand. [sheng: fix incorrect cr3 seen by Windows XP] Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Replace reads of vcpu->arch.cr3 by an accessorAvi Kivity2011-01-125-20/+27
| | | | | | | | | | | | This allows us to keep cr3 in the VMCS, later on. Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: MMU: only write protect mappings at pagetable levelMarcelo Tosatti2011-01-121-0/+3
| | | | | | | | | | | | | | | | | | | | | | If a pagetable contains a writeable large spte, all of its sptes will be write protected, including non-leaf ones, leading to endless pagefaults. Do not write protect pages above PT_PAGE_TABLE_LEVEL, as the spte fault paths assume non-leaf sptes are writable. Tested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: VMX: Correct asm constraint in vmcs_load()/vmcs_clear()Avi Kivity2011-01-121-2/+2
| | | | | | | | | | | | | | | | 'error' is byte sized, so use a byte register constraint. Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: MMU: Initialize base_role for tdp mmusAvi Kivity2011-01-121-0/+1
| | | | | | | | | | Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: VMX: Optimize atomic EFER loadAvi Kivity2011-01-121-0/+30
| | | | | | | | | | | | | | | | | | When NX is enabled on the host but not on the guest, we use the entry/exit msr load facility, which is slow. Optimize it to use entry/exit efer load, which is ~1200 cycles faster. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: SVM: copy instruction bytes from VMCBAndre Przywara2011-01-125-9/+17
| | | | | | | | | | | | | | | | | | | | | | In case of a nested page fault or an intercepted #PF newer SVM implementations provide a copy of the faulting instruction bytes in the VMCB. Use these bytes to feed the instruction emulator and avoid the costly guest instruction fetch in this case. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: SVM: implement enhanced INVLPG interceptAndre Przywara2011-01-121-1/+6
| | | | | | | | | | | | | | | | | | | | When the DecodeAssist feature is available, the linear address is provided in the VMCB on INVLPG intercepts. Use it directly to avoid any decoding and emulation. This is only useful for shadow paging, though. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: SVM: enhance mov DR intercept handlerAndre Przywara2011-01-121-16/+40
| | | | | | | | | | | | | | | | | | | | Newer SVM implementations provide the GPR number in the VMCB, so that the emulation path is no longer necesarry to handle debug register access intercepts. Implement the handling in svm.c and use it when the info is provided. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: SVM: enhance MOV CR intercept handlerAndre Przywara2011-01-121-11/+79
| | | | | | | | | | | | | | | | | | | | Newer SVM implementations provide the GPR number in the VMCB, so that the emulation path is no longer necesarry to handle CR register access intercepts. Implement the handling in svm.c and use it when the info is provided. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: SVM: add new SVM feature bit namesAndre Przywara2011-01-121-0/+4
| | | | | | | | | | | | | | | | the recent APM Vol.2 and the recent AMD CPUID specification describe new CPUID features bits for SVM. Name them here for later usage. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: cleanup emulate_instructionAndre Przywara2011-01-124-20/+19
| | | | | | | | | | | | | | | | | | | | emulate_instruction had many callers, but only one used all parameters. One parameter was unused, another one is now hidden by a wrapper function (required for a future addition anyway), so most callers use now a shorter parameter list. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: move complete_insn_gp() into x86.cAndre Przywara2011-01-122-12/+13
| | | | | | | | | | | | | | | | move the complete_insn_gp() helper function out of the VMX part into the generic x86 part to make it usable by SVM. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: x86: fix CR8 handlingAndre Przywara2011-01-123-15/+14
| | | | | | | | | | | | | | | | | | | | The handling of CR8 writes in KVM is currently somewhat cumbersome. This patch makes it look like the other CR register handlers and fixes a possible issue in VMX, where the RIP would be incremented despite an injected #GP. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: Take missing slots_lock for kvm_io_bus_unregister_dev()Takuya Yoshikawa2011-01-121-0/+4
| | | | | | | | | | | | | | | | In KVM_CREATE_IRQCHIP, kvm_io_bus_unregister_dev() is called without taking slots_lock in the error handling path. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: return true when user space query KVM_CAP_USER_NMI extensionLai Jiangshan2011-01-121-0/+1
| | | | | | | | | | | | | | userspace may check this extension in runtime. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Correct kvm_pio tracepoint count fieldAvi Kivity2011-01-121-2/+2
| | | | | | | | | | | | Currently, we record '1' for count regardless of the real count. Fix. Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: MMU: Fix incorrect direct page write protection due to ro host pageAvi Kivity2011-01-121-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If KVM sees a read-only host page, it will map it as read-only to prevent breaking a COW. However, if the page was part of a large guest page, KVM incorrectly extends the write protection to the entire large page frame instead of limiting it to the normal host page. This results in the instantiation of a new shadow page with read-only access. If this happens for a MOVS instruction that moves memory between two normal pages, within a single large page frame, and mapped within the guest as a large page, and if, in addition, the source operand is not writeable in the host (perhaps due to KSM), then KVM will instantiate a read-only direct shadow page, instantiate an spte for the source operand, then instantiate a new read/write direct shadow page and instantiate an spte for the destination operand. Since these two sptes are in different shadow pages, MOVS will never see them at the same time and the guest will not make progress. Fix by mapping the direct shadow page read/write, and only marking the host page read-only. Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: SVM: Add xsetbv interceptJoerg Roedel2011-01-121-4/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements the xsetbv intercept to the AMD part of KVM. This makes AVX usable in a save way for the guest on AVX capable AMD hardware. The patch is tested by using AVX in the guest and host in parallel and checking for data corruption. I also used the KVM xsave unit-tests and they all pass. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: MMU: Make the way of accessing lpage_info more genericTakuya Yoshikawa2011-01-121-29/+25
| | | | | | | | | | | | | | | | | | | | | | | | Large page information has two elements but one of them, write_count, alone is accessed by a helper function. This patch replaces this helper function with more generic one which returns newly named kvm_lpage_info structure and use it to access the other element rmap_pde. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: VMX: add module parameter to avoid trapping HLT instructions (v5)Anthony Liguori2011-01-121-2/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | In certain use-cases, we want to allocate guests fixed time slices where idle guest cycles leave the machine idling. There are many approaches to achieve this but the most direct is to simply avoid trapping the HLT instruction which lets the guest directly execute the instruction putting the processor to sleep. Introduce this as a module-level option for kvm-vmx.ko since if you do this for one guest, you probably want to do it for all. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>