KVM: nVMX: Fix interrupt window request with "Acknowledge interrupt on exit"
authorWanpeng Li <wanpeng.li@hotmail.com>
Tue, 1 Aug 2017 02:25:27 +0000 (19:25 -0700)
committerRadim Krčmář <rkrcmar@redhat.com>
Thu, 3 Aug 2017 13:38:11 +0000 (15:38 +0200)
------------[ cut here ]------------
 WARNING: CPU: 5 PID: 2288 at arch/x86/kvm/vmx.c:11124 nested_vmx_vmexit+0xd64/0xd70 [kvm_intel]
 CPU: 5 PID: 2288 Comm: qemu-system-x86 Not tainted 4.13.0-rc2+ #7
 RIP: 0010:nested_vmx_vmexit+0xd64/0xd70 [kvm_intel]
Call Trace:
  vmx_check_nested_events+0x131/0x1f0 [kvm_intel]
  ? vmx_check_nested_events+0x131/0x1f0 [kvm_intel]
  kvm_arch_vcpu_ioctl_run+0x5dd/0x1be0 [kvm]
  ? vmx_vcpu_load+0x1be/0x220 [kvm_intel]
  ? kvm_arch_vcpu_load+0x62/0x230 [kvm]
  kvm_vcpu_ioctl+0x340/0x700 [kvm]
  ? kvm_vcpu_ioctl+0x340/0x700 [kvm]
  ? __fget+0xfc/0x210
  do_vfs_ioctl+0xa4/0x6a0
  ? __fget+0x11d/0x210
  SyS_ioctl+0x79/0x90
  do_syscall_64+0x8f/0x750
  ? trace_hardirqs_on_thunk+0x1a/0x1c
  entry_SYSCALL64_slow_path+0x25/0x25

This can be reproduced by booting L1 guest w/ 'noapic' grub parameter, which
means that tells the kernel to not make use of any IOAPICs that may be present
in the system.

Actually external_intr variable in nested_vmx_vmexit() is the req_int_win
variable passed from vcpu_enter_guest() which means that the L0's userspace
requests an irq window. I observed the scenario (!kvm_cpu_has_interrupt(vcpu) &&
L0's userspace reqeusts an irq window) is true, so there is no interrupt which
L1 requires to inject to L2, we should not attempt to emualte "Acknowledge
interrupt on exit" for the irq window requirement in this scenario.

This patch fixes it by not attempt to emulate "Acknowledge interrupt on exit"
if there is no L1 requirement to inject an interrupt to L2.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
[Added code comment to make it obvious that the behavior is not correct.
 We should do a userspace exit with open interrupt window instead of the
 nested VM exit.  This patch still improves the behavior, so it was
 accepted as a (temporary) workaround.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
arch/x86/kvm/vmx.c

index cbad87e7e829035829599dae4319d65f2c876cf7..9b21b12230354e334900e6536b7612285f75b7e3 100644 (file)
@@ -11131,8 +11131,15 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
 
        vmx_switch_vmcs(vcpu, &vmx->vmcs01);
 
-       if ((exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
-           && nested_exit_intr_ack_set(vcpu)) {
+       /*
+        * TODO: SDM says that with acknowledge interrupt on exit, bit 31 of
+        * the VM-exit interrupt information (valid interrupt) is always set to
+        * 1 on EXIT_REASON_EXTERNAL_INTERRUPT, so we shouldn't need
+        * kvm_cpu_has_interrupt().  See the commit message for details.
+        */
+       if (nested_exit_intr_ack_set(vcpu) &&
+           exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT &&
+           kvm_cpu_has_interrupt(vcpu)) {
                int irq = kvm_cpu_get_interrupt(vcpu);
                WARN_ON(irq < 0);
                vmcs12->vm_exit_intr_info = irq |