From: Carlos López The TPR_THRESHOLD field in the VMCS is used by VMX to induce VM exits when the guest's virtual TPR falls under the specified threshold, allowing KVM to inject previously masked interrupts. KVM handles these VM exits in handle_tpr_below_threshold(). Commit eb90f3417a0c ("KVM: vmx: speed up TPR below threshold vmexits") optimized this function by calling apic_update_ppr() instead of raising KVM_REQ_EVENT. apic_update_ppr() then raises KVM_REQ_EVENT if there is a pending, deliverable interrupt. However, if there are no new interrupts pending, apic_update_ppr() does not issue the request. Thus, kvm_lapic_update_cr8_intercept() and vmx_update_cr8_intercept() are not called before VM entry, which results in a high, stale TPR_THRESHOLD. This is problematic due to the following sentence in 28.2.1.1 "VM-Execution Control Fields" in the SDM: The following check is performed if the “use TPR shadow” VM-execution control is 1 and the “virtualize APIC accesses” and “virtual-interrupt delivery” VM-execution controls are both 0: the value of bits 3:0 of the TPR threshold VM-execution control field should not be greater than the value of bits 7:4 of VTPR. This error condition is typically not observed when KVM runs on a bare metal system because modern processors support APICv, which enables virtual-interrupt delivery, and which KVM uses when possible. This causes the processor to no longer generate TPR-below-threshold exits and to no longer check TPR_THRESHOLD on entry. However, when running on older platforms, or under nested virtualization on a hypervisor that does not support virtual-interrupt delivery and enforces this check (like Hyper-V) this can cause a VM entry failure with hardware error 0x7, as seen in [1]. Call kvm_lapic_update_cr8_intercept() if apic_update_ppr() does not find a deliverable interrupt (and thus does not raise KVM_REQ_EVENT). Remove calls to kvm_lapic_update_cr8_intercept() on paths that end up in apic_update_ppr(), as they now become redundant. This ensures that any path that updates the guest's PPR also figures out if KVM needs to wait for a TPR change (using TPR_THRESHOLD on VMX or CR8 intercepts on SVM). Link: https://github.com/coconut-svsm/svsm/issues/1081 [1] Tested-by: Stefano Garzarella Cc: stable@vger.kernel.org Fixes: eb90f3417a0c ("KVM: vmx: speed up TPR below threshold vmexits") Signed-off-by: Carlos López Signed-off-by: Sean Christopherson --- arch/x86/kvm/lapic.c | 2 ++ arch/x86/kvm/x86.c | 5 +---- 2 files changed, 3 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index 9d2df8623f6d..f6a289d01a26 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -980,6 +980,8 @@ static void apic_update_ppr(struct kvm_lapic *apic) if (__apic_update_ppr(apic, &ppr) && apic_has_interrupt_for_ppr(apic, ppr) != -1) kvm_make_request(KVM_REQ_EVENT, apic->vcpu); + else + kvm_lapic_update_cr8_intercept(apic->vcpu); } void kvm_apic_update_ppr(struct kvm_vcpu *vcpu) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index d9d51803b7b2..96c465040756 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -5317,7 +5317,6 @@ static int kvm_vcpu_ioctl_set_lapic(struct kvm_vcpu *vcpu, r = kvm_apic_set_state(vcpu, s); if (r) return r; - kvm_lapic_update_cr8_intercept(vcpu); return 0; } @@ -12418,8 +12417,6 @@ static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs, kvm_register_mark_dirty(vcpu, VCPU_REG_CR3); kvm_x86_call(post_set_cr3)(vcpu, sregs->cr3); - kvm_set_cr8(vcpu, sregs->cr8); - *mmu_reset_needed |= vcpu->arch.efer != sregs->efer; kvm_x86_call(set_efer)(vcpu, sregs->efer); @@ -12448,7 +12445,7 @@ static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs, kvm_set_segment(vcpu, &sregs->tr, VCPU_SREG_TR); kvm_set_segment(vcpu, &sregs->ldt, VCPU_SREG_LDTR); - kvm_lapic_update_cr8_intercept(vcpu); + kvm_set_cr8(vcpu, sregs->cr8); /* Older userspace won't unhalt the vcpu on reset. */ if (kvm_vcpu_is_bsp(vcpu) && kvm_rip_read(vcpu) == 0xfff0 && -- 2.55.0.rc0.738.g0c8ab3ebcc-goog