Immediately synchronize the user-return MSR values after a successful
VP.ENTER to minimize the window where KVM is tracking stale values in
the "curr" field, and to ensure the tracked value is synchronized before
IRQs are enabled.

This is *very* technically a bug fix, as a forced shutdown/reboot will
invoke kvm_shutdown() without waiting for tasks to be frozen, and so the
on_each_cpu() calls to kvm_disable_virtualization_cpu() will call
kvm_on_user_return() from IRQ context and thus could consume a stale
values->curr if the IRQ hits while KVM is active.

That said, the real motivation is to minimize the window where "curr" is
stale, as the same forced shutdown/reboot flaw has effectively existed
for all of non-TDX for years, since kvm_set_user_return_msr() runs with
IRQs enabled.  Not to mention that a stale MSR is the least of the
kernel's concerns if a reboot is forced while KVM is active.

Fixes: e0b4f31a3c65 ("KVM: TDX: restore user ret MSRs")
Cc: Yan Zhao
Cc: Xiaoyao Li
Cc: Rick Edgecombe
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx/tdx.c | 20 +++++++++++++-------
 arch/x86/kvm/vmx/tdx.h |  2 +-
 2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 326db9b9c567..2f3dfe9804b5 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -780,6 +780,14 @@ void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 	vt->msr_host_kernel_gs_base = read_msr(MSR_KERNEL_GS_BASE);
 
 	vt->guest_state_loaded = true;
+
+	/*
+	 * Several of KVM's user-return MSRs are clobbered by the TDX-Module if
+	 * VP.ENTER succeeds, i.e. on TD-Exit.  Mark those MSRs as needing an
+	 * update to synchronize the "current" value in KVM's cache with the
+	 * value in hardware (loaded by the TDX-Module).
+	 */
+	to_tdx(vcpu)->need_user_return_msr_sync = true;
 }
 
 struct tdx_uret_msr {
@@ -807,7 +815,6 @@ static void tdx_user_return_msr_update_cache(void)
 static void tdx_prepare_switch_to_host(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vt *vt = to_vt(vcpu);
-	struct vcpu_tdx *tdx = to_tdx(vcpu);
 
 	if (!vt->guest_state_loaded)
 		return;
@@ -815,11 +822,6 @@ static void tdx_prepare_switch_to_host(struct kvm_vcpu *vcpu)
 	++vcpu->stat.host_state_reload;
 	wrmsrl(MSR_KERNEL_GS_BASE, vt->msr_host_kernel_gs_base);
 
-	if (tdx->guest_entered) {
-		tdx_user_return_msr_update_cache();
-		tdx->guest_entered = false;
-	}
-
 	vt->guest_state_loaded = false;
 }
 
@@ -1059,7 +1061,11 @@ fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
 		update_debugctlmsr(vcpu->arch.host_debugctl);
 
 	tdx_load_host_xsave_state(vcpu);
-	tdx->guest_entered = true;
+
+	if (tdx->need_user_return_msr_sync) {
+		tdx_user_return_msr_update_cache();
+		tdx->need_user_return_msr_sync = false;
+	}
 
 	vcpu->arch.regs_avail &= TDX_REGS_AVAIL_SET;
 
diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
index ca39a9391db1..9434a6371d67 100644
--- a/arch/x86/kvm/vmx/tdx.h
+++ b/arch/x86/kvm/vmx/tdx.h
@@ -67,7 +67,7 @@ struct vcpu_tdx {
 	u64 vp_enter_ret;
 
 	enum vcpu_tdx_state state;
-	bool guest_entered;
+	bool need_user_return_msr_sync;
 
 	u64 map_gpa_next;
 	u64 map_gpa_end;
-- 
2.51.0.858.gf9c4a03a3a-goog
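
P.S. For anyone not steeped in the user-return MSR machinery, below is a
minimal, self-contained sketch of the host/curr caching pattern at play.
This is NOT the real KVM code (see kvm_user_return_msrs and
kvm_on_user_return() in arch/x86/kvm/x86.c); the example_* names are
invented purely for illustration.  The key point is that the restore
path elides the WRMSR when "curr" matches "host", so consuming a stale
"curr" from IRQ context can skip a restore that is actually needed:

/* Illustrative sketch only; mirrors the pattern, not the real code. */
struct example_uret_msr_values {
	u64 host;	/* value the host expects on return to userspace */
	u64 curr;	/* KVM's cache of the value currently in hardware */
};

static struct example_uret_msr_values example_values;

/* Resync the cache after VP.ENTER clobbers the MSR in hardware. */
static void example_update_cache(u64 hw_value)
{
	example_values.curr = hw_value;
}

/* Restore path, reachable from IRQ context on a forced shutdown. */
static void example_on_user_return(u32 msr)
{
	/*
	 * If "curr" is stale, i.e. hardware was clobbered by the
	 * TDX-Module but the cache wasn't updated yet, this compare can
	 * wrongly conclude that hardware already holds the host value
	 * and elide the WRMSR, leaving the clobbered value live.
	 */
	if (example_values.curr != example_values.host) {
		wrmsrl(msr, example_values.host);
		example_values.curr = example_values.host;
	}
}

Synchronizing "curr" immediately after VP.ENTER, before IRQs are
enabled, closes that window.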