TDX Module updates require careful synchronization with other TDX
operations on the host. During updates, only update-related SEAMCALLs are
permitted; all other SEAMCALLs must be blocked.

However, SEAMCALLs can be invoked from different contexts (normal and IRQ
context) and run in parallel across CPUs. In addition, all TD vCPUs must
remain out of guest mode during updates. No single lock primitive can
satisfy all these synchronization requirements, so stop_machine() is used
as the only well-understood mechanism that can meet them all.

The TDX Module update process consists of several steps, as described in
Intel® Trust Domain Extensions (Intel® TDX) Module Base Architecture
Specification, Revision 348549-007, Chapter 4.5 "TD-Preserving TDX Module
Update":

  - shut down the old module
  - install the new module
  - global and per-CPU initialization
  - restore state information

Some steps must execute on a single CPU, others must run serially across
all CPUs, and some can run concurrently on all CPUs. There are also
ordering requirements between steps, so all CPUs must work in a
step-locked manner.

In summary, TDX Module updates create two requirements:

1. The entire update process must use stop_machine() to synchronize with
   other TDX workloads
2. Update steps must be performed in a step-locked manner

To prepare for implementing concrete TDX Module update steps, establish
the framework by mimicking multi_cpu_stop(), which is a good example of
performing a multi-step task in a step-locked manner. Specifically, use a
global state machine to control each CPU's work and require all CPUs to
acknowledge completion before proceeding to the next step.

Potential alternative to stop_machine()
=======================================
An alternative approach is to lock all KVM entry points and kick all
vCPUs. Here, KVM entry points refer to KVM VM/vCPU ioctl entry points,
implemented in KVM common code (virt/kvm). Adding a locking mechanism
there would affect all architectures KVM supports.
And to lock only TDX vCPUs, new logic would be needed to identify TDX
vCPUs, which the KVM common code currently lacks. This would add
significant complexity and maintenance overhead to KVM for this
TDX-specific use case.

Signed-off-by: Chao Gao
Reviewed-by: Xu Yilun
Reviewed-by: Tony Lindgren
---
v2:
 - refine the changelog to follow context-problem-solution structure
 - move alternative discussions to the end of the changelog
 - add a comment about state machine transition
 - Move rcu_momentary_eqs() call to the else branch.
---
 arch/x86/virt/vmx/tdx/seamldr.c | 70 ++++++++++++++++++++++++++++++++-
 1 file changed, 69 insertions(+), 1 deletion(-)

diff --git a/arch/x86/virt/vmx/tdx/seamldr.c b/arch/x86/virt/vmx/tdx/seamldr.c
index 718cb8396057..21d572d75769 100644
--- a/arch/x86/virt/vmx/tdx/seamldr.c
+++ b/arch/x86/virt/vmx/tdx/seamldr.c
@@ -10,8 +10,10 @@
 #include
 #include
 #include
+#include
 #include
 #include
+#include
 #include
@@ -186,6 +188,68 @@ static struct seamldr_params *init_seamldr_params(const u8 *data, u32 size)
 	return alloc_seamldr_params(module, module_size, sig, sig_size);
 }
 
+/*
+ * During a TDX Module update, all CPUs start from TDP_START and progress
+ * to TDP_DONE. Each state is associated with certain work. For some
+ * states, just one CPU needs to perform the work, while other CPUs just
+ * wait during those states.
+ */
+enum tdp_state {
+	TDP_START,
+	TDP_DONE,
+};
+
+static struct {
+	enum tdp_state state;
+	atomic_t thread_ack;
+} tdp_data;
+
+static void set_target_state(enum tdp_state state)
+{
+	/* Reset ack counter. */
+	atomic_set(&tdp_data.thread_ack, num_online_cpus());
+	/* Ensure thread_ack is updated before the new state */
+	smp_wmb();
+	WRITE_ONCE(tdp_data.state, state);
+}
+
+/* Last one to ack a state moves to the next state. */
+static void ack_state(void)
+{
+	if (atomic_dec_and_test(&tdp_data.thread_ack))
+		set_target_state(tdp_data.state + 1);
+}
+
+/*
+ * See multi_cpu_stop() from where this multi-cpu state-machine was
+ * adopted, and the rationale for touch_nmi_watchdog()
+ */
+static int do_seamldr_install_module(void *params)
+{
+	enum tdp_state newstate, curstate = TDP_START;
+	int ret = 0;
+
+	do {
+		/* Chill out and re-read tdp_data */
+		cpu_relax();
+		newstate = READ_ONCE(tdp_data.state);
+
+		if (newstate != curstate) {
+			curstate = newstate;
+			switch (curstate) {
+			default:
+				break;
+			}
+			ack_state();
+		} else {
+			touch_nmi_watchdog();
+			rcu_momentary_eqs();
+		}
+	} while (curstate != TDP_DONE);
+
+	return ret;
+}
+
 DEFINE_FREE(free_seamldr_params, struct seamldr_params *,
 	    if (!IS_ERR_OR_NULL(_T)) free_seamldr_params(_T))
 
@@ -223,7 +287,11 @@ int seamldr_install_module(const u8 *data, u32 size)
 		return -EBUSY;
 	}
 
-	/* TODO: Update TDX Module here */
+	set_target_state(TDP_START + 1);
+	ret = stop_machine_cpuslocked(do_seamldr_install_module, params, cpu_online_mask);
+	if (ret)
+		return ret;
+
 	return 0;
 }
 EXPORT_SYMBOL_FOR_MODULES(seamldr_install_module, "tdx-host");
-- 
2.47.3
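
For readers who want to experiment with the step-locked scheme outside the
kernel, here is a minimal userspace sketch. It is not part of this patch.
It assumes POSIX threads stand in for CPUs held by stop_machine() and C11
atomics stand in for atomic_t, smp_wmb() and READ_ONCE()/WRITE_ONCE().
The STEP_* states, work_done[], worker() and run_step_locked() are
hypothetical names invented for illustration; only set_target_state() and
ack_state() mirror helpers from the diff above.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdlib.h>

#define MAX_THREADS 16

/* Hypothetical states; the patch itself defines only TDP_START and TDP_DONE. */
enum step { STEP_START, STEP_ONE, STEP_TWO, STEP_DONE };

static struct {
	atomic_int state;	/* target state every thread must reach */
	atomic_int ack;		/* threads that still have to ack that state */
	int nthreads;
} ctl;

/* Number of threads that completed the work of each state. */
static atomic_int work_done[STEP_DONE + 1];

static void set_target_state(int state)
{
	/* Reset the ack counter, then publish the new state with release order. */
	atomic_store_explicit(&ctl.ack, ctl.nthreads, memory_order_relaxed);
	atomic_store_explicit(&ctl.state, state, memory_order_release);
}

/* The last thread to ack a state moves everyone to the next state. */
static void ack_state(void)
{
	if (atomic_fetch_sub(&ctl.ack, 1) == 1)
		set_target_state(atomic_load(&ctl.state) + 1);
}

static void *worker(void *unused)
{
	int newstate, curstate = STEP_START;

	(void)unused;
	do {
		newstate = atomic_load_explicit(&ctl.state, memory_order_acquire);
		if (newstate != curstate) {
			curstate = newstate;
			/* Step-lock invariant: every thread finished the previous step. */
			assert(curstate == STEP_ONE ||
			       atomic_load(&work_done[curstate - 1]) == ctl.nthreads);
			atomic_fetch_add(&work_done[curstate], 1); /* this step's "work" */
			ack_state();
		} else {
			sched_yield();	/* userspace stand-in for cpu_relax() */
		}
	} while (curstate != STEP_DONE);

	return NULL;
}

/* Returns 0 if every thread performed every step, -1 otherwise. */
static int run_step_locked(int nthreads)
{
	pthread_t tid[MAX_THREADS];
	int i, s;

	if (nthreads < 1 || nthreads > MAX_THREADS)
		return -1;

	ctl.nthreads = nthreads;
	for (s = 0; s <= STEP_DONE; s++)
		atomic_store(&work_done[s], 0);
	atomic_store(&ctl.state, STEP_START);
	set_target_state(STEP_START + 1);	/* kick off the first step */

	for (i = 0; i < nthreads; i++)
		if (pthread_create(&tid[i], NULL, worker, NULL))
			abort();	/* a missing thread would stall all others */
	for (i = 0; i < nthreads; i++)
		pthread_join(tid[i], NULL);

	for (s = STEP_ONE; s <= STEP_DONE; s++)
		if (atomic_load(&work_done[s]) != nthreads)
			return -1;
	return 0;
}
```

Calling run_step_locked(4) from a main() function (built with -pthread)
is expected to return 0. The assert() in worker() checks the step-locked
property: no thread begins a step until all threads have completed the
previous one.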