From: "Kirill A. Shutemov"

Call tdx_free_page() and tdx_pamt_put() on the paths that free TDX
pages.

The PAMT memory holds metadata for TDX-protected memory. With Dynamic
PAMT, PAMT_4K is allocated on demand. The kernel supplies the TDX module
with a few pages that cover 2M of host physical memory.

PAMT memory can be reclaimed when the last user is gone. It can happen
in a few code paths:

- On TDH.PHYMEM.PAGE.RECLAIM in tdx_reclaim_td_control_pages() and
  tdx_reclaim_page().

- On TDH.MEM.PAGE.REMOVE in tdx_sept_drop_private_spte().

- In tdx_sept_zap_private_spte() for pages that were in the queue to be
  added with TDH.MEM.PAGE.ADD, but it never happened due to an error.

- In tdx_sept_free_private_spt() for SEPT pages.

Add tdx_pamt_put() for memory that comes from guest_memfd and use
tdx_free_page() for the rest.

Signed-off-by: Kirill A. Shutemov
[Minor log tweak]
Signed-off-by: Rick Edgecombe
---
v4:
 - Rebasing on the post-populate series required some changes to how
   PAMT refcounting is handled in the KVM_TDX_INIT_MEM_REGION path. Now,
   instead of incrementing the DPAMT refcount on the fake add in the
   fault path, it is only incremented when tdh_mem_page_add() actually
   succeeds, like in tdx_mem_page_aug(). Because of this, the special
   handling for the case tdx_is_sept_zap_err_due_to_premap() cared about
   is unneeded.
v3:
 - Minor log tweak to conform to kvm/x86 style.
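Conceptually, "reclaimed when the last user is gone" is per-2M-region
refcounting: the first user of a 2M region triggers allocation of the
PAMT_4K backing pages, and the last put releases them. A standalone toy
sketch of that rule (hypothetical pamt_get()/pamt_put() names and a fake
allocation counter, not the kernel implementation):

```c
#include <stddef.h>

/*
 * Toy model of Dynamic PAMT refcounting (hypothetical helpers, not the
 * kernel's tdx_pamt_put()).  One refcount per 2M region of host
 * physical memory; PAMT_4K backing is "allocated" when the count goes
 * 0 -> 1 and "freed" on the 1 -> 0 transition.
 */
#define PAMT_REGION_SHIFT	21	/* 2M granularity */
#define NR_REGIONS		16

static int pamt_refcount[NR_REGIONS];
static int pamt_pages_allocated;	/* stand-in for real page allocation */

static int pamt_get(unsigned long long hpa)
{
	unsigned long long idx = hpa >> PAMT_REGION_SHIFT;

	if (idx >= NR_REGIONS)
		return -1;
	/* First user of this 2M region: allocate PAMT_4K backing. */
	if (pamt_refcount[idx]++ == 0)
		pamt_pages_allocated++;
	return 0;
}

static void pamt_put(unsigned long long hpa)
{
	unsigned long long idx = hpa >> PAMT_REGION_SHIFT;

	if (idx >= NR_REGIONS || pamt_refcount[idx] == 0)
		return;
	/* Last user gone: PAMT memory for the region can be reclaimed. */
	if (--pamt_refcount[idx] == 0)
		pamt_pages_allocated--;
}
```

Two 4K pages in the same 2M region share one backing allocation; only
the second pamt_put() for that region releases it, which is why each
freeing path in the patch below must drop its reference.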
---
 arch/x86/kvm/vmx/tdx.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 24322263ac27..f8de50e7dc7f 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -360,7 +360,7 @@ static void tdx_reclaim_control_page(struct page *ctrl_page)
 	if (tdx_reclaim_page(ctrl_page))
 		return;
 
-	__free_page(ctrl_page);
+	tdx_free_page(ctrl_page);
 }
 
 struct tdx_flush_vp_arg {
@@ -597,7 +597,7 @@ static void tdx_reclaim_td_control_pages(struct kvm *kvm)
 
 		tdx_quirk_reset_page(kvm_tdx->td.tdr_page);
 
-		__free_page(kvm_tdx->td.tdr_page);
+		tdx_free_page(kvm_tdx->td.tdr_page);
 		kvm_tdx->td.tdr_page = NULL;
 	}
@@ -1827,6 +1827,8 @@ static int tdx_sept_free_private_spt(struct kvm *kvm, gfn_t gfn,
 				       enum pg_level level, void *private_spt)
 {
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
+	struct page *page = virt_to_page(private_spt);
+	int ret;
 
 	/*
 	 * free_external_spt() is only called after hkid is freed when TD is
@@ -1843,7 +1845,12 @@ static int tdx_sept_free_private_spt(struct kvm *kvm, gfn_t gfn,
 	 * The HKID assigned to this TD was already freed and cache was
 	 * already flushed. We don't have to flush again.
 	 */
-	return tdx_reclaim_page(virt_to_page(private_spt));
+	ret = tdx_reclaim_page(virt_to_page(private_spt));
+	if (ret)
+		return ret;
+
+	tdx_pamt_put(page);
+	return 0;
 }
 
 static void tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
@@ -1895,6 +1902,7 @@ static void tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
 		return;
 
 	tdx_quirk_reset_page(page);
+	tdx_pamt_put(page);
 }
 
 void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode,
-- 
2.51.2