From: Max Boone

A race between page table walking (e.g. via procfs numa_maps) and VFIO
DMA pinning can lead to temporary failures in follow_pfnmap_start().
When a PUD entry is split and concurrently refaulted, the PFNMAP mapping
may be temporarily zapped, causing follow_pfnmap_start() to return an
error.

Although follow_pfnmap_start() returns -EINVAL in this case, the cause
is not invalid parameters but a non-present pfnmap. Treat it as such and
retry by returning -EAGAIN, similar to how GUP handles such races.

This avoids propagating an unexpected -EINVAL to userspace, as follows:

  [dma_map]
  dma_map iova=0x000000000000 size=0x000004000000 vaddr=0x00007f7800000000
  dma_map FAILED iova=0x020000000000: [Errno 22] Invalid argument
  dma_map iova=0x040000000000 size=0x000002000000 vaddr=0x00007f5780000000

The failed mapping would have succeeded on a retry.

Cc: stable@vger.kernel.org
Fixes: a77f9489f1d7 ("vfio: use the new follow_pfnmap API")
Signed-off-by: Max Boone
---
 drivers/vfio/vfio_iommu_type1.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 5167bec14..3a0d0bbb9 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -559,9 +559,17 @@ static int follow_fault_pfn(struct vm_area_struct *vma, struct mm_struct *mm,
 		if (ret)
 			return ret;
 
+		/*
+		 * follow_pfnmap_start() returns -EINVAL for
+		 * invalid parameters and non-present entries.
+		 * If that happens here after a successful
+		 * fixup_user_fault(), it is likely that the
+		 * pfnmap has been zapped. Retry instead of
+		 * failing.
+		 */
 		ret = follow_pfnmap_start(&args);
 		if (ret)
-			return ret;
+			return -EAGAIN;
 	}
 
 	if (write_fault && !args.writable) {

---
base-commit: 96ca4caf9066f5ebd35b561a521af588a8eb0215
change-id: 20260317-retry-pin-on-reclaimed-pud-dfb9e26eb8cf

Best regards,
-- 
Max Boone