Under CONFIG_PER_VMA_LOCK, mfill_atomic() holds only a per-VMA read
lock across its page-by-page copy loop. A concurrent UFFDIO_UNREGISTER
can acquire mmap_write_lock() and split the VMA mid-loop via
__split_vma(), which calls vma_start_write() via __vma_enter_locked().

The split happens in the race window between CHECK 1 and vm_refcnt++
in vma_start_read(). During this window vm_refcnt equals the base
attached value, so vma_start_write() sees no readers and proceeds
immediately without waiting, shrinking vma->vm_end in place. Both
seqnum checks in vma_start_read() miss this because after
mmap_write_unlock(), mm_lock_seq has been incremented past vm_lock_seq,
making them unequal, so a split VMA is returned to mfill_atomic().

On the next iteration, mfill_atomic_install_pte() calls
folio_add_new_anon_rmap() with state.dst_addr >= vma->vm_end,
triggering its sanity check:

  address < vma->vm_start || address + (nr << 12) > vma->vm_end

  WARNING: mm/rmap.c:1682 folio_add_new_anon_rmap+0x5fe/0x14b0

Fix this by checking on each loop iteration whether state.dst_addr has
fallen outside state.vma. If so, release the stale vma, update
dst_start and len to reflect the current position, and re-look up the
vma via mfill_get_vma().

Reported-by: syzbot+e24a2e34fad0efbac047@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e24a2e34fad0efbac047
Tested-by: syzbot+e24a2e34fad0efbac047@syzkaller.appspotmail.com
Signed-off-by: Deepanshu Kartikey
---
 mm/userfaultfd.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 9ffc80d0a51b..ab73c2106c38 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -910,6 +910,22 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
 	while (state.src_addr < src_start + len) {
 		VM_WARN_ON_ONCE(state.dst_addr >= dst_start + len);
 
+		/*
+		 * Under CONFIG_PER_VMA_LOCK, a concurrent UFFDIO_UNREGISTER can
+		 * split state.vma while we hold only the per-VMA read lock. The
+		 * split shrinks vma->vm_end in place, causing dst_addr to fall
+		 * outside the VMA bounds. Re-validate dst_addr on each iteration
+		 * and re-lookup the vma if it has been split.
+		 */
+		if (state.dst_addr < state.vma->vm_start ||
+		    state.dst_addr >= state.vma->vm_end) {
+			mfill_put_vma(&state);
+			state.dst_start = state.dst_addr;
+			state.len = dst_start + len - state.dst_addr;
+			err = mfill_get_vma(&state);
+			if (err)
+				break;
+		}
 		err = mfill_get_pmd(&state);
 		if (err)
-- 
2.43.0