From: Thomas Huth When you run a KVM guest with vhost-net and migrate that guest to another host, and you immediately enable postcopy after starting the migration, there is a big chance that the network connection of the guest won't work anymore on the destination side after the migration. With a debug kernel v6.16.0, there is also a call trace that looks like this: FAULT_FLAG_ALLOW_RETRY missing 881 CPU: 6 UID: 0 PID: 549 Comm: kworker/6:2 Kdump: loaded Not tainted 6.16.0 #56 NONE Hardware name: IBM 3931 LA1 400 (LPAR) Workqueue: events irqfd_inject [kvm] Call Trace: [<00003173cbecc634>] dump_stack_lvl+0x104/0x168 [<00003173cca69588>] handle_userfault+0xde8/0x1310 [<00003173cc756f0c>] handle_pte_fault+0x4fc/0x760 [<00003173cc759212>] __handle_mm_fault+0x452/0xa00 [<00003173cc7599ba>] handle_mm_fault+0x1fa/0x6a0 [<00003173cc73409a>] __get_user_pages+0x4aa/0xba0 [<00003173cc7349e8>] get_user_pages_remote+0x258/0x770 [<000031734be6f052>] get_map_page+0xe2/0x190 [kvm] [<000031734be6f910>] adapter_indicators_set+0x50/0x4a0 [kvm] [<000031734be7f674>] set_adapter_int+0xc4/0x170 [kvm] [<000031734be2f268>] kvm_set_irq+0x228/0x3f0 [kvm] [<000031734be27000>] irqfd_inject+0xd0/0x150 [kvm] [<00003173cc00c9ec>] process_one_work+0x87c/0x1490 [<00003173cc00dda6>] worker_thread+0x7a6/0x1010 [<00003173cc02dc36>] kthread+0x3b6/0x710 [<00003173cbed2f0c>] __ret_from_fork+0xdc/0x7f0 [<00003173cdd737ca>] ret_from_fork+0xa/0x30 3 locks held by kworker/6:2/549: #0: 00000000800bc958 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x7ee/0x1490 #1: 000030f3d527fbd0 ((work_completion)(&irqfd->inject)){+.+.}-{0:0}, at: process_one_work+0x81c/0x1490 #2: 00000000f99862b0 (&mm->mmap_lock){++++}-{3:3}, at: get_map_page+0xa8/0x190 [kvm] The "FAULT_FLAG_ALLOW_RETRY missing" indicates that handle_userfaultfd() saw a page fault request without ALLOW_RETRY flag set, hence userfaultfd cannot remotely resolve it (because the caller was asking for an immediate resolution, aka, FAULT_FLAG_NOWAIT, while remote faults can take time). With that, get_map_page() failed and the irq was lost. We should not be strictly in an atomic environment here and the worker should be sleepable (the call is done during an ioctl from userspace), so we can allow adapter_indicators_set() to just sleep waiting for the remote fault instead. Link: https://issues.redhat.com/browse/RHEL-42486 Signed-off-by: Peter Xu [thuth: Assembled patch description and fixed some cosmetical issues] Signed-off-by: Thomas Huth --- Note: Instructions for reproducing the bug can be found in the ticket here: https://issues.redhat.com/browse/RHEL-42486?focusedId=26661116#comment-26661116 arch/s390/kvm/interrupt.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c index 60c360c18690f..dcce826ae9875 100644 --- a/arch/s390/kvm/interrupt.c +++ b/arch/s390/kvm/interrupt.c @@ -2777,12 +2777,19 @@ static unsigned long get_ind_bit(__u64 addr, unsigned long bit_nr, bool swap) static struct page *get_map_page(struct kvm *kvm, u64 uaddr) { + struct mm_struct *mm = kvm->mm; struct page *page = NULL; + int locked = 1; + + if (mmget_not_zero(mm)) { + mmap_read_lock(mm); + get_user_pages_remote(mm, uaddr, 1, FOLL_WRITE, + &page, &locked); + if (locked) + mmap_read_unlock(mm); + mmput(mm); + } - mmap_read_lock(kvm->mm); - get_user_pages_remote(kvm->mm, uaddr, 1, FOLL_WRITE, - &page, NULL); - mmap_read_unlock(kvm->mm); return page; } -- 2.50.1