kvm_xen_set_evtchn_fast() calls read_lock_irqsave(), which might block
on PREEMPT_RT. Blocking is invalid in hard IRQ context, and that is
where xen_timer_callback() calls it: the timer uses
HRTIMER_MODE_ABS_HARD, so the callback runs in hard IRQ context even
on PREEMPT_RT.
Check for that case, and bail out early with -EWOULDBLOCK.
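For illustration only, the control flow of the early bail-out can be
modelled in userspace as below. This is a sketch, not kernel code: the
function name and its two boolean parameters are hypothetical stand-ins
for IS_ENABLED(CONFIG_PREEMPT_RT) and in_hardirq(), and the locked
delivery path is reduced to a single assignment.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Userspace model of the guard (hypothetical name and parameters).
 *   preempt_rt  models IS_ENABLED(CONFIG_PREEMPT_RT)
 *   in_hardirq  models in_hardirq()
 * Returns -EWOULDBLOCK when the fast path must not be attempted,
 * 0 when it may proceed to take the lock.
 */
int set_evtchn_fast_model(bool preempt_rt, bool in_hardirq)
{
	int rc = -EWOULDBLOCK;

	/* On PREEMPT_RT the read lock may sleep, so never try it in hard IRQ. */
	if (preempt_rt && in_hardirq)
		goto out;

	/* Placeholder for the locked delivery path of the real function. */
	rc = 0;
out:
	return rc;
}
```

Only the combination of PREEMPT_RT and hard IRQ context takes the early
exit in this model; every other combination proceeds as before.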
Note: there is previous work and discussion on this [1] (~2 years ago).
That approach kept executing the function, with changes, instead of
bailing out; it was more complex and was not merged.
[1] https://lore.kernel.org/lkml/ZdPQVP7eejq3eFjc@google.com/
The bug is hit quickly while booting a Xen guest on a KVM host with Xen
support (CONFIG_KVM_XEN).
With this patch, the guest boots without the warning and runs timer
stress without issues
(e.g., stress-ng --quiet --timer 1 --timer-freq 19000 --timer-slack 0).
Tested with and without CONFIG_PREEMPT_RT.
Test case:
=========
Configure a host kernel (CONFIG_KVM_XEN) as follows:
$ make x86_64_defconfig
$ ./scripts/config \
-e EXPERT -e PREEMPT_RT -e DEBUG_ATOMIC_SLEEP \
-e KVM -e KVM_INTEL -e KVM_AMD -e KVM_XEN
$ make olddefconfig
and boot a Xen guest kernel (CONFIG_XEN) with:
# qemu-system-x86_64 \
-accel kvm,xen-version=0x40011,kernel-irqchip=split \
-cpu host,+xen-vapic -smp 1 -m 1024 \
-nodefaults -nographic -serial stdio \
-kernel arch/x86/boot/bzImage -append 'console=ttyS0'
The host dmesg then shows:
[ 27.643129] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:231
[ 27.643134] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 284, name: qemu-system-x86
[ 27.643137] preempt_count: 10000, expected: 0
[ 27.643138] RCU nest depth: 0, expected: 0
[ 27.643146] CPU: 1 UID: 0 PID: 284 Comm: qemu-system-x86 Not tainted 7.1.0-rc2 #5 PREEMPT_{RT,(lazy)}
[ 27.643150] Hardware name: QEMU Ubuntu 25.10 PC v2 (i440FX + PIIX, + 10.1 machine, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 27.643152] Call Trace:
[ 27.643155]
[ 27.643157] dump_stack_lvl+0x64/0x80
[ 27.643165] __might_resched+0x131/0x180
[ 27.643171] rt_read_lock+0x47/0x210
[ 27.643176] kvm_xen_set_evtchn_fast+0xa5/0x3f0
[ 27.643184] xen_timer_callback+0x88/0xc0
[ 27.643188] __hrtimer_run_queues+0x10b/0x280
[ 27.643193] hrtimer_interrupt+0xf6/0x1b0
[ 27.643196] __sysvec_apic_timer_interrupt+0x55/0x130
[ 27.643200] sysvec_apic_timer_interrupt+0x39/0x80
[ 27.643204] asm_sysvec_apic_timer_interrupt+0x1a/0x20
[ 27.643208] RIP: 0033:0x7f069721a8db
...
[ 27.643226]
Reported-by: syzbot+208f7f3e5f59c11aeb90@syzkaller.appspotmail.com
Closes: https://syzbot.org/bug?extid=208f7f3e5f59c11aeb90
Signed-off-by: Mauricio Faria de Oliveira
---
arch/x86/kvm/xen.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 91fd3673c09a2ef3dc154050e01df608182e59e5..76782191043b56c581f89c3861979236662cdbd7 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -1814,6 +1814,10 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 	rc = -EWOULDBLOCK;
+	/* Bail in IRQ context on PREEMPT_RT; read_lock_irqsave() might block */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT) && in_hardirq())
+		goto out;
+
 	idx = srcu_read_lock(&kvm->srcu);
 	read_lock_irqsave(&gpc->lock, flags);
@@ -1892,6 +1896,7 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 		kvm_vcpu_kick(vcpu);
 	}
+ out:
 	return rc;
 }
---
base-commit: 7fd2df204f342fc17d1a0bfcd474b24232fb0f32
change-id: 20260506-xen-rt-sleep-e71b92097f19
Best regards,
--
Mauricio Faria de Oliveira