During system suspend, wakeup-capable IRQs for block devices can be
delayed, which can cause blk_mq_hctx_notify_offline() to hang
indefinitely while waiting for pending requests to complete. Skip the
request waiting loop and abort suspend when wakeup events are pending
to prevent the deadlock.

Fixes: bf0beec0607d ("blk-mq: drain I/O when all CPUs in a hctx are offline")
Signed-off-by: Cong Zhang
---
The issue was found during system suspend with a no_soft_reset
virtio-blk device. Here is the detailed analysis:
- When system suspend starts and no_soft_reset is enabled, virtio-blk
  does not call its suspend callback.
- Some requests are dispatched and queued. After sending the virtqueue
  notifier, the kernel waits for an IRQ to complete the request.
- The virtio-blk IRQ is wakeup-capable. When the IRQ is triggered, it
  remains pending because the device is in the suspend process.
- While polling blk_mq_hctx_has_requests(), the kernel finds that there
  are still pending requests.
- Since there is no way to complete these requests, the kernel gets
  stuck in the CPU hotplug thread.

We believe this could be a common issue. If the kernel enters the
blk_mq_hctx_has_requests() loop during suspend, wakeup-capable IRQs
cannot be processed, which can lead to a deadlock in this scenario.

This also improves the latency for wakeup-capable IRQs. If a non-block
wakeup IRQ is pending, suspend is going to be aborted anyway after this
step. Returning early avoids an unnecessary delay and improves suspend
latency.
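
The scenario above can be illustrated with a small user-space model.
This is only a sketch, not kernel code: struct fake_hctx, struct
fake_pm, drain_hctx() and the max_polls bound are hypothetical names
introduced for illustration. The wakeup_pending field stands in for
pm_wakeup_pending() and the inactive field for BLK_MQ_S_INACTIVE.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; not the real blk-mq API. */
struct fake_hctx {
	int inflight;		/* requests still waiting for a completion IRQ */
	bool inactive;		/* models the BLK_MQ_S_INACTIVE state bit */
};

struct fake_pm {
	bool wakeup_pending;	/* models pm_wakeup_pending() returning true */
	bool irq_delivery;	/* false while suspend holds back the wakeup IRQ */
};

/*
 * Model of the drain loop with the early-abort check.
 * Returns 0 when all requests completed, -EBUSY when a wakeup event
 * aborts the wait, and -ETIMEDOUT when max_polls is exhausted.
 * The max_polls bound exists only in this simulation; the kernel loop
 * has no such bound, which is why it can hang.
 */
static int drain_hctx(struct fake_hctx *hctx, const struct fake_pm *pm,
		      int max_polls)
{
	hctx->inactive = true;

	for (int i = 0; hctx->inflight > 0; i++) {
		if (pm->wakeup_pending) {
			/* Undo the inactive marking and abort the offline step. */
			hctx->inactive = false;
			return -EBUSY;
		}
		if (i >= max_polls)
			return -ETIMEDOUT;
		if (pm->irq_delivery)
			hctx->inflight--;	/* one completion per poll */
	}
	return 0;
}
```

With irq_delivery set to false and wakeup_pending set to false, the
model can only leave the loop through the artificial max_polls bound,
which corresponds to the hang described above. Setting wakeup_pending
makes it return -EBUSY on the first poll.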
---
 block/blk-mq.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index d626d32f6e576f95bc68495c467a9d9c7b73a581..0cf83c2d406609181d430df163cdf2e6ef4f7c18 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -23,6 +23,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -3707,6 +3708,7 @@ static int blk_mq_hctx_notify_offline(unsigned int cpu, struct hlist_node *node)
 {
 	struct blk_mq_hw_ctx *hctx = hlist_entry_safe(node,
 			struct blk_mq_hw_ctx, cpuhp_online);
+	int ret = 0;
 
 	if (blk_mq_hctx_has_online_cpu(hctx, cpu))
 		return 0;
@@ -3727,12 +3729,18 @@ static int blk_mq_hctx_notify_offline(unsigned int cpu, struct hlist_node *node)
 	 * frozen and there are no requests.
 	 */
 	if (percpu_ref_tryget(&hctx->queue->q_usage_counter)) {
-		while (blk_mq_hctx_has_requests(hctx))
+		while (blk_mq_hctx_has_requests(hctx)) {
+			if (pm_wakeup_pending()) {
+				clear_bit(BLK_MQ_S_INACTIVE, &hctx->state);
+				ret = -EBUSY;
+				break;
+			}
 			msleep(5);
+		}
 		percpu_ref_put(&hctx->queue->q_usage_counter);
 	}
 
-	return 0;
+	return ret;
 }
 
 /*

---
base-commit: e538109ac71d801d26776af5f3c54f548296c29c
change-id: 20251128-blkmq_skip_waiting-732dab95acdb

Best regards,
-- 
Cong Zhang