When __blk_mq_update_nr_hw_queues() increases nr_hw_queues, it first
pre-allocates scheduler resources via blk_mq_alloc_sched_res_batch()
with the new (larger) queue count, then removes the elevator via
blk_mq_elv_switch_none(), and finally expands set->tags via
blk_mq_prealloc_tag_set_tags().

If blk_mq_prealloc_tag_set_tags() fails, the code jumps to switch_back,
which re-attaches the elevator using the pre-allocated resources
(ctx->res.et), whose nr_hw_queues is the new, larger value. But
set->tags was never expanded, so when the elevator is later freed,
blk_mq_free_sched_tags() iterates over et->nr_hw_queues entries and
blk_mq_free_rqs() accesses set->tags[hctx_idx] beyond its allocated
size, causing a slab-out-of-bounds access.

Fix this by moving blk_mq_prealloc_tag_set_tags() before
blk_mq_elv_switch_none(). If the tags allocation fails, the elevator
has not been removed yet, so we simply free the pre-allocated scheduler
resources and exit cleanly without entering the switch_back path.

Note that blk_mq_elv_switch_none() practically cannot fail today (it
only fails via WARN_ON_ONCE() when xa_load() returns NULL), so
blk_mq_prealloc_tag_set_tags() is the only realistic failure point that
could reach switch_back. Moving it before the elevator removal
eliminates the problematic path entirely.

Cc: Jiayuan Chen
Signed-off-by: Jiayuan Chen
---
 block/blk-mq.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 3da2215b2912..2183142b4568 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -5141,6 +5141,18 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
 	if (blk_mq_alloc_sched_res_batch(&elv_tbl, set, nr_hw_queues) < 0)
 		goto out_free_ctx;
 
+	/*
+	 * Pre-allocate the new tags array before removing the elevator,
+	 * so that if allocation fails we can exit cleanly without having
+	 * modified the elevator state.
+	 */
+	new_tags = blk_mq_prealloc_tag_set_tags(set, nr_hw_queues);
+	if (IS_ERR(new_tags)) {
+		new_tags = NULL;
+		blk_mq_free_sched_res_batch(&elv_tbl, set);
+		goto out_free_ctx;
+	}
+
 	list_for_each_entry(q, &set->tag_list, tag_set_list) {
 		blk_mq_debugfs_unregister_hctxs(q);
 		blk_mq_sysfs_unregister_hctxs(q);
@@ -5155,10 +5167,6 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
 		if (blk_mq_elv_switch_none(q, &elv_tbl))
 			goto switch_back;
 
-	new_tags = blk_mq_prealloc_tag_set_tags(set, nr_hw_queues);
-	if (IS_ERR(new_tags))
-		goto switch_back;
-
 	list_for_each_entry(q, &set->tag_list, tag_set_list)
 		blk_mq_freeze_queue_nomemsave(q);
 	queues_frozen = true;
-- 
2.43.0