From: hexue

In the existing plug mechanism, tags are allocated in batches based on
the number of requests. However, testing has shown that the plug only
attempts batch allocation of tags once, at the beginning of a batch of
I/O operations. Since the tag_mask does not always have enough free tags
to satisfy the requested count, a full batch allocation is not
guaranteed to succeed each time. The remaining tags are then allocated
individually (which happens frequently), incurring the overhead of
multiple single-tag allocations.

This patch allows the remaining I/O operations to retry batch
allocation of tags, reducing the overhead caused by repeated individual
tag allocations.

------------------------------------------------------------------------
Test results

On a PCIe Gen4 SSD (Samsung PM9A3), the perf tool showed a CPU
improvement: the CPU usage of __blk_mq_alloc_requests dropped from
1.39% to 0.82% after the modification. Performance variations were also
observed on different devices.
workload: randread  blocksize: 4k  threads: 1
------------------------------------------------------------------------
                PCIe Gen3 SSD    PCIe Gen4 SSD    PCIe Gen5 SSD
native kernel   553k IOPS        633k IOPS        793k IOPS
modified        553k IOPS        635k IOPS        801k IOPS

With Optane SSDs, performance was as follows:

two devices, one thread
cmd: sudo taskset -c 0 ./t/io_uring -b512 -d128 -c32 -s32 -p1 -F1 -B1 \
     -n1 -r4 /dev/nvme0n1 /dev/nvme1n1
base:  6.4 million IOPS
patch: 6.49 million IOPS

two devices, two threads
cmd: sudo taskset -c 0 ./t/io_uring -b512 -d128 -c32 -s32 -p1 -F1 -B1 \
     -n1 -r4 /dev/nvme0n1 /dev/nvme1n1
base:  7.34 million IOPS
patch: 7.48 million IOPS
------------------------------------------------------------------------

Signed-off-by: hexue
---
 block/blk-mq.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index b67d6c02eceb..1fb280764b76 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -587,9 +587,9 @@ static struct request *blk_mq_rq_cache_fill(struct request_queue *q,
 	if (blk_queue_enter(q, flags))
 		return NULL;
 
-	plug->nr_ios = 1;
-
 	rq = __blk_mq_alloc_requests(&data);
+	plug->nr_ios = data.nr_tags;
+
 	if (unlikely(!rq))
 		blk_queue_exit(q);
 	return rq;
@@ -3034,11 +3034,13 @@ static struct request *blk_mq_get_new_requests(struct request_queue *q,
 
 	if (plug) {
 		data.nr_tags = plug->nr_ios;
-		plug->nr_ios = 1;
 		data.cached_rqs = &plug->cached_rqs;
 	}
 
 	rq = __blk_mq_alloc_requests(&data);
+	if (plug)
+		plug->nr_ios = data.nr_tags;
+
 	if (unlikely(!rq))
 		rq_qos_cleanup(q, bio);
 	return rq;
-- 
2.34.1