When iomap_iter() finishes its iteration (returns <= 0), it is no longer necessary to memset the entire iomap and srcmap structures. In high-IOPS scenarios (like 4k randread NVMe polling with io_uring), where the majority of I/Os complete in a single extent map, this wasted memory write bandwidth, as the caller will just discard the iterator. Use this command to test: taskset -c 30 ./t/io_uring -p1 -d512 -b4096 -s32 -c32 -F1 -B1 -R1 -X1 -n1 -P1 /mnt/testfile IOPS improve about 5% on ext4 and XFS. However, we MUST still call iomap_iter_reset_iomap() to release the folio_batch if IOMAP_F_FOLIO_BATCH is set, otherwise we leak page references. Therefore, split the cleanup logic: always release the folio_batch, but skip the memset() when ret <= 0. Signed-off-by: Fengnan Chang --- fs/iomap/iter.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/fs/iomap/iter.c b/fs/iomap/iter.c index c04796f6e57f..91eb5e6165ff 100644 --- a/fs/iomap/iter.c +++ b/fs/iomap/iter.c @@ -15,8 +15,6 @@ static inline void iomap_iter_reset_iomap(struct iomap_iter *iter) } iter->status = 0; - memset(&iter->iomap, 0, sizeof(iter->iomap)); - memset(&iter->srcmap, 0, sizeof(iter->srcmap)); } /* Advance the current iterator position and decrement the remaining length */ @@ -106,6 +104,9 @@ int iomap_iter(struct iomap_iter *iter, const struct iomap_ops *ops) if (ret <= 0) return ret; + memset(&iter->iomap, 0, sizeof(iter->iomap)); + memset(&iter->srcmap, 0, sizeof(iter->srcmap)); + begin: ret = ops->iomap_begin(iter->inode, iter->pos, iter->len, iter->flags, &iter->iomap, &iter->srcmap); -- 2.39.5 (Apple Git-154)