When iomap_iter() finishes its iteration (returns <= 0), it is no longer necessary to memset the entire iomap and srcmap structures. In high-IOPS scenarios (like 4k randread NVMe polling with io_uring), where the majority of I/Os complete in a single extent map, this wasted memory write bandwidth, as the caller will just discard the iterator. Use this command to test: taskset -c 30 ./t/io_uring -p1 -d512 -b4096 -s32 -c32 -F1 -B1 -R1 -X1 -n1 -P1 /mnt/testfile IOPS improve about 5% on ext4 and XFS. However, we MUST still call iomap_iter_reset_iomap() to release the folio_batch if IOMAP_F_FOLIO_BATCH is set, otherwise we leak page references. Therefore, split the cleanup logic: always release the folio_batch, but skip the memset() when ret <= 0. Signed-off-by: Fengnan Chang --- fs/iomap/iter.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/fs/iomap/iter.c b/fs/iomap/iter.c index c04796f6e57f..e4a29829591a 100644 --- a/fs/iomap/iter.c +++ b/fs/iomap/iter.c @@ -6,17 +6,13 @@ #include #include "trace.h" -static inline void iomap_iter_reset_iomap(struct iomap_iter *iter) +static inline void iomap_iter_clean_fbatch(struct iomap_iter *iter) { if (iter->iomap.flags & IOMAP_F_FOLIO_BATCH) { folio_batch_release(iter->fbatch); folio_batch_reinit(iter->fbatch); iter->iomap.flags &= ~IOMAP_F_FOLIO_BATCH; } - - iter->status = 0; - memset(&iter->iomap, 0, sizeof(iter->iomap)); - memset(&iter->srcmap, 0, sizeof(iter->srcmap)); } /* Advance the current iterator position and decrement the remaining length */ @@ -102,10 +98,14 @@ int iomap_iter(struct iomap_iter *iter, const struct iomap_ops *ops) ret = 0; else ret = 1; - iomap_iter_reset_iomap(iter); + iomap_iter_clean_fbatch(iter); + iter->status = 0; if (ret <= 0) return ret; + memset(&iter->iomap, 0, sizeof(iter->iomap)); + memset(&iter->srcmap, 0, sizeof(iter->srcmap)); + begin: ret = ops->iomap_begin(iter->inode, iter->pos, iter->len, iter->flags, &iter->iomap, &iter->srcmap); -- 2.39.5 (Apple Git-154)