From: Chi Zhiling

When reading small amounts of data from the page cache, only a single
folio is typically returned from filemap_get_read_batch(). In this
case, calling xas_advance() or xas_next() after adding the folio to the
batch is unnecessary and only introduces extra branches.

The same issue exists for large reads, where one additional xarray walk
is always performed before termination.

Move the boundary check to after the folio is added to the batch so
that the final, redundant xarray advancement is avoided. This reduces
the branch count in the read path: about 1.2% fewer branches per second
in the fio run below.

xas_next() does not update xa_index when xas->xa_node is set to
XAS_RESTART, so checking the boundary before updating xa_index is
sufficient to keep the folio within range. The warning should therefore
never trigger.

Branch rate: 654.198 M/sec -> 646.444 M/sec

Performance counter stats for 'fio --ioengine=sync --rw=read --bs=4k
--size=1G --runtime=300 --time_based --group_reporting
--name=seq_read_test --filename=file':

before:

   READ: bw=2697MiB/s (2828MB/s), 2697MiB/s-2697MiB/s (2828MB/s-2828MB/s), io=790GiB (848GB), run=300001-300001msec

      245602051556      task-clock           #    0.821 CPUs utilized
             78467      context-switches     #  319.488 /sec
                40      cpu-migrations       #    0.163 /sec
              3388      page-faults          #   13.795 /sec
      758312319204      instructions         #    0.74  insn per cycle
     1025881497502      cycles               #    4.177 GHz
      160672383734      branches             #  654.198 M/sec
         361904512      branch-misses        #    0.23% of all branches

after:

   READ: bw=2709MiB/s (2841MB/s), 2709MiB/s-2709MiB/s (2841MB/s-2841MB/s), io=794GiB (852GB), run=300000-300000msec

      243985503670      task-clock           #    0.812 CPUs utilized
             79004      context-switches     #  323.806 /sec
                30      cpu-migrations       #    0.123 /sec
              3355      page-faults          #   13.751 /sec
      747830935069      instructions         #    0.73  insn per cycle
     1019609333322      cycles               #    4.179 GHz
      157722976668      branches             #  646.444 M/sec
         348984893      branch-misses        #    0.22% of all branches

Signed-off-by: Chi Zhiling
---
 mm/filemap.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 4e636647100c..d54450e529bd 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2458,12 +2458,16 @@ static void filemap_get_read_batch(struct address_space *mapping,
 {
 	XA_STATE(xas, &mapping->i_pages, index);
 	struct folio *folio;
+	pgoff_t next;
+
+	if (unlikely(index > max))
+		return;
 
 	rcu_read_lock();
 	for (folio = xas_load(&xas); folio; folio = xas_next(&xas)) {
 		if (xas_retry(&xas, folio))
 			continue;
-		if (xas.xa_index > max || xa_is_value(folio))
+		if (xa_is_value(folio) || WARN_ON(xas.xa_index > max))
 			break;
 		if (xa_is_sibling(folio))
 			break;
@@ -2479,7 +2483,11 @@ static void filemap_get_read_batch(struct address_space *mapping,
 			break;
 		if (folio_test_readahead(folio))
 			break;
-		xas_advance(&xas, folio_next_index(folio) - 1);
+
+		next = folio_next_index(folio);
+		if (next > max)
+			break;
+		xas_advance(&xas, next - 1);
 		continue;
 put_folio:
 		folio_put(folio);
-- 
2.43.0
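
For illustration only, the effect of moving the boundary check can be
sketched with a small userspace model. This is not kernel code: the
toy_* names, the flat slot array standing in for the xarray, and the
step counter are all assumptions made for this sketch. It only counts
how many cursor movements (next/advance) each loop shape performs.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CACHE_PAGES 16
#define TOY_BATCH_MAX   16

/* Hypothetical stand-in for a folio: first page index and page count. */
struct toy_folio {
	unsigned long index;
	unsigned long nr_pages;
};

/* Hypothetical stand-in for the xa_state cursor. */
struct toy_cursor {
	struct toy_folio *const *slots;	/* slots[i]: folio covering page i, or NULL */
	unsigned long index;
	unsigned long steps;		/* number of next/advance calls made */
};

struct toy_batch {
	struct toy_folio *folios[TOY_BATCH_MAX];
	unsigned int nr;
};

static struct toy_folio *toy_load(struct toy_cursor *c)
{
	return c->index < TOY_CACHE_PAGES ? c->slots[c->index] : NULL;
}

static struct toy_folio *toy_next(struct toy_cursor *c)
{
	c->steps++;
	c->index++;
	return toy_load(c);
}

static void toy_advance(struct toy_cursor *c, unsigned long index)
{
	c->steps++;
	c->index = index;
}

static void toy_batch_add(struct toy_batch *b, struct toy_folio *f)
{
	assert(b->nr < TOY_BATCH_MAX);
	b->folios[b->nr++] = f;
}

/* Old shape: the boundary is checked at the top of the next iteration. */
static unsigned long toy_read_batch_old(struct toy_folio *const *slots,
		unsigned long index, unsigned long max, struct toy_batch *b)
{
	struct toy_cursor c = { slots, index, 0 };
	struct toy_folio *f;

	for (f = toy_load(&c); f; f = toy_next(&c)) {
		if (c.index > max)
			break;
		toy_batch_add(b, f);
		toy_advance(&c, f->index + f->nr_pages - 1);
	}
	return c.steps;
}

/* New shape: the boundary is checked right after the folio is added. */
static unsigned long toy_read_batch_new(struct toy_folio *const *slots,
		unsigned long index, unsigned long max, struct toy_batch *b)
{
	struct toy_cursor c = { slots, index, 0 };
	struct toy_folio *f;
	unsigned long next;

	if (index > max)
		return 0;

	for (f = toy_load(&c); f; f = toy_next(&c)) {
		toy_batch_add(b, f);
		next = f->index + f->nr_pages;
		if (next > max)
			break;	/* last folio in range: skip the final walk */
		toy_advance(&c, next - 1);
	}
	return c.steps;
}
```

In this model, with single-page folios cached at indices 0..7, a
one-page read (index = 0, max = 0) takes two cursor movements with the
old shape and none with the new one, and both shapes return the same
batch. The model ignores retries, value entries, sibling entries, and
the uptodate/readahead checks of the real function.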