When init_on_alloc is enabled, kernel_init_pages() clears every page one
at a time via clear_highpage_kasan_tagged(), which incurs per-page
kmap_local_page()/kunmap_local() overhead and prevents the architecture
clearing primitive from operating on contiguous ranges.

Introduce clear_highpages_kasan_tagged() as a static batch clearing
helper in page_alloc.c that calls clear_pages() for the full contiguous
range on !HIGHMEM systems, bypassing the per-page kmap overhead and
allowing a single invocation of the arch clearing primitive across the
entire allocation. The HIGHMEM path falls back to per-page clearing
since those pages require kmap.

Remove kernel_init_pages() and call the new helper directly, since
kernel_init_pages() would otherwise be reduced to a trivial wrapper.

Allocating 8192 x 2MB HugeTLB pages (16GB) with init_on_alloc=1:

  Before: 0.445s
  After:  0.166s (-62.7%, 2.68x faster)

Kernel time (sys) reduction per workload with init_on_alloc=1:

  Workload            Before       After        Change
  Graph500 64C128T    30m 41.8s    15m 14.8s    -50.3%
  Graph500 16C32T     15m 56.7s    9m 43.7s     -39.0%
  Pagerank 32T        1m 58.5s     1m 12.8s     -38.5%
  Pagerank 128T       2m 36.3s     1m 40.4s     -35.7%

Signed-off-by: Hrushikesh Salunke
Acked-by: Vlastimil Babka (SUSE)
Acked-by: Zi Yan
Acked-by: Pankaj Gupta
---
Hi Andrew,

This is v4 of the batch page clearing patch. v3 is already in
mm-unstable; please replace it with this one. The only change is moving
clear_highpages_kasan_tagged() from include/linux/highmem.h to
mm/page_alloc.c as a static function, addressing the code size concern
you raised on ARM allmodconfig.

Thanks,
Hrushikesh

base commit: 2bcc13c29c711381d815c1ba5d5b25737400c71a

v3: https://lore.kernel.org/all/20260422102729.166599-1-hsalunke@amd.com/
v2: https://lore.kernel.org/all/20260421042451.76918-1-hsalunke@amd.com/
v1: https://lore.kernel.org/all/20260408092441.435133-1-hsalunke@amd.com/

Changes since v3:
- Moved clear_highpages_kasan_tagged() from include/linux/highmem.h to
  mm/page_alloc.c as a static function to avoid code size increase, as
  the function is only used within page_alloc.c.

Changes since v2:
- Moved kasan_disable_current()/kasan_enable_current() into
  clear_highpages_kasan_tagged(), per David and Zi Yan's suggestion.
- Removed kernel_init_pages() and replaced its two call sites with
  direct calls to the helper.

Changes since v1:
- Dropped cond_resched() and PROCESS_PAGES_NON_PREEMPT_BATCH, as
  kernel_init_pages() runs inside the page allocator and can be called
  from atomic context, making cond_resched() unsafe. The original code
  never had a cond_resched() here, and the performance gain comes from
  batching, not rescheduling.
- Moved the !HIGHMEM/HIGHMEM branching into a new
  clear_highpages_kasan_tagged() helper in highmem.h, per David's
  suggestion.

 mm/page_alloc.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 65e205111553..3a59577f58a5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1208,14 +1208,18 @@ static inline bool should_skip_kasan_poison(struct page *page)
 	return page_kasan_tag(page) == KASAN_TAG_KERNEL;
 }
 
-static void kernel_init_pages(struct page *page, int numpages)
+static void clear_highpages_kasan_tagged(struct page *page, int numpages)
 {
-	int i;
-
 	/* s390's use of memset() could override KASAN redzones. */
 	kasan_disable_current();
-	for (i = 0; i < numpages; i++)
-		clear_highpage_kasan_tagged(page + i);
+	if (!IS_ENABLED(CONFIG_HIGHMEM)) {
+		clear_pages(kasan_reset_tag(page_address(page)), numpages);
+	} else {
+		int i;
+
+		for (i = 0; i < numpages; i++)
+			clear_highpage_kasan_tagged(page + i);
+	}
 	kasan_enable_current();
 }
 
@@ -1428,7 +1432,7 @@ __always_inline bool __free_pages_prepare(struct page *page,
 			init = false;
 	}
 	if (init)
-		kernel_init_pages(page, 1 << order);
+		clear_highpages_kasan_tagged(page, 1 << order);
 
 	/*
 	 * arch_free_page() can make the page's contents inaccessible. s390
@@ -1853,7 +1857,7 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
 	}
 	/* If memory is still not initialized, initialize it now. */
 	if (init)
-		kernel_init_pages(page, 1 << order);
+		clear_highpages_kasan_tagged(page, 1 << order);
 
 	set_page_owner(page, order, gfp_flags);
 	page_table_check_alloc(page, order);
-- 
2.43.0
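
For readers unfamiliar with the two clearing strategies, the following is a
userspace analogy only, not kernel code and not part of the patch. It
contrasts one clearing call per page with a single call over the whole
contiguous range, which is the idea behind the !HIGHMEM path above. All names
here (PAGE_SZ, clear_range_per_page(), clear_range_batched(),
region_is_zero()) are hypothetical, memset() stands in for the architecture
clearing primitive, and the per-page kmap cost is not modelled.

```c
#include <stddef.h>
#include <string.h>

#define PAGE_SZ 4096u	/* assumed page size, for illustration only */

/* Stand-in for the per-page path: one clearing call per page. */
void clear_range_per_page(unsigned char *base, size_t numpages)
{
	size_t i;

	for (i = 0; i < numpages; i++)
		memset(base + i * PAGE_SZ, 0, PAGE_SZ);
}

/* Stand-in for the batched path: one clearing call for the whole range. */
void clear_range_batched(unsigned char *base, size_t numpages)
{
	memset(base, 0, numpages * PAGE_SZ);
}

/* Returns 1 if every byte of the range is zero, 0 otherwise. */
int region_is_zero(const unsigned char *base, size_t numpages)
{
	size_t i;

	for (i = 0; i < numpages * PAGE_SZ; i++)
		if (base[i] != 0)
			return 0;
	return 1;
}
```

Both functions leave the range in the same state. The batched form differs
only in handing the clearing routine the full length at once, which is what
lets an optimized primitive work across page boundaries.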