We added an early exit in thp_underused(), probably to avoid scanning
pages when there is no chance for success.

However, assume we have max_ptes_none = 511 (the default). Nothing should
stop us from freeing all pages that are part of a THP that is completely
zero (512 pages): khugepaged will for sure not try to instantiate a THP in
that case (512 shared zeropages). This can trivially happen if someone
writes a single 0 byte into a PMD area, or of course, when data ends up
being zero later.

So let's remove that early exit.

Do we want to CC stable? Hm, not sure. Probably not urgent.

Note that, by default, the THP shrinker is active
(/sys/kernel/mm/transparent_hugepage/shrink_underused = 1) and all THPs
are added to the deferred split lists. However, with the max_ptes_none
default we would never scan them, so we would only queue them without ever
reclaiming anything. If that's not desirable, we should instead disable
the shrinker by default and stop adding all THPs to the deferred split
lists.

Easy to reproduce:

1) Allocate some THPs filled with 0s

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

const size_t size = 1024*1024*1024;

int main(void)
{
	size_t offs;
	char *area;

	area = mmap(0, size, PROT_READ | PROT_WRITE,
		    MAP_ANON | MAP_PRIVATE, -1, 0);
	if (area == MAP_FAILED) {
		printf("mmap failed\n");
		exit(-1);
	}

	madvise(area, size, MADV_HUGEPAGE);

	for (offs = 0; offs < size; offs += getpagesize())
		area[offs] = 0;
	pause();
}

2) Trigger the shrinker

E.g., through memory pressure using memhog.

3) Observe that THPs are not getting reclaimed

$ cat /proc/`pgrep prog`/smaps_rollup

would list ~1GiB of AnonHugePages. With this fix, they get reclaimed as
expected.

Fixes: dafff3f4c850 ("mm: split underused THPs")
Cc: Andrew Morton
Cc: Lorenzo Stoakes
Cc: Zi Yan
Cc: Baolin Wang
Cc: "Liam R. Howlett"
Cc: Nico Pache
Cc: Ryan Roberts
Cc: Dev Jain
Cc: Barry Song
Cc: Usama Arif
Signed-off-by: David Hildenbrand
---
 mm/huge_memory.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 26cedfcd74189..aa3ed7a86435b 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -4110,9 +4110,6 @@ static bool thp_underused(struct folio *folio)
 	void *kaddr;
 	int i;
 
-	if (khugepaged_max_ptes_none == HPAGE_PMD_NR - 1)
-		return false;
-
 	for (i = 0; i < folio_nr_pages(folio); i++) {
 		kaddr = kmap_local_folio(folio, i * PAGE_SIZE);
 		if (!memchr_inv(kaddr, 0, PAGE_SIZE)) {
-- 
2.50.1