The current mechanism for determining mTHP collapse scales the
khugepaged_max_ptes_none value based on the target order. This
introduces an undesirable feedback loop, or "creep", when max_ptes_none
is set to a value greater than HPAGE_PMD_NR / 2.

With this configuration, a successful collapse to order N will populate
enough pages to satisfy the collapse condition on order N+1 on the next
scan. This leads to unnecessary work and memory churn.

To fix this issue, introduce a helper function that limits mTHP collapse
support to two max_ptes_none values, 0 and HPAGE_PMD_NR - 1. This
effectively supports two modes:

- max_ptes_none=0: Never introduce new none-pages for mTHP collapse.
- max_ptes_none=511 (on 4k page size): Always collapse to the highest
  available mTHP order.

This removes the possibility of "creep", while not modifying any uAPI
expectations. A warning will be emitted if any non-supported
max_ptes_none value is configured with mTHP enabled.

Reviewed-by: Lorenzo Stoakes
Reviewed-by: Baolin Wang
Signed-off-by: Nico Pache
---
 mm/khugepaged.c | 40 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index ecdbbf6a01a6..99f78f0e44c6 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -456,6 +456,36 @@ void __khugepaged_enter(struct mm_struct *mm)
 		wake_up_interruptible(&khugepaged_wait);
 }
 
+/**
+ * collapse_max_ptes_none - Calculate maximum allowed empty PTEs for collapse
+ * @order: The folio order being collapsed to
+ *
+ * For PMD-sized collapses (order == HPAGE_PMD_ORDER), use the configured
+ * khugepaged_max_ptes_none value.
+ *
+ * For mTHP collapses, we currently only support khugepaged_max_ptes_none
+ * values of 0 or COLLAPSE_MAX_PTES_LIMIT. Any other value will emit a warning
+ * and no mTHP collapse will be attempted.
+ *
+ * Return: Maximum number of empty PTEs allowed, or -EINVAL if unsupported
+ */
+static int collapse_max_ptes_none(unsigned int order)
+{
+	if (is_pmd_order(order))
+		return khugepaged_max_ptes_none;
+
+	/* Zero/non-present collapse disabled. */
+	if (!khugepaged_max_ptes_none)
+		return 0;
+
+	if (khugepaged_max_ptes_none == COLLAPSE_MAX_PTES_LIMIT)
+		return (1 << order) - 1;
+
+	pr_warn_once("mTHP collapse only supports max_ptes_none values of 0 or %u\n",
+		      COLLAPSE_MAX_PTES_LIMIT);
+	return -EINVAL;
+}
+
 void khugepaged_enter_vma(struct vm_area_struct *vma,
 			  vm_flags_t vm_flags)
 {
@@ -541,10 +571,18 @@ static enum scan_result __collapse_huge_page_isolate(struct vm_area_struct *vma,
 	struct folio *folio = NULL;
 	unsigned long addr = start_addr;
 	pte_t *_pte;
+	int max_ptes_none;
 	int none_or_zero = 0, shared = 0, referenced = 0;
 	enum scan_result result = SCAN_FAIL;
 	const unsigned long nr_pages = 1UL << order;
-	int max_ptes_none = khugepaged_max_ptes_none >> (HPAGE_PMD_ORDER - order);
+
+	if (cc->is_khugepaged)
+		max_ptes_none = collapse_max_ptes_none(order);
+	else
+		max_ptes_none = COLLAPSE_MAX_PTES_LIMIT;
+
+	if (max_ptes_none == -EINVAL)
+		return result;
 
 	for (_pte = pte; _pte < pte + nr_pages; _pte++, addr += PAGE_SIZE) {
-- 
2.53.0
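
The "creep" described in the commit message can be illustrated with a
small user-space model. This is only a sketch, not kernel code. It
assumes 4k base pages (HPAGE_PMD_ORDER of 9), assumes that
COLLAPSE_MAX_PTES_LIMIT equals HPAGE_PMD_NR - 1, and uses hypothetical
helper names (scaled_max_ptes_none, model_collapse_max_ptes_none,
final_order). The scan loop is heavily simplified: it starts from one
fully populated folio and asks, order by order, whether the next order
up would pass the none-PTE check.

```c
#include <assert.h>
#include <errno.h>

/* Assumed values for 4k base pages; not taken from the patch itself. */
#define HPAGE_PMD_ORDER		9
#define HPAGE_PMD_NR		(1u << HPAGE_PMD_ORDER)
#define COLLAPSE_MAX_PTES_LIMIT	(HPAGE_PMD_NR - 1)	/* assumed definition */

/* Old behaviour: scale the tunable by the distance from PMD order. */
static int scaled_max_ptes_none(unsigned int max_ptes_none, unsigned int order)
{
	assert(order <= HPAGE_PMD_ORDER);
	return (int)(max_ptes_none >> (HPAGE_PMD_ORDER - order));
}

/* User-space model of the helper in the patch (tunable passed explicitly). */
static int model_collapse_max_ptes_none(unsigned int max_ptes_none,
					unsigned int order)
{
	assert(order <= HPAGE_PMD_ORDER);
	if (order == HPAGE_PMD_ORDER)
		return (int)max_ptes_none;
	if (!max_ptes_none)
		return 0;
	if (max_ptes_none == COLLAPSE_MAX_PTES_LIMIT)
		return (1 << order) - 1;
	return -EINVAL;
}

typedef int (*limit_fn)(unsigned int max_ptes_none, unsigned int order);

/*
 * Hypothetical scan model: one folio of start_order is fully populated and
 * the rest of the PMD range is empty. Each iteration is one "scan" that
 * tries order + 1, whose candidate range is half populated and half none.
 * Returns the order reached once no further collapse is permitted.
 */
static unsigned int final_order(limit_fn limit, unsigned int max_ptes_none,
				unsigned int start_order)
{
	unsigned int order = start_order;

	while (order < HPAGE_PMD_ORDER) {
		int none = 1 << order;	/* empty half of the candidate */
		int max_none = limit(max_ptes_none, order + 1);

		if (max_none < 0 || none > max_none)
			break;
		order++;	/* collapse succeeded; the folio grew */
	}
	return order;
}
```

Under these assumptions, final_order(scaled_max_ptes_none, 300, 2)
climbs scan after scan to order 9, while
final_order(model_collapse_max_ptes_none, 300, 2) stays at order 2
because the unsupported value yields -EINVAL for every mTHP order.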