From: Lance Yang

The hugepage collapsing code needs is_guard_pte_marker() to correctly
handle PTE guard markers. Move the helper to a shared header and expose
it.

While at it, simplify the implementation. The current code is redundant,
as it effectively expands to:

  is_swap_pte(pte) &&
  is_pte_marker_entry(...) && // from is_pte_marker()
  is_pte_marker_entry(...)    // from is_guard_swp_entry()

While a modern compiler could likely optimize this away, let's have
clean code and not rely on it.

Cc: Kairui Song
Acked-by: David Hildenbrand
Reviewed-by: Lorenzo Stoakes
Signed-off-by: Lance Yang
---
 include/linux/swapops.h | 6 ++++++
 mm/madvise.c            | 6 ------
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 64ea151a7ae3..7475324c7757 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -469,6 +469,12 @@ static inline int is_guard_swp_entry(swp_entry_t entry)
 		(pte_marker_get(entry) & PTE_MARKER_GUARD);
 }
 
+static inline bool is_guard_pte_marker(pte_t ptent)
+{
+	return is_swap_pte(ptent) &&
+	       is_guard_swp_entry(pte_to_swp_entry(ptent));
+}
+
 /*
  * This is a special version to check pte_none() just to cover the case when
  * the pte is a pte marker. It existed because in many cases the pte marker
diff --git a/mm/madvise.c b/mm/madvise.c
index 35ed4ab0d7c5..bd46e6788fac 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1069,12 +1069,6 @@ static bool is_valid_guard_vma(struct vm_area_struct *vma, bool allow_locked)
 	return !(vma->vm_flags & disallowed);
 }
 
-static bool is_guard_pte_marker(pte_t ptent)
-{
-	return is_pte_marker(ptent) &&
-	       is_guard_swp_entry(pte_to_swp_entry(ptent));
-}
-
 static int guard_install_pud_entry(pud_t *pud, unsigned long addr,
 				   unsigned long next, struct mm_walk *walk)
 {
-- 
2.49.0

From: Lance Yang

Guard PTE markers are installed via MADV_GUARD_INSTALL to create
lightweight guard regions.

Currently, any collapse path (khugepaged or MADV_COLLAPSE) will fail
when encountering such a range. MADV_COLLAPSE fails deep inside the
collapse logic when trying to swap in the special marker in
__collapse_huge_page_swapin():

hpage_collapse_scan_pmd()
 `- collapse_huge_page()
     `- __collapse_huge_page_swapin() -> fails!

khugepaged's behavior is slightly different due to its max_ptes_swap
limit (default 64). It won't fail as deep, but it will still needlessly
scan up to 64 swap entries before bailing out.

IMHO, we can and should detect this much earlier. This patch adds a
check directly inside the PTE scan loop. If a guard marker is found,
the scan is aborted immediately with SCAN_PTE_NON_PRESENT, avoiding
wasted work.

Suggested-by: Lorenzo Stoakes
Signed-off-by: Lance Yang
---
 mm/khugepaged.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 9ed1af2b5c38..70ebfc7c1f3e 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1306,6 +1306,16 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
 				result = SCAN_PTE_UFFD_WP;
 				goto out_unmap;
 			}
+			/*
+			 * Guard PTE markers are installed by
+			 * MADV_GUARD_INSTALL. Any collapse path must
+			 * not touch them, so abort the scan immediately
+			 * if one is found.
+			 */
+			if (is_guard_pte_marker(pteval)) {
+				result = SCAN_PTE_NON_PRESENT;
+				goto out_unmap;
+			}
 			continue;
 		} else {
 			result = SCAN_EXCEED_SWAP_PTE;
-- 
2.49.0