TestSetPageHWPoison() is called without zone->lock, so its atomic update
to page->flags can race with non-atomic flag operations that run under
zone->lock in the buddy allocator.

In particular, __free_pages_prepare() does:

	page->flags.f &= ~PAGE_FLAGS_CHECK_AT_PREP;

This non-atomic read-modify-write, while correctly excluding
__PG_HWPOISON from the mask, can still lose a concurrent
TestSetPageHWPoison if the read happens before the poison bit is set
and the write happens after.

Follow-up patches in this series add similar non-atomic flag operations
as well.

Fix by acquiring zone->lock around TestSetPageHWPoison. This serializes
with all buddy flag manipulation. The cost is negligible: one
lock/unlock in an extremely rare path (hardware memory errors).

Signed-off-by: Michael S. Tsirkin
Assisted-by: Claude:claude-opus-4-6
---
 mm/memory-failure.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index ee42d4361309..a6b61172dd13 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -2348,6 +2348,8 @@ int memory_failure(unsigned long pfn, int flags)
 	unsigned long page_flags;
 	bool retry = true;
 	int hugetlb = 0;
+	struct zone *zone;
+	unsigned long mf_flags;
 
 	if (!sysctl_memory_failure_recovery)
 		panic("Memory failure on page %lx", pfn);
@@ -2390,7 +2392,10 @@ int memory_failure(unsigned long pfn, int flags)
 	if (hugetlb)
 		goto unlock_mutex;
 
+	zone = page_zone(p);
+	spin_lock_irqsave(&zone->lock, mf_flags);
 	if (TestSetPageHWPoison(p)) {
+		spin_unlock_irqrestore(&zone->lock, mf_flags);
 		res = -EHWPOISON;
 		if (flags & MF_ACTION_REQUIRED)
 			res = kill_accessing_process(current, pfn, flags);
@@ -2399,6 +2404,7 @@ int memory_failure(unsigned long pfn, int flags)
 		action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
 		goto unlock_mutex;
 	}
+	spin_unlock_irqrestore(&zone->lock, mf_flags);
 
 	/*
 	 * We need/can do nothing about count=0 pages.
-- 
MST
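
Illustration only, not part of the patch: the lost update described in the commit message can be replayed deterministically in userspace. The sketch below is a minimal model under stated assumptions. The bit values DEMO_HWPOISON and DEMO_CHECK_MASK and both function names are hypothetical stand-ins, C11 atomics stand in for the kernel's bit operations, and the "CPUs" are just sequential steps in one thread, ordered to match the interleaving the message describes.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit layout for illustration; these are not the kernel's values. */
#define DEMO_HWPOISON   (1UL << 0)
#define DEMO_CHECK_MASK (1UL << 1)  /* stands in for PAGE_FLAGS_CHECK_AT_PREP */

/*
 * Replays the racy interleaving step by step.
 * Returns true if the poison bit is still set at the end.
 */
bool poison_survives_unlocked(void)
{
	_Atomic unsigned long flags = DEMO_CHECK_MASK;

	/* CPU A (free path): the read half of "flags &= ~mask". */
	unsigned long snapshot = atomic_load(&flags);

	/* CPU B (memory_failure): the atomic test-and-set lands in between. */
	atomic_fetch_or(&flags, DEMO_HWPOISON);

	/* CPU A: the write half stores a value computed from the stale snapshot. */
	atomic_store(&flags, snapshot & ~DEMO_CHECK_MASK);

	return (atomic_load(&flags) & DEMO_HWPOISON) != 0;
}

/*
 * Same operations, but each side runs to completion before the other
 * starts, which is the ordering a shared lock enforces.
 */
bool poison_survives_serialized(void)
{
	_Atomic unsigned long flags = DEMO_CHECK_MASK;

	/* Critical section 1: the whole read-modify-write completes. */
	unsigned long snapshot = atomic_load(&flags);
	atomic_store(&flags, snapshot & ~DEMO_CHECK_MASK);

	/* Critical section 2: the poison bit is set afterwards. */
	atomic_fetch_or(&flags, DEMO_HWPOISON);

	return (atomic_load(&flags) & DEMO_HWPOISON) != 0;
}
```

In this model the unlocked replay drops the poison bit even though the mask never names it, because the stale snapshot is written back whole. The serialized variant keeps the bit in either order of the two critical sections; only one order is shown here.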