From: Kaitao Cheng

__page_handle_poison() used drain_all_pages() instead of
zone_pcp_disable() because dissolve_free_hugetlb_folio() could restore
HVO vmemmap pages and decrement hugetlb_optimize_vmemmap_key. That
static key update took cpu_hotplug_lock through static_key_slow_dec(),
while zone_pcp_disable() holds pcp_batch_high_lock. CPU hotplug takes
the locks in the opposite order through page_alloc_cpu_online/dead(),
so the combination could deadlock.

That dependency no longer exists. Commit da3e2d1ca43d ("mm/hugetlb:
remove hugetlb_optimize_vmemmap_key static key") removed the HVO static
key and the static_branch_dec() from hugetlb_vmemmap_restore_folio().
The dissolve_free_hugetlb_folio() path no longer reaches
static_key_slow_dec().

Use zone_pcp_disable() again while dissolving the hugetlb folio and
taking the target page off the buddy allocator. This prevents the
drained PCP lists from being refilled before take_page_off_buddy() runs,
making the page isolation deterministic.

Signed-off-by: Kaitao Cheng
---
 mm/memory-failure.c | 18 +++---------------
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 866c4428ac7e..b9619d43173b 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -172,23 +172,11 @@ static int __page_handle_poison(struct page *page)
 {
 	int ret;
 
-	/*
-	 * zone_pcp_disable() can't be used here. It will
-	 * hold pcp_batch_high_lock and dissolve_free_hugetlb_folio() might hold
-	 * cpu_hotplug_lock via static_key_slow_dec() when hugetlb vmemmap
-	 * optimization is enabled. This will break current lock dependency
-	 * chain and leads to deadlock.
-	 * Disabling pcp before dissolving the page was a deterministic
-	 * approach because we made sure that those pages cannot end up in any
-	 * PCP list. Draining PCP lists expels those pages to the buddy system,
-	 * but nothing guarantees that those pages do not get back to a PCP
-	 * queue if we need to refill those.
-	 */
+	zone_pcp_disable(page_zone(page));
 	ret = dissolve_free_hugetlb_folio(page_folio(page));
-	if (!ret) {
-		drain_all_pages(page_zone(page));
+	if (!ret)
 		ret = take_page_off_buddy(page);
-	}
+	zone_pcp_enable(page_zone(page));
 
 	return ret;
 }
-- 
2.50.1 (Apple Git-155)