For allocations that are of costly order and __GFP_NORETRY (and can
perform compaction) we attempt direct compaction first. If that fails,
we continue with a single round of direct reclaim+compaction (as for
other __GFP_NORETRY allocations, except the compaction is of lower
priority), with two exceptions that fail immediately:

- __GFP_THISNODE is specified, to prevent zone_reclaim_mode-like
  behavior for e.g. THP page faults

- compaction failed because it was deferred (i.e. has been failing
  recently so further attempts are not done for a while) or skipped,
  which means there are insufficient free base pages to defragment to
  begin with

Upon closer inspection, the reasoning behind the second condition is
somewhat flawed. If there are not enough base pages and reclaim could
create them, we instead fail. When there are enough base pages and
compaction has already run and failed, we proceed and hope that reclaim
and the subsequent compaction attempt will succeed. But it's unclear why
they should and whether it will be as inexpensive as intended.

It might therefore make more sense to just fail unconditionally after
the initial compaction attempt, so do that instead. Costly allocations
that do want the reclaim/compaction to happen at least once can omit
__GFP_NORETRY, or even specify __GFP_RETRY_MAYFAIL for more than one
attempt.

There is a slight potential unfairness in that costly __GFP_NORETRY
allocations that can't perform direct compaction (i.e. lack __GFP_IO)
will still be allowed to direct reclaim, while those that can direct
compact will now never attempt direct reclaim. However, in cases of
memory pressure causing compaction to be skipped due to insufficient
base pages, direct reclaim was already not done before, so there should
be no functional regressions from this change.
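
As an illustration only, the decision taken after the initial direct
compaction attempt has failed can be modeled in userspace roughly as
below. This is a sketch under stated assumptions, not kernel code: the
struct, the enum and both function names are hypothetical stand-ins, and
the model covers only requests that reached the direct compaction
attempt in the first place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the compaction outcomes named above. */
enum model_compact_result {
	MODEL_COMPACT_SKIPPED,		/* too few free base pages */
	MODEL_COMPACT_DEFERRED,		/* failed recently, backing off */
	MODEL_COMPACT_OTHER_FAILURE,	/* ran, but produced no page */
};

/* Hypothetical summary of the relevant request properties. */
struct alloc_request_model {
	bool costly_order;	/* order above the costly threshold */
	bool noretry;		/* __GFP_NORETRY was passed */
	bool thisnode;		/* __GFP_THISNODE was passed */
};

/*
 * Behavior before this patch: returns true when the request goes on
 * to the reclaim+compaction round, false when it fails immediately.
 */
bool old_continues_to_reclaim(const struct alloc_request_model *req,
			      enum model_compact_result result)
{
	assert(req != NULL);

	if (req->costly_order && req->noretry) {
		if (result == MODEL_COMPACT_SKIPPED ||
		    result == MODEL_COMPACT_DEFERRED)
			return false;
		if (req->thisnode)
			return false;
	}
	return true;
}

/*
 * Behavior after this patch: costly __GFP_NORETRY requests always
 * fail once the initial compaction attempt has failed.
 */
bool new_continues_to_reclaim(const struct alloc_request_model *req)
{
	assert(req != NULL);

	return !(req->costly_order && req->noretry);
}
```

In this model the two functions differ only for a costly __GFP_NORETRY
request without __GFP_THISNODE whose compaction ran and failed: it used
to continue to reclaim and now fails.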

Signed-off-by: Vlastimil Babka
---
 include/linux/gfp_types.h |  2 ++
 mm/page_alloc.c           | 47 +++--------------------------------------------
 2 files changed, 5 insertions(+), 44 deletions(-)

diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
index 3de43b12209e..051311fdbdb1 100644
--- a/include/linux/gfp_types.h
+++ b/include/linux/gfp_types.h
@@ -218,6 +218,8 @@ enum {
  * caller must handle the failure which is quite likely to happen under
  * heavy memory pressure. The flag is suitable when failure can easily be
  * handled at small cost, such as reduced throughput.
+ * For costly orders, only memory compaction can be attempted with no reclaim
+ * under some conditions.
  *
  * %__GFP_RETRY_MAYFAIL: The VM implementation will retry memory reclaim
  * procedures that have previously failed if there is some indication
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e6fd1213328b..2671cbbd6375 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4763,52 +4763,11 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 			goto got_pg;
 
 		/*
-		 * Checks for costly allocations with __GFP_NORETRY, which
-		 * includes some THP page fault allocations
+		 * Compaction didn't succeed and we were told not to try hard,
+		 * so fail now.
 		 */
 		if (costly_order && (gfp_mask & __GFP_NORETRY)) {
-			/*
-			 * If allocating entire pageblock(s) and compaction
-			 * failed because all zones are below low watermarks
-			 * or is prohibited because it recently failed at this
-			 * order, fail immediately unless the allocator has
-			 * requested compaction and reclaim retry.
-			 *
-			 * Reclaim is
-			 *  - potentially very expensive because zones are far
-			 *    below their low watermarks or this is part of very
-			 *    bursty high order allocations,
-			 *  - not guaranteed to help because isolate_freepages()
-			 *    may not iterate over freed pages as part of its
-			 *    linear scan, and
-			 *  - unlikely to make entire pageblocks free on its
-			 *    own.
-			 */
-			if (compact_result == COMPACT_SKIPPED ||
-			    compact_result == COMPACT_DEFERRED)
-				goto nopage;
-
-			/*
-			 * THP page faults may attempt local node only first,
-			 * but are then allowed to only compact, not reclaim,
-			 * see alloc_pages_mpol()
-			 *
-			 * compaction can fail for other reasons than those
-			 * checked above and we don't want such THP allocations
-			 * to put reclaim pressure on a single node in a
-			 * situation where other nodes might have plenty of
-			 * available memory
-			 */
-			if (gfp_mask & __GFP_THISNODE)
-				goto nopage;
-
-			/*
-			 * Looks like reclaim/compaction is worth trying, but
-			 * sync compaction could be very expensive, so keep
-			 * using async compaction.
-			 */
-			compact_priority = INIT_COMPACT_PRIORITY;
-		}
+			goto nopage;
 	}
 
 retry:
-- 
2.52.0