out_of_memory() selects victim tasks without considering mempolicy.
Consider a CPU-less NUMA node: ordinary processes that do not set a
mempolicy do not allocate memory from this node unless the other NUMA
nodes are below their low watermark. If a task binds to this CPU-less
node and triggers OOM, many tasks that hold no memory on this node may
be killed wrongly.

To fix this, kill current instead of selecting a victim when every node
in oc->nodemask is a CPU-less node.

Signed-off-by: Jinjiang Tu
---
 mm/oom_kill.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 25923cfec9c6..8ae4b2ecfe12 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -1100,6 +1100,20 @@ int unregister_oom_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL_GPL(unregister_oom_notifier);
 
+static bool should_oom_kill_allocating_task(struct oom_control *oc)
+{
+	if (sysctl_oom_kill_allocating_task)
+		return true;
+
+	if (!oc->nodemask)
+		return false;
+
+	if (nodes_intersects(*oc->nodemask, node_states[N_CPU]))
+		return false;
+
+	return true;
+}
+
 /**
  * out_of_memory - kill the "best" process when we run out of memory
  * @oc: pointer to struct oom_control
@@ -1151,7 +1165,7 @@ bool out_of_memory(struct oom_control *oc)
 		oc->nodemask = NULL;
 	check_panic_on_oom(oc);
 
-	if (!is_memcg_oom(oc) && sysctl_oom_kill_allocating_task &&
+	if (!is_memcg_oom(oc) && should_oom_kill_allocating_task(oc) &&
 	    current->mm && !oom_unkillable_task(current) &&
 	    oom_cpuset_eligible(current, oc) &&
 	    current->signal->oom_score_adj != OOM_SCORE_ADJ_MIN) {
-- 
2.43.0
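
For reviewers who want to check the decision table outside the kernel, the new helper can be modelled in user space roughly as follows. This is only a sketch under assumptions: nodemask_model_t, struct oom_control_model and model_should_kill_allocating_task() are hypothetical names invented for illustration, a 64-bit bitmask stands in for nodemask_t, and the nodes_with_cpu argument stands in for node_states[N_CPU]. It is not kernel code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for nodemask_t: bit n set means NUMA node n. */
typedef uint64_t nodemask_model_t;

/* Hypothetical stand-in for the one field of struct oom_control used here. */
struct oom_control_model {
	const nodemask_model_t *nodemask; /* NULL: no mempolicy restriction */
};

/*
 * Mirrors the checks in should_oom_kill_allocating_task():
 *  - the sysctl forces killing the allocating task;
 *  - with no nodemask, fall back to normal victim selection;
 *  - if any allowed node has a CPU, fall back to normal victim selection;
 *  - otherwise every allowed node is CPU-less, so kill the allocating task.
 */
static bool model_should_kill_allocating_task(const struct oom_control_model *oc,
					      bool sysctl_kill_allocating_task,
					      nodemask_model_t nodes_with_cpu)
{
	if (sysctl_kill_allocating_task)
		return true;

	if (!oc->nodemask)
		return false;

	/* Equivalent of nodes_intersects(*oc->nodemask, node_states[N_CPU]). */
	if (*oc->nodemask & nodes_with_cpu)
		return false;

	return true;
}
```

With nodes 0 and 1 having CPUs (mask 0x3) and node 2 being CPU-less, a task bound only to node 2 (mask 0x4) makes the model return true, while a task bound to nodes 1 and 2 (mask 0x6) falls back to normal victim selection.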